Re: [FFmpeg-devel] [PATCH v11 1/5] avcodec/jpegxl: add Jpeg XL image codec and parser
1 Apr 2022, 04:13 by d...@lynne.ee:

> 1 Apr 2022, 02:20 by leo.i...@gmail.com:
>
>> This commit adds support to libavcodec to read and parse
>> encoded Jpeg XL images. Jpeg XL is intended to be an
>> extended-life replacement to legacy mjpeg.
>> ---
>>  MAINTAINERS                | 2 +
>>  libavcodec/Makefile        | 1 +
>>  libavcodec/codec_desc.c    | 9 +
>>  libavcodec/codec_id.h      | 1 +
>>  libavcodec/jpegxl.h        | 43 ++
>>  libavcodec/jpegxl_parser.c | 951 +
>> +        }
>> +    }
>> +    if (header->color_space == FF_JPEGXL_CS_GRAY) {
>> +        if (header->bits_per_sample <= 8)
>> +            return alpha ? AV_PIX_FMT_YA8 : AV_PIX_FMT_GRAY8;
>> +        if (header->bits_per_sample > 16 || header->exp_bits_per_sample)
>> +            return alpha ? AV_PIX_FMT_NONE : AV_PIX_FMT_GRAYF32;
>> +        return alpha ? AV_PIX_FMT_YA16 : AV_PIX_FMT_GRAY16;
>> +    } else if (header->color_space == FF_JPEGXL_CS_RGB
>> +            || header->color_space == FF_JPEGXL_CS_XYB) {
>> +        if (header->bits_per_sample <= 8)
>> +            return alpha ? AV_PIX_FMT_RGBA : AV_PIX_FMT_RGB24;
>> +        if (header->bits_per_sample > 16 || header->exp_bits_per_sample)
>> +            return alpha ? AV_PIX_FMT_GBRAPF32 : AV_PIX_FMT_GBRPF32;
>> +        return alpha ? AV_PIX_FMT_RGBA64 : AV_PIX_FMT_RGB48;
>> +    }
>> +    return AV_PIX_FMT_NONE;
>>
>
> YUV is supported, via the do_YCbCr flag in the spec. I think
> we ought to set the pixel format to YUV444P/16 in that case,
> as the codec requires YUV to be upsampled during decoding,
> and doing unnecessary colorspace conversions inside
> decoders is something we don't want.
> Decoders are free to change what the parser sets, so if
> users link to libjxl, then RGB will be reported as lavf will
> decode the first frame during probing.
> Otherwise, the native decoder would match what the parser
> reports and output YUV444-frames when signalled.
>

Come to think of it, we better output XYB instead of RGB.
But that can be changed later, for now I think it's fine if the
parser always reports either RGB or Gray, so this is fine as-is.
___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH v11 1/5] avcodec/jpegxl: add Jpeg XL image codec and parser
1 Apr 2022, 02:20 by leo.i...@gmail.com:

> This commit adds support to libavcodec to read and parse
> encoded Jpeg XL images. Jpeg XL is intended to be an
> extended-life replacement to legacy mjpeg.
> ---
>  MAINTAINERS                | 2 +
>  libavcodec/Makefile        | 1 +
>  libavcodec/codec_desc.c    | 9 +
>  libavcodec/codec_id.h      | 1 +
>  libavcodec/jpegxl.h        | 43 ++
>  libavcodec/jpegxl_parser.c | 951 +
> +        }
> +    }
> +    if (header->color_space == FF_JPEGXL_CS_GRAY) {
> +        if (header->bits_per_sample <= 8)
> +            return alpha ? AV_PIX_FMT_YA8 : AV_PIX_FMT_GRAY8;
> +        if (header->bits_per_sample > 16 || header->exp_bits_per_sample)
> +            return alpha ? AV_PIX_FMT_NONE : AV_PIX_FMT_GRAYF32;
> +        return alpha ? AV_PIX_FMT_YA16 : AV_PIX_FMT_GRAY16;
> +    } else if (header->color_space == FF_JPEGXL_CS_RGB
> +            || header->color_space == FF_JPEGXL_CS_XYB) {
> +        if (header->bits_per_sample <= 8)
> +            return alpha ? AV_PIX_FMT_RGBA : AV_PIX_FMT_RGB24;
> +        if (header->bits_per_sample > 16 || header->exp_bits_per_sample)
> +            return alpha ? AV_PIX_FMT_GBRAPF32 : AV_PIX_FMT_GBRPF32;
> +        return alpha ? AV_PIX_FMT_RGBA64 : AV_PIX_FMT_RGB48;
> +    }
> +    return AV_PIX_FMT_NONE;
>

YUV is supported, via the do_YCbCr flag in the spec. I think
we ought to set the pixel format to YUV444P/16 in that case,
as the codec requires YUV to be upsampled during decoding,
and doing unnecessary colorspace conversions inside
decoders is something we don't want.
Decoders are free to change what the parser sets, so if
users link to libjxl, then RGB will be reported as lavf will
decode the first frame during probing.
Otherwise, the native decoder would match what the parser
reports and output YUV444-frames when signalled.
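The suggestion in this reply can be sketched roughly as below. This is only an illustration under stated assumptions, not the parser's code: the `SketchHeader` struct, its `do_ycbcr` field and the `SKETCH_PIX_FMT_*` names are stand-ins invented here so the example builds without FFmpeg headers. Mapping YCbCr input to planar 4:4:4 YUV at 8 or 16 bits follows the "YUV444P/16" wording; returning "none" for floating-point YCbCr and leaving out alpha are assumptions made to keep the sketch short.

```c
#include <stdbool.h>

/* Stand-ins for AVPixelFormat values; the names are hypothetical. */
enum SketchPixFmt {
    SKETCH_PIX_FMT_NONE,
    SKETCH_PIX_FMT_RGB24,
    SKETCH_PIX_FMT_RGB48,
    SKETCH_PIX_FMT_GBRPF32,
    SKETCH_PIX_FMT_YUV444P,
    SKETCH_PIX_FMT_YUV444P16,
};

/* Hypothetical subset of the parsed header, no alpha handling. */
typedef struct SketchHeader {
    int  bits_per_sample;
    int  exp_bits_per_sample; /* non-zero means floating-point samples */
    bool do_ycbcr;            /* assumed field mirroring the spec's do_YCbCr flag */
} SketchHeader;

static enum SketchPixFmt sketch_pick_format(const SketchHeader *h)
{
    bool is_float = h->bits_per_sample > 16 || h->exp_bits_per_sample;

    if (h->do_ycbcr) {
        /* Report YUV directly so no colorspace conversion is implied. */
        if (is_float)
            return SKETCH_PIX_FMT_NONE; /* assumption: no float YUV format */
        return h->bits_per_sample <= 8 ? SKETCH_PIX_FMT_YUV444P
                                       : SKETCH_PIX_FMT_YUV444P16;
    }

    /* Same thresholds as the quoted RGB branch, without alpha. */
    if (h->bits_per_sample <= 8)
        return SKETCH_PIX_FMT_RGB24;
    if (is_float)
        return SKETCH_PIX_FMT_GBRPF32;
    return SKETCH_PIX_FMT_RGB48;
}
```

A decoder remains free to override whatever such a function reports, as noted above.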
Re: [FFmpeg-devel] [PATCH] MAINTAINERS: add myself as maintainer for libsrt protocol
On Wed, Mar 30, 2022 at 09:44:08PM +0200, Marton Balint wrote:
>
>
> On Fri, 25 Mar 2022, Zhao Zhili wrote:
>
> > Signed-off-by: Zhao Zhili
> > ---
> >  MAINTAINERS | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 931cf4bd2c..5daa6f8e03 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -516,6 +516,7 @@ Protocols:
> >    bluray.c                              Petri Hintukainen
> >    ftp.c                                 Lukasz Marek
> >    http.c                                Ronald S. Bultje
> > +  libsrt.c                              Zhao Zhili
> >    libssh.c                              Lukasz Marek
> >    libzmq.c                              Andriy Gelman
> >    mms*.c                                Ronald S. Bultje
>
> LGTM, thanks. Applied, thanks.
>
> Marton

-- 
Thanks,
Limin Wang
Re: [FFmpeg-devel] [PATCH v11 1/1] avformat: Add IPFS protocol support.
On Fri, Apr 1, 2022 at 2:09 AM Mark Gaiser wrote:

> This patch adds support for:
> - ffplay ipfs://
> - ffplay ipns://
>
> IPFS data can be played from so called "ipfs gateways".
> A gateway is essentially a webserver that gives access to the
> distributed IPFS network.
>
> This protocol support (ipfs and ipns) therefore translates
> ipfs:// and ipns:// to a http:// url. This resulting url is
> then handled by the http protocol. It could also be https
> depending on the gateway provided.
>
> To use this protocol, a gateway must be provided.
> If you do nothing it will try to find it in your
> $HOME/.ipfs/gateway file. The ways to set it manually are:
> 1. Define a -gateway to the gateway.
> 2. Define $IPFS_GATEWAY with the full http link to the gateway.
> 3. Define $IPFS_PATH and point it to the IPFS data path.
> 4. Have IPFS running in your local user folder (under $HOME/.ipfs).
>
> Signed-off-by: Mark Gaiser
> ---
>  configure                 | 2 +
>  doc/protocols.texi        | 30
>  libavformat/Makefile      | 2 +
>  libavformat/ipfsgateway.c | 328 ++
>  libavformat/protocols.c   | 2 +
>  5 files changed, 364 insertions(+)
>  create mode 100644 libavformat/ipfsgateway.c
>
> diff --git a/configure b/configure
> index e4d36aa639..55af90957a 100755
> --- a/configure
> +++ b/configure
> @@ -3579,6 +3579,8 @@ udp_protocol_select="network"
>  udplite_protocol_select="network"
>  unix_protocol_deps="sys_un_h"
>  unix_protocol_select="network"
> +ipfs_protocol_select="https_protocol"
> +ipns_protocol_select="https_protocol"
>
>  # external library protocols
>  libamqp_protocol_deps="librabbitmq"
> diff --git a/doc/protocols.texi b/doc/protocols.texi
> index d207df0b52..7c9c0a4808 100644
> --- a/doc/protocols.texi
> +++ b/doc/protocols.texi
> @@ -2025,5 +2025,35 @@ decoding errors.
>
>  @end table
>
> +@section ipfs
> +
> +InterPlanetary File System (IPFS) protocol support. One can access files stored
> +on the IPFS network through so called gateways. Those are http(s) endpoints.
> +This protocol wraps the IPFS native protocols (ipfs:// and ipns://) to be send
> +to such a gateway. Users can (and should) host their own node which means this
> +protocol will use your local machine gateway to access files on the IPFS network.
> +
> +If a user doesn't have a node of their own then the public gateway dweb.link is
> +used by default.
> +
> +You can use this protocol in 2 ways. Using IPFS:
> +@example
> +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
> +@end example
> +
> +Or the IPNS protocol (IPNS is mutable IPFS):
> +@example
> +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
> +@end example
> +
> +You can also change the gateway to be used:
> +
> +@table @option
> +
> +@item gateway
> +Defines the gateway to use. When nothing is provided the protocol will first try
> +your local gateway. If that fails dweb.link will be used.
> +
> +@end table
>
>  @c man end PROTOCOLS
> diff --git a/libavformat/Makefile b/libavformat/Makefile
> index d7182d6bd8..e3233fd7ac 100644
> --- a/libavformat/Makefile
> +++ b/libavformat/Makefile
> @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL)             += srtpproto.o srtp.o
>  OBJS-$(CONFIG_SUBFILE_PROTOCOL)          += subfile.o
>  OBJS-$(CONFIG_TEE_PROTOCOL)              += teeproto.o tee_common.o
>  OBJS-$(CONFIG_TCP_PROTOCOL)              += tcp.o
> +OBJS-$(CONFIG_IPFS_PROTOCOL)             += ipfsgateway.o
> +OBJS-$(CONFIG_IPNS_PROTOCOL)             += ipfsgateway.o
>  TLS-OBJS-$(CONFIG_GNUTLS)                += tls_gnutls.o
>  TLS-OBJS-$(CONFIG_LIBTLS)                += tls_libtls.o
>  TLS-OBJS-$(CONFIG_MBEDTLS)               += tls_mbedtls.o
> diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
> new file mode 100644
> index 00..725cc5e474
> --- /dev/null
> +++ b/libavformat/ipfsgateway.c
> @@ -0,0 +1,328 @@
> +/*
> + * IPFS and IPNS protocol support through IPFS Gateway.
> + * Copyright (c) 2022 Mark Gaiser
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#include "libavutil/avstring.h"
> +#include "libavutil/opt.h"
> +#include "url.h"
> +#include
> +
> +typedef struct IPFSGatewayContext {
> +
[FFmpeg-devel] [PATCH v11 3/5] avcodec/libjxl: add Jpeg XL encoding via libjxl
This commit adds encoding support to libavcodec
for Jpeg XL images via the external library libjxl.
---
 configure              | 3 +-
 libavcodec/Makefile    | 1 +
 libavcodec/allcodecs.c | 1 +
 libavcodec/libjxlenc.c | 379 +
 4 files changed, 383 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/libjxlenc.c

diff --git a/configure b/configure
index 969b13eba3..85a1a8b53c 100755
--- a/configure
+++ b/configure
@@ -240,7 +240,7 @@ External library support:
   --enable-libiec61883     enable iec61883 via libiec61883 [no]
   --enable-libilbc         enable iLBC de/encoding via libilbc [no]
   --enable-libjack         enable JACK audio sound server [no]
-  --enable-libjxl          enable JPEG XL decoding via libjxl [no]
+  --enable-libjxl          enable JPEG XL de/encoding via libjxl [no]
   --enable-libklvanc       enable Kernel Labs VANC processing [no]
   --enable-libkvazaar      enable HEVC encoding via libkvazaar [no]
   --enable-liblensfun      enable lensfun lens correction [no]
@@ -3332,6 +3332,7 @@ libgsm_ms_encoder_deps="libgsm"
 libilbc_decoder_deps="libilbc"
 libilbc_encoder_deps="libilbc"
 libjxl_decoder_deps="libjxl libjxl_threads"
+libjxl_encoder_deps="libjxl libjxl_threads"
 libkvazaar_encoder_deps="libkvazaar"
 libmodplug_demuxer_deps="libmodplug"
 libmp3lame_encoder_deps="libmp3lame"
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index c00b0d3246..b208cc0097 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1061,6 +1061,7 @@ OBJS-$(CONFIG_LIBGSM_MS_ENCODER)          += libgsmenc.o
 OBJS-$(CONFIG_LIBILBC_DECODER)            += libilbc.o
 OBJS-$(CONFIG_LIBILBC_ENCODER)            += libilbc.o
 OBJS-$(CONFIG_LIBJXL_DECODER)             += libjxldec.o libjxl.o
+OBJS-$(CONFIG_LIBJXL_ENCODER)             += libjxlenc.o libjxl.o
 OBJS-$(CONFIG_LIBKVAZAAR_ENCODER)         += libkvazaar.o
 OBJS-$(CONFIG_LIBMP3LAME_ENCODER)         += libmp3lame.o
 OBJS-$(CONFIG_LIBOPENCORE_AMRNB_DECODER)  += libopencore-amr.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index a9cd69dfce..db92fb7af5 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -750,6 +750,7 @@ extern const FFCodec ff_libgsm_ms_decoder;
 extern const FFCodec ff_libilbc_encoder;
 extern const FFCodec ff_libilbc_decoder;
 extern const FFCodec ff_libjxl_decoder;
+extern const FFCodec ff_libjxl_encoder;
 extern const FFCodec ff_libmp3lame_encoder;
 extern const FFCodec ff_libopencore_amrnb_encoder;
 extern const FFCodec ff_libopencore_amrnb_decoder;
diff --git a/libavcodec/libjxlenc.c b/libavcodec/libjxlenc.c
new file mode 100644
index 00..deacc0f1f8
--- /dev/null
+++ b/libavcodec/libjxlenc.c
@@ -0,0 +1,379 @@
+/*
+ * JPEG XL encoding support via libjxl
+ * Copyright (c) 2021 Leo Izen
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * JPEG XL encoder using libjxl
+ */
+
+#include "libavutil/avutil.h"
+#include "libavutil/error.h"
+#include "libavutil/frame.h"
+#include "libavutil/libm.h"
+#include "libavutil/opt.h"
+#include "libavutil/pixdesc.h"
+#include "libavutil/pixfmt.h"
+#include "libavutil/version.h"
+
+#include "avcodec.h"
+#include "codec_internal.h"
+
+#include
+#include
+#include "libjxl.h"
+
+typedef struct LibJxlEncodeContext {
+    AVClass *class;
+    void *runner;
+    JxlEncoder *encoder;
+    JxlEncoderFrameSettings *options;
+    int effort;
+    float distance;
+    int modular;
+    uint8_t *buffer;
+    size_t buffer_size;
+} LibJxlEncodeContext;
+
+/**
+ * Map a quality setting for -qscale roughly from libjpeg
+ * quality numbers to libjxl's butteraugli distance for
+ * photographic content.
+ *
+ * Setting distance explicitly is preferred, but this will
+ * allow qscale to be used as a fallback.
+ *
+ * This function is continuous and injective on [0, 100] which
+ * makes it monotonic.
+ *
+ * @param quality 0.0 to 100.0 quality setting, libjpeg quality
+ * @return Butteraugli distance between 0.0 and 15.0
+ */
+static float quality_to_distance(float quality)
+{
+    if (quality >= 100.0)
+        return 0.0;
+    else if (quality >= 90.0)
+        return (100.0 - quality) * 0.10;
+    else if (quality >= 30.0)
+        return 0.1 + (100.0 - quality) * 0.09;
+
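The doc comment on quality_to_distance() says the mapping is continuous. For the branches visible in this excerpt, that can be checked at the 90 boundary, where both formulas give 1.0: (100 - 90) * 0.10 = 1.0 and 0.1 + (100 - 90) * 0.09 = 1.0. The sketch below restates only those visible branches; the function name is made up, and because the excerpt is cut off below quality 30, the sketch returns a negative sentinel there instead of guessing the missing branch.

```c
/*
 * Restates only the branches shown in the excerpt. The remaining
 * branch (quality < 30) is not visible, so a negative sentinel is
 * returned there; that sentinel is an artefact of this sketch.
 */
static float sketch_quality_to_distance(float quality)
{
    if (quality >= 100.0f)
        return 0.0f;
    if (quality >= 90.0f)
        return (100.0f - quality) * 0.10f;        /* 1.0 at quality 90 */
    if (quality >= 30.0f)
        return 0.1f + (100.0f - quality) * 0.09f; /* tends to 1.0 near 90 */
    return -1.0f; /* not covered by the excerpt */
}
```

Evaluating the sketch just below and exactly at 90 gives values that agree to within rounding, which is what continuity at that boundary requires.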
[FFmpeg-devel] [PATCH v11 1/5] avcodec/jpegxl: add Jpeg XL image codec and parser
This commit adds support to libavcodec to read and parse
encoded Jpeg XL images. Jpeg XL is intended to be an
extended-life replacement to legacy mjpeg.
---
 MAINTAINERS                | 2 +
 libavcodec/Makefile        | 1 +
 libavcodec/codec_desc.c    | 9 +
 libavcodec/codec_id.h      | 1 +
 libavcodec/jpegxl.h        | 43 ++
 libavcodec/jpegxl_parser.c | 951 +
 libavcodec/parsers.c       | 1 +
 libavcodec/version.h       | 2 +-
 8 files changed, 1009 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/jpegxl.h
 create mode 100644 libavcodec/jpegxl_parser.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 76e1332ad8..9ab08bad8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -188,6 +188,7 @@ Codecs:
   interplayvideo.c                      Mike Melanson
   jni*, ffjni*                          Matthieu Bouron
   jpeg2000*                             Nicolas Bertrand
+  jpegxl.h, jpegxl_parser.c             Leo Izen
   jvdec.c                               Peter Ross
   lcl*.c                                Roberto Togni, Reimar Doeffinger
   libcelt_dec.c                         Nicolas George
@@ -617,6 +618,7 @@ Haihao Xiang (haihao)         1F0C 31E8 B4FE F7A4 4DC1 DC99 E0F5 76D4 76FC 437F
 Jaikrishnan Menon             61A1 F09F 01C9 2D45 78E1 C862 25DC 8831 AF70 D368
 James Almer                   7751 2E8C FD94 A169 57E6 9A7A 1463 01AD 7376 59E0
 Jean Delvare                  7CA6 9F44 60F1 BDC4 1FD2 C858 A552 6B9B B3CD 4E6A
+Leo Izen (thebombzen)         B6FD 3CFC 7ACF 83FC 9137 6945 5A71 C331 FD2F A19A
 Loren Merritt                 ABD9 08F4 C920 3F65 D8BE 35D7 1540 DAA7 060F 56DE
 Lynne                         FE50 139C 6805 72CA FD52 1F8D A2FE A5F0 3F03 4464
 Michael Niedermayer           9FF2 128B 147E F673 0BAD F133 611E C787 040B 0FAB
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index fb8b0e824b..3723601b3d 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -44,6 +44,7 @@ OBJS = ac3_parser.o \
        dv_profile.o \
        encode.o \
        imgconvert.o \
+       jpegxl_parser.o \
        jni.o \
        mathtables.o \
        mediacodec.o \
diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index 81f3b3c640..1b82870aaa 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -1863,6 +1863,15 @@ static const AVCodecDescriptor codec_descriptors[] = {
         .long_name = NULL_IF_CONFIG_SMALL("GEM Raster image"),
         .props     = AV_CODEC_PROP_LOSSY,
     },
+    {
+        .id        = AV_CODEC_ID_JPEGXL,
+        .type      = AVMEDIA_TYPE_VIDEO,
+        .name      = "jpegxl",
+        .long_name = NULL_IF_CONFIG_SMALL("JPEG XL"),
+        .props     = AV_CODEC_PROP_INTRA_ONLY | AV_CODEC_PROP_LOSSY |
+                     AV_CODEC_PROP_LOSSLESS,
+        .mime_types= MT("image/jxl"),
+    },
 
     /* various PCM "codecs" */
     {
diff --git a/libavcodec/codec_id.h b/libavcodec/codec_id.h
index 3ffb9bd22e..dbc4f3a208 100644
--- a/libavcodec/codec_id.h
+++ b/libavcodec/codec_id.h
@@ -308,6 +308,7 @@ enum AVCodecID {
     AV_CODEC_ID_SIMBIOSIS_IMX,
     AV_CODEC_ID_SGA_VIDEO,
     AV_CODEC_ID_GEM,
+    AV_CODEC_ID_JPEGXL,
 
     /* various PCM "codecs" */
     AV_CODEC_ID_FIRST_AUDIO = 0x1, ///< A dummy id pointing at the start of audio codecs
diff --git a/libavcodec/jpegxl.h b/libavcodec/jpegxl.h
new file mode 100644
index 00..a0f266c4ff
--- /dev/null
+++ b/libavcodec/jpegxl.h
@@ -0,0 +1,43 @@
+/*
+ * JPEG XL header
+ * Copyright (c) 2021 Leo Izen
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * JPEG XL header
+ */
+
+#ifndef AVCODEC_JPEGXL_H
+#define AVCODEC_JPEGXL_H
+
+#include
+
+/* these are also used in avformat/img2dec.c */
+#define FF_JPEGXL_CODESTREAM_SIGNATURE_LE 0x0aff
+#define
[FFmpeg-devel] [PATCH v11 0/5] Jpeg XL Patch Set
This patchset adds the Jpeg XL Image format and a parser for this format,
as well as a decoder and encoder for it based on the external reference
implementation library, libjxl.

Changes:
v11:
- Fix regression I introduced in v10 with skipping boxes
v10:
- Make changes requested by Andreas Reinhardt from v9
v9:
- v8 with a typo fix
v8:
- v7, but with stylistic changes as requested by Lynne and others on IRC
v7:
- Fully implement the parser and test it against the conformance samples

Leo Izen (5):
  avcodec/jpegxl: add Jpeg XL image codec and parser
  avcodec/libjxl: add Jpeg XL decoding via libjxl
  avcodec/libjxl: add Jpeg XL encoding via libjxl
  avformat/image2: add Jpeg XL as image2 format
  fate/jpegxl: add Jpeg XL demux and parse FATE test

 MAINTAINERS                         | 3 +
 configure                           | 6 +
 doc/general_contents.texi           | 7 +
 libavcodec/Makefile                 | 3 +
 libavcodec/allcodecs.c              | 2 +
 libavcodec/codec_desc.c             | 9 +
 libavcodec/codec_id.h               | 1 +
 libavcodec/jpegxl.h                 | 43 ++
 libavcodec/jpegxl_parser.c          | 951
 libavcodec/libjxl.c                 | 70 ++
 libavcodec/libjxl.h                 | 48 ++
 libavcodec/libjxldec.c              | 301 +
 libavcodec/libjxlenc.c              | 379 +++
 libavcodec/parsers.c                | 1 +
 libavcodec/version.h                | 2 +-
 libavformat/allformats.c            | 1 +
 libavformat/img2.c                  | 1 +
 libavformat/img2dec.c               | 21 +
 libavformat/img2enc.c               | 6 +-
 libavformat/mov.c                   | 1 +
 libavformat/version.h               | 4 +-
 tests/fate/image.mak                | 10 +
 tests/ref/fate/jxl-parse-codestream | 6 +
 tests/ref/fate/jxl-parse-container  | 6 +
 24 files changed, 1876 insertions(+), 6 deletions(-)
 create mode 100644 libavcodec/jpegxl.h
 create mode 100644 libavcodec/jpegxl_parser.c
 create mode 100644 libavcodec/libjxl.c
 create mode 100644 libavcodec/libjxl.h
 create mode 100644 libavcodec/libjxldec.c
 create mode 100644 libavcodec/libjxlenc.c
 create mode 100644 tests/ref/fate/jxl-parse-codestream
 create mode 100644 tests/ref/fate/jxl-parse-container

-- 
2.35.1
[FFmpeg-devel] [PATCH v11 5/5] fate/jpegxl: add Jpeg XL demux and parse FATE test
Add a fate test for the Jpeg XL parser in libavcodec and its
image2 wrapper inside libavformat.
---
 tests/fate/image.mak                | 10 ++
 tests/ref/fate/jxl-parse-codestream | 6 ++
 tests/ref/fate/jxl-parse-container  | 6 ++
 3 files changed, 22 insertions(+)
 create mode 100644 tests/ref/fate/jxl-parse-codestream
 create mode 100644 tests/ref/fate/jxl-parse-container

diff --git a/tests/fate/image.mak b/tests/fate/image.mak
index 573d398915..15b6145c58 100644
--- a/tests/fate/image.mak
+++ b/tests/fate/image.mak
@@ -357,6 +357,16 @@ FATE_JPEGLS-$(call DEMDEC, IMAGE2, JPEGLS) += $(FATE_JPEGLS)
 FATE_IMAGE += $(FATE_JPEGLS-yes)
 fate-jpegls: $(FATE_JPEGLS-yes)
 
+FATE_JPEGXL += fate-jxl-parse-codestream
+fate-jxl-parse-codestream: CMD = framecrc -i $(TARGET_SAMPLES)/jxl/belgium.jxl -c:v copy
+
+FATE_JPEGXL += fate-jxl-parse-container
+fate-jxl-parse-container: CMD = framecrc -i $(TARGET_SAMPLES)/jxl/lenna-256.jxl -c:v copy
+
+FATE_JPEGXL-$(call DEMDEC, IMAGE2) += $(FATE_JPEGXL)
+FATE_IMAGE += $(FATE_JPEGXL-yes)
+fate-jxl: $(FATE_JPEGXL-yes)
+
 FATE_IMAGE-$(call DEMDEC, IMAGE2, QDRAW) += fate-pict
 fate-pict: CMD = framecrc -i $(TARGET_SAMPLES)/quickdraw/TRU256.PCT -pix_fmt rgb24
 
diff --git a/tests/ref/fate/jxl-parse-codestream b/tests/ref/fate/jxl-parse-codestream
new file mode 100644
index 00..b2fe5035ac
--- /dev/null
+++ b/tests/ref/fate/jxl-parse-codestream
@@ -0,0 +1,6 @@
+#tb 0: 1/25
+#media_type 0: video
+#codec_id 0: jpegxl
+#dimensions 0: 768x512
+#sar 0: 0/1
+0,          0,          0,        1,       32, 0xa2930a20
diff --git a/tests/ref/fate/jxl-parse-container b/tests/ref/fate/jxl-parse-container
new file mode 100644
index 00..99233d612a
--- /dev/null
+++ b/tests/ref/fate/jxl-parse-container
@@ -0,0 +1,6 @@
+#tb 0: 1/25
+#media_type 0: video
+#codec_id 0: jpegxl
+#dimensions 0: 256x256
+#sar 0: 0/1
+0,          0,          0,        1,     8088, 0xbbfea9bd
-- 
2.35.1
[FFmpeg-devel] [PATCH v11 4/5] avformat/image2: add Jpeg XL as image2 format
This commit adds support to libavformat for muxing
and demuxing Jpeg XL images as image2 streams.
---
 libavformat/allformats.c | 1 +
 libavformat/img2.c       | 1 +
 libavformat/img2dec.c    | 21 +
 libavformat/img2enc.c    | 6 +++---
 libavformat/mov.c        | 1 +
 libavformat/version.h    | 4 ++--
 6 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/libavformat/allformats.c b/libavformat/allformats.c
index 587ad59b3c..941f3643f8 100644
--- a/libavformat/allformats.c
+++ b/libavformat/allformats.c
@@ -510,6 +510,7 @@ extern const AVInputFormat ff_image_gif_pipe_demuxer;
 extern const AVInputFormat ff_image_j2k_pipe_demuxer;
 extern const AVInputFormat ff_image_jpeg_pipe_demuxer;
 extern const AVInputFormat ff_image_jpegls_pipe_demuxer;
+extern const AVInputFormat ff_image_jpegxl_pipe_demuxer;
 extern const AVInputFormat ff_image_pam_pipe_demuxer;
 extern const AVInputFormat ff_image_pbm_pipe_demuxer;
 extern const AVInputFormat ff_image_pcx_pipe_demuxer;
diff --git a/libavformat/img2.c b/libavformat/img2.c
index 4153102c92..13b1b997b8 100644
--- a/libavformat/img2.c
+++ b/libavformat/img2.c
@@ -87,6 +87,7 @@ const IdStrMap ff_img_tags[] = {
     { AV_CODEC_ID_GEM,        "img" },
     { AV_CODEC_ID_GEM,        "ximg" },
     { AV_CODEC_ID_GEM,        "timg" },
+    { AV_CODEC_ID_JPEGXL,     "jxl" },
     { AV_CODEC_ID_NONE,       NULL }
 };
 
diff --git a/libavformat/img2dec.c b/libavformat/img2dec.c
index b9c06c5b54..32cadacb9d 100644
--- a/libavformat/img2dec.c
+++ b/libavformat/img2dec.c
@@ -32,6 +32,7 @@
 #include "libavutil/parseutils.h"
 #include "libavutil/intreadwrite.h"
 #include "libavcodec/gif.h"
+#include "libavcodec/jpegxl.h"
 #include "avformat.h"
 #include "avio_internal.h"
 #include "internal.h"
@@ -836,6 +837,25 @@ static int jpegls_probe(const AVProbeData *p)
     return 0;
 }
 
+static int jpegxl_probe(const AVProbeData *p)
+{
+    const uint8_t *b = p->buf;
+
+    /* ISOBMFF-based container */
+    /* 0x4a584c20 == "JXL " */
+    if (AV_RL64(b) == FF_JPEGXL_CONTAINER_SIGNATURE_LE)
+        return AVPROBE_SCORE_EXTENSION + 1;
+#if CONFIG_JPEGXL_PARSER
+    /* Raw codestreams all start with 0xff0a */
+    if (AV_RL16(b) != FF_JPEGXL_CODESTREAM_SIGNATURE_LE)
+        return 0;
+    if (avpriv_jpegxl_verify_codestream_header(NULL, p->buf, p->buf_size) == 0)
+        return AVPROBE_SCORE_MAX - 2;
+#endif
+    return 0;
+}
+
+
 static int pcx_probe(const AVProbeData *p)
 {
     const uint8_t *b = p->buf;
@@ -1165,6 +1185,7 @@ IMAGEAUTO_DEMUXER(gif, GIF)
 IMAGEAUTO_DEMUXER_EXT(j2k, JPEG2000, J2K)
 IMAGEAUTO_DEMUXER_EXT(jpeg, MJPEG, JPEG)
 IMAGEAUTO_DEMUXER(jpegls,JPEGLS)
+IMAGEAUTO_DEMUXER(jpegxl,JPEGXL)
 IMAGEAUTO_DEMUXER(pam, PAM)
 IMAGEAUTO_DEMUXER(pbm, PBM)
 IMAGEAUTO_DEMUXER(pcx, PCX)
diff --git a/libavformat/img2enc.c b/libavformat/img2enc.c
index 9b3b8741c8..e6ec6a50aa 100644
--- a/libavformat/img2enc.c
+++ b/libavformat/img2enc.c
@@ -263,9 +263,9 @@ static const AVClass img2mux_class = {
 const AVOutputFormat ff_image2_muxer = {
     .name           = "image2",
     .long_name      = NULL_IF_CONFIG_SMALL("image2 sequence"),
-    .extensions     = "bmp,dpx,exr,jls,jpeg,jpg,ljpg,pam,pbm,pcx,pfm,pgm,pgmyuv,png,"
-                      "ppm,sgi,tga,tif,tiff,jp2,j2c,j2k,xwd,sun,ras,rs,im1,im8,im24,"
-                      "sunras,xbm,xface,pix,y",
+    .extensions     = "bmp,dpx,exr,jls,jpeg,jpg,jxl,ljpg,pam,pbm,pcx,pfm,pgm,pgmyuv,"
+                      "png,ppm,sgi,tga,tif,tiff,jp2,j2c,j2k,xwd,sun,ras,rs,im1,im8,"
+                      "im24,sunras,xbm,xface,pix,y",
     .priv_data_size = sizeof(VideoMuxData),
     .video_codec    = AV_CODEC_ID_MJPEG,
     .write_header   = write_header,
diff --git a/libavformat/mov.c b/libavformat/mov.c
index 6c847de164..c4b8873b0a 100644
--- a/libavformat/mov.c
+++ b/libavformat/mov.c
@@ -7697,6 +7697,7 @@ static int mov_probe(const AVProbeData *p)
         if (tag == MKTAG('f','t','y','p') &&
                    (   AV_RL32(p->buf + offset + 8) == MKTAG('j','p','2',' ')
                     || AV_RL32(p->buf + offset + 8) == MKTAG('j','p','x',' ')
+                    || AV_RL32(p->buf + offset + 8) == MKTAG('j','x','l',' ')
                 )) {
             score = FFMAX(score, 5);
         } else {
diff --git a/libavformat/version.h b/libavformat/version.h
index f4a26c2870..683184d5da 100644
--- a/libavformat/version.h
+++ b/libavformat/version.h
@@ -31,8 +31,8 @@
 
 #include "version_major.h"
 
-#define LIBAVFORMAT_VERSION_MINOR  20
-#define LIBAVFORMAT_VERSION_MICRO 101
+#define LIBAVFORMAT_VERSION_MINOR  21
+#define LIBAVFORMAT_VERSION_MICRO 100
 
 #define LIBAVFORMAT_VERSION_INT AV_VERSION_INT(LIBAVFORMAT_VERSION_MAJOR, \
                                                LIBAVFORMAT_VERSION_MINOR, \
-- 
2.35.1
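The signature checks in jpegxl_probe() can be illustrated without FFmpeg's helpers. The following is a rough sketch rather than the patch's code: the function and enum names are invented here, it compares bytes directly instead of using AV_RL64/AV_RL16, and it stops at the signature instead of verifying the codestream header as the patch does. The 0xff 0x0a codestream prefix and the "JXL " box type come from the comments in the patch; the leading box-size bytes 00 00 00 0c are taken from the JPEG XL container signature and are an assumption relative to this excerpt, which does not show the value of FF_JPEGXL_CONTAINER_SIGNATURE_LE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum SketchJxlKind {
    SKETCH_JXL_NONE,
    SKETCH_JXL_CODESTREAM,
    SKETCH_JXL_CONTAINER,
};

static enum SketchJxlKind sketch_jxl_sniff(const uint8_t *buf, size_t size)
{
    /* First 8 bytes of the container: big-endian box size 12, then "JXL ". */
    static const uint8_t container_start[8] = {
        0x00, 0x00, 0x00, 0x0c, 'J', 'X', 'L', ' '
    };

    if (size >= sizeof(container_start) &&
        memcmp(buf, container_start, sizeof(container_start)) == 0)
        return SKETCH_JXL_CONTAINER;

    /* Raw codestreams start with the two bytes 0xff 0x0a. */
    if (size >= 2 && buf[0] == 0xff && buf[1] == 0x0a)
        return SKETCH_JXL_CODESTREAM;

    return SKETCH_JXL_NONE;
}
```

A two-byte match alone is weak evidence, which is presumably why the patch only returns a high score for raw codestreams after the header has also been verified.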
[FFmpeg-devel] [PATCH v11 2/5] avcodec/libjxl: add Jpeg XL decoding via libjxl
This commit adds decoding support to libavcodec
for Jpeg XL images via the external library libjxl.
---
 MAINTAINERS               | 1 +
 configure                 | 5 +
 doc/general_contents.texi | 7 +
 libavcodec/Makefile       | 1 +
 libavcodec/allcodecs.c    | 1 +
 libavcodec/libjxl.c       | 70 +
 libavcodec/libjxl.h       | 48 ++
 libavcodec/libjxldec.c    | 301 ++
 8 files changed, 434 insertions(+)
 create mode 100644 libavcodec/libjxl.c
 create mode 100644 libavcodec/libjxl.h
 create mode 100644 libavcodec/libjxldec.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 9ab08bad8e..fd79234d23 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -195,6 +195,7 @@ Codecs:
   libcodec2.c                           Tomas Härdin
   libdirac*                             David Conrad
   libdavs2.c                            Huiwen Ren
+  libjxl*.c, libjxl.h                   Leo Izen
   libgsm.c                              Michel Bardiaux
   libkvazaar.c                          Arttu Ylä-Outinen
   libopenh264enc.c                      Martin Storsjo, Linjie Fu
diff --git a/configure b/configure
index e4d36aa639..969b13eba3 100755
--- a/configure
+++ b/configure
@@ -240,6 +240,7 @@ External library support:
   --enable-libiec61883     enable iec61883 via libiec61883 [no]
   --enable-libilbc         enable iLBC de/encoding via libilbc [no]
   --enable-libjack         enable JACK audio sound server [no]
+  --enable-libjxl          enable JPEG XL decoding via libjxl [no]
   --enable-libklvanc       enable Kernel Labs VANC processing [no]
   --enable-libkvazaar      enable HEVC encoding via libkvazaar [no]
   --enable-liblensfun      enable lensfun lens correction [no]
@@ -1833,6 +1834,7 @@ EXTERNAL_LIBRARY_LIST="
     libiec61883
     libilbc
     libjack
+    libjxl
     libklvanc
     libkvazaar
     libmodplug
@@ -3329,6 +3331,7 @@ libgsm_ms_decoder_deps="libgsm"
 libgsm_ms_encoder_deps="libgsm"
 libilbc_decoder_deps="libilbc"
 libilbc_encoder_deps="libilbc"
+libjxl_decoder_deps="libjxl libjxl_threads"
 libkvazaar_encoder_deps="libkvazaar"
 libmodplug_demuxer_deps="libmodplug"
 libmp3lame_encoder_deps="libmp3lame"
@@ -6541,6 +6544,8 @@ enabled libgsm            && { for gsm_hdr in "gsm.h" "gsm/gsm.h"; do
                                check_lib libgsm "${gsm_hdr}" gsm_create -lgsm && break;
                            done || die "ERROR: libgsm not found"; }
 enabled libilbc           && require libilbc ilbc.h WebRtcIlbcfix_InitDecode -lilbc $pthreads_extralibs
+enabled libjxl            && require_pkg_config libjxl "libjxl >= 0.7.0" jxl/decode.h JxlDecoderVersion &&
+                             require_pkg_config libjxl_threads "libjxl_threads >= 0.7.0" jxl/thread_parallel_runner.h JxlThreadParallelRunner
 enabled libklvanc         && require libklvanc libklvanc/vanc.h klvanc_context_create -lklvanc
 enabled libkvazaar        && require_pkg_config libkvazaar "kvazaar >= 0.8.1" kvazaar.h kvz_api_get
 enabled liblensfun        && require_pkg_config liblensfun lensfun lensfun.h lf_db_new
diff --git a/doc/general_contents.texi b/doc/general_contents.texi
index fcd9da1b34..a893347fbe 100644
--- a/doc/general_contents.texi
+++ b/doc/general_contents.texi
@@ -171,6 +171,13 @@ Go to @url{https://github.com/TimothyGu/libilbc} and follow the instructions for
 installing the library. Then pass @code{--enable-libilbc} to configure to
 enable it.
 
+@section libjxl
+
+JPEG XL is an image format intended to fully replace legacy JPEG for an extended
+period of life. See @url{https://jpegxl.info/} for more information, and see
+@url{https://github.com/libjxl/libjxl} for the library source. You can pass
+@code{--enable-libjxl} to configure in order enable the libjxl wrapper.
+
 @section libvpx
 
 FFmpeg can make use of the libvpx library for VP8/VP9 decoding and encoding.
diff --git a/libavcodec/Makefile b/libavcodec/Makefile
index 3723601b3d..c00b0d3246 100644
--- a/libavcodec/Makefile
+++ b/libavcodec/Makefile
@@ -1060,6 +1060,7 @@ OBJS-$(CONFIG_LIBGSM_MS_DECODER)          += libgsmdec.o
 OBJS-$(CONFIG_LIBGSM_MS_ENCODER)          += libgsmenc.o
 OBJS-$(CONFIG_LIBILBC_DECODER)            += libilbc.o
 OBJS-$(CONFIG_LIBILBC_ENCODER)            += libilbc.o
+OBJS-$(CONFIG_LIBJXL_DECODER)             += libjxldec.o libjxl.o
 OBJS-$(CONFIG_LIBKVAZAAR_ENCODER)         += libkvazaar.o
 OBJS-$(CONFIG_LIBMP3LAME_ENCODER)         += libmp3lame.o
 OBJS-$(CONFIG_LIBOPENCORE_AMRNB_DECODER)  += libopencore-amr.o
diff --git a/libavcodec/allcodecs.c b/libavcodec/allcodecs.c
index 22d56760ec..a9cd69dfce 100644
--- a/libavcodec/allcodecs.c
+++ b/libavcodec/allcodecs.c
@@ -749,6 +749,7 @@ extern const FFCodec ff_libgsm_ms_encoder;
 extern const FFCodec ff_libgsm_ms_decoder;
 extern const FFCodec ff_libilbc_encoder;
 extern const FFCodec ff_libilbc_decoder;
+extern const FFCodec ff_libjxl_decoder;
 extern const FFCodec ff_libmp3lame_encoder;
 extern const FFCodec
[FFmpeg-devel] [PATCH v11 1/1] avformat: Add IPFS protocol support.
This patch adds support for:
- ffplay ipfs://
- ffplay ipns://

IPFS data can be played from so called "ipfs gateways".
A gateway is essentially a webserver that gives access to the
distributed IPFS network.

This protocol support (ipfs and ipns) therefore translates
ipfs:// and ipns:// to a http:// url. This resulting url is
then handled by the http protocol. It could also be https
depending on the gateway provided.

To use this protocol, a gateway must be provided.
If you do nothing it will try to find it in your
$HOME/.ipfs/gateway file. The ways to set it manually are:
1. Define a -gateway to the gateway.
2. Define $IPFS_GATEWAY with the full http link to the gateway.
3. Define $IPFS_PATH and point it to the IPFS data path.
4. Have IPFS running in your local user folder (under $HOME/.ipfs).

Signed-off-by: Mark Gaiser
---
 configure                 | 2 +
 doc/protocols.texi        | 30
 libavformat/Makefile      | 2 +
 libavformat/ipfsgateway.c | 328 ++
 libavformat/protocols.c   | 2 +
 5 files changed, 364 insertions(+)
 create mode 100644 libavformat/ipfsgateway.c

diff --git a/configure b/configure
index e4d36aa639..55af90957a 100755
--- a/configure
+++ b/configure
@@ -3579,6 +3579,8 @@ udp_protocol_select="network"
 udplite_protocol_select="network"
 unix_protocol_deps="sys_un_h"
 unix_protocol_select="network"
+ipfs_protocol_select="https_protocol"
+ipns_protocol_select="https_protocol"
 
 # external library protocols
 libamqp_protocol_deps="librabbitmq"
diff --git a/doc/protocols.texi b/doc/protocols.texi
index d207df0b52..7c9c0a4808 100644
--- a/doc/protocols.texi
+++ b/doc/protocols.texi
@@ -2025,5 +2025,35 @@ decoding errors.
 
 @end table
 
+@section ipfs
+
+InterPlanetary File System (IPFS) protocol support. One can access files stored
+on the IPFS network through so called gateways. Those are http(s) endpoints.
+This protocol wraps the IPFS native protocols (ipfs:// and ipns://) to be send
+to such a gateway. Users can (and should) host their own node which means this
+protocol will use your local machine gateway to access files on the IPFS network.
+
+If a user doesn't have a node of their own then the public gateway dweb.link is
+used by default.
+
+You can use this protocol in 2 ways. Using IPFS:
+@example
+ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
+@end example
+
+Or the IPNS protocol (IPNS is mutable IPFS):
+@example
+ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T
+@end example
+
+You can also change the gateway to be used:
+
+@table @option
+
+@item gateway
+Defines the gateway to use. When nothing is provided the protocol will first try
+your local gateway. If that fails dweb.link will be used.
+
+@end table
 
 @c man end PROTOCOLS
diff --git a/libavformat/Makefile b/libavformat/Makefile
index d7182d6bd8..e3233fd7ac 100644
--- a/libavformat/Makefile
+++ b/libavformat/Makefile
@@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL)             += srtpproto.o srtp.o
 OBJS-$(CONFIG_SUBFILE_PROTOCOL)          += subfile.o
 OBJS-$(CONFIG_TEE_PROTOCOL)              += teeproto.o tee_common.o
 OBJS-$(CONFIG_TCP_PROTOCOL)              += tcp.o
+OBJS-$(CONFIG_IPFS_PROTOCOL)             += ipfsgateway.o
+OBJS-$(CONFIG_IPNS_PROTOCOL)             += ipfsgateway.o
 TLS-OBJS-$(CONFIG_GNUTLS)                += tls_gnutls.o
 TLS-OBJS-$(CONFIG_LIBTLS)                += tls_libtls.o
 TLS-OBJS-$(CONFIG_MBEDTLS)               += tls_mbedtls.o
diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c
new file mode 100644
index 00..725cc5e474
--- /dev/null
+++ b/libavformat/ipfsgateway.c
@@ -0,0 +1,328 @@
+/*
+ * IPFS and IPNS protocol support through IPFS Gateway.
+ * Copyright (c) 2022 Mark Gaiser
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/avstring.h"
+#include "libavutil/opt.h"
+#include "url.h"
+#include
+
+typedef struct IPFSGatewayContext {
+    AVClass *class;
+    URLContext *inner;
+    // Is filled by the -gateway argument and not changed after.
+    char *gateway;
+    // If the above gateway is non null, it will be copied into this buffer.
+    // Else this buffer will contain the auto detected gateway.
+    // In either case, the gateway to use
[FFmpeg-devel] [PATCH v11 0/1] Add IPFS protocol support.
Hi,

This patch series adds support for IPFS.

V11:
- Cleaned up the headers. What's there is actually needed now.
- Some more strict checking (namely on fgets)
- Merged long log in one log entry.
- Another allocation check (this time for "fulluri")
- Lots of formatting changes (not visual) to be more in line with the soft
  80 char limit.
V10:
- Removed free on c->gateway in ipfs_close to fix a double free.
V9:
- dweb.link as fallback gateway. This is managed by Protocol Labs (like IPFS).
- Change all errors to warnings as not having a gateway still gives you a
  working video playback.
- Changed the console output to be more clear.
V8:
- Removed unnecessary change to set the first gateway_buffer character to 0.
  It made no sense as the buffer is always overwritten in the function context.
- Change %li to %zu (it's intended to print the sizeof in all cases)
V7:
- Removed sanitize_ipfs_gateway. Only the http/https check stayed and that's
  now in translate_ipfs_to_http.
- Added a check for ipfs_cid. It's only to show an error if someone happens
  to provide `ffplay ipfs://` without a cid.
- All snprintf usages are now checked.
- Adding a / to a gateway if it didn't end with it is now done in the same
  line that composes the full resulting url.
- And a couple more minor things.
V6:
- Moved the gateway buffer (now called gateway_buffer) to IPFSGatewayContext
- Changed nearly all PATH_MAX uses to sizeof(...) uses for future flexibility
- The rest is relatively minor feedback changes
V5:
- "c->gateway" is now not modified anymore
- Moved most variables to the stack
- Even more strict checks with the auto detection logic
- Errors are now AVERROR :)
- Added more logging and changed some debug ones to info ones as they are
  valuable to aid debugging as a user when something goes wrong.
V3 (V4):
- V4: title issue from V3..
- A lot of style changes
- Made url checks a lot more strict
- av_asprintf leak fixes
- So many changes that a diff to v2 is again not sensible.
V2:
- Squashed and changed so much that a diff to v1 was not sensible.

The following is a short summary. In the IPFS ecosystem you access its
content by a "Content IDentifier" (CID). This CID is, in simplified terms,
a hash of the content. IPFS itself is a distributed network where any user
can run a node to be part of the network and access files by their CID.
If any reachable node within that network has the CID, you can get it.

IPFS (as a technology) has two protocols, ipfs and ipns.
The ipfs protocol is the immutable way to access content.
The ipns protocol is a mutable layer on top of it. It's essentially a new CID
that points to an ipfs CID. This "pointer", if you will, can be changed to
point to something else.
Much more information on how this technology works can be found here [1].

This patch series allows interacting natively with IPFS. That means being
able to access files like:
- ffplay ipfs://
- ffplay ipns://

There are multiple ways to access files on the IPFS network. This patch series
uses the gateway driven way. An IPFS node - by default - exposes a local
gateway (say http://localhost:8080) which is then used to get content from
IPFS. The gateway functionality on the IPFS side contains optimizations to be
as ideal for streaming data as it can be. The http protocol in ffmpeg has
such optimizations too, and they are thus reused for free in this approach.

A note on other "more appropriate" ways, as I received some feedback on that.
For ffmpeg purposes the gateway approach is ideal! There is a "libipfs" but
that would spin up an ipfs node with the overhead of:
- bootstrapping
- connecting to nodes
- finding other nodes to connect to
- finally finding your file

This alternative approach could take minutes before a file is played. The
gateway approach immediately connects to an already running node and thus
delivers the file the fastest.
Much of the logic in this patch series is to find that gateway and
essentially rewrite:

"ipfs://"

to:

"http://localhost:8080/ipfs/"

Once that's found it's forwarded to the protocol handler where eventually the
http protocol is going to handle it. Note that it could also be https. There's
enough flexibility in the implementation to allow the user to provide a
gateway. There are also public https gateways which can be used just as well.

After this patch is accepted, I'll work on getting IPFS supported in:
- mpv (requires this ffmpeg patch)
- vlc (prefers this patch but can be made to work without this patch)
- kodi (requires this ffmpeg patch)

Best regards,
Mark Gaiser

[1] https://docs.ipfs.io/concepts/

Mark Gaiser (1):
  avformat: Add IPFS protocol support.

 configure                 |   2 +
 doc/protocols.texi        |  30
 libavformat/Makefile      |   2 +
 libavformat/ipfsgateway.c | 328 ++
 libavformat/protocols.c   |   2 +
 5 files changed, 364 insertions(+)
 create mode 100644 libavformat/ipfsgateway.c

--
2.35.1
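For readers who want to see the rewrite step in isolation, here is a minimal sketch of it. It is not the code from libavformat/ipfsgateway.c: the function name ipfs_uri_to_http and its signature are made up for illustration, and it only mirrors the checks mentioned in the changelog (http/https gateway check, empty CID check, checked snprintf, adding a "/" when the gateway does not end with one).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the function from the patch): rewrite
 * "ipfs://<cid>" or "ipns://<cid>" into "<gateway>/ipfs/<cid>" or
 * "<gateway>/ipns/<cid>". Returns 0 on success and -1 on any error.
 */
static int ipfs_uri_to_http(const char *gateway, const char *uri,
                            char *out, size_t out_size)
{
    const char *kind, *cid;
    size_t gw_len;
    int n;

    assert(gateway && uri && out);

    /* Only http:// and https:// gateways are accepted. */
    if (strncmp(gateway, "http://", 7) && strncmp(gateway, "https://", 8))
        return -1;

    if (!strncmp(uri, "ipfs://", 7))
        kind = "ipfs";
    else if (!strncmp(uri, "ipns://", 7))
        kind = "ipns";
    else
        return -1;

    /* Both schemes have a 7 character prefix. */
    cid = uri + 7;
    if (!*cid)              /* "ipfs://" without a CID is an error */
        return -1;

    /* Add a '/' separator only if the gateway does not end with one. */
    gw_len = strlen(gateway);
    n = snprintf(out, out_size, "%s%s%s/%s", gateway,
                 gateway[gw_len - 1] == '/' ? "" : "/", kind, cid);

    /* snprintf signals truncation by returning a value >= out_size. */
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With a gateway of "http://localhost:8080" and a URI of "ipfs://QmTest" (a placeholder, not a real CID) the buffer ends up holding "http://localhost:8080/ipfs/QmTest"; that string would then be handed to the http(s) protocol handler.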
Re: [FFmpeg-devel] [PATCH v10 1/1] avformat: Add IPFS protocol support.
On Fri, Apr 1, 2022 at 1:01 AM Andreas Rheinhardt < andreas.rheinha...@outlook.com> wrote: > Mark Gaiser: > > On Thu, Mar 31, 2022 at 11:44 PM Andreas Rheinhardt < > > andreas.rheinha...@outlook.com> wrote: > > > >> Mark Gaiser: > >>> On Wed, Mar 30, 2022 at 3:57 PM Andreas Rheinhardt < > >>> andreas.rheinha...@outlook.com> wrote: > >>> > Mark Gaiser: > > On Wed, Mar 30, 2022 at 2:21 PM Andreas Rheinhardt < > > andreas.rheinha...@outlook.com> wrote: > > > >> Mark Gaiser: > >>> This patch adds support for: > >>> - ffplay ipfs:// > >>> - ffplay ipns:// > >>> > >>> IPFS data can be played from so called "ipfs gateways". > >>> A gateway is essentially a webserver that gives access to the > >>> distributed IPFS network. > >>> > >>> This protocol support (ipfs and ipns) therefore translates > >>> ipfs:// and ipns:// to a http:// url. This resulting url is > >>> then handled by the http protocol. It could also be https > >>> depending on the gateway provided. > >>> > >>> To use this protocol, a gateway must be provided. > >>> If you do nothing it will try to find it in your > >>> $HOME/.ipfs/gateway file. The ways to set it manually are: > >>> 1. Define a -gateway to the gateway. > >>> 2. Define $IPFS_GATEWAY with the full http link to the gateway. > >>> 3. Define $IPFS_PATH and point it to the IPFS data path. > >>> 4. Have IPFS running in your local user folder (under $HOME/.ipfs). 
> >>> > >>> Signed-off-by: Mark Gaiser > >>> --- > >>> configure | 2 + > >>> doc/protocols.texi| 30 > >>> libavformat/Makefile | 2 + > >>> libavformat/ipfsgateway.c | 309 > >> ++ > >>> libavformat/protocols.c | 2 + > >>> 5 files changed, 345 insertions(+) > >>> create mode 100644 libavformat/ipfsgateway.c > >>> > >>> diff --git a/configure b/configure > >>> index e4d36aa639..55af90957a 100755 > >>> --- a/configure > >>> +++ b/configure > >>> @@ -3579,6 +3579,8 @@ udp_protocol_select="network" > >>> udplite_protocol_select="network" > >>> unix_protocol_deps="sys_un_h" > >>> unix_protocol_select="network" > >>> +ipfs_protocol_select="https_protocol" > >>> +ipns_protocol_select="https_protocol" > >>> > >>> # external library protocols > >>> libamqp_protocol_deps="librabbitmq" > >>> diff --git a/doc/protocols.texi b/doc/protocols.texi > >>> index d207df0b52..7c9c0a4808 100644 > >>> --- a/doc/protocols.texi > >>> +++ b/doc/protocols.texi > >>> @@ -2025,5 +2025,35 @@ decoding errors. > >>> > >>> @end table > >>> > >>> +@section ipfs > >>> + > >>> +InterPlanetary File System (IPFS) protocol support. One can access > >> files stored > >>> +on the IPFS network through so called gateways. Those are http(s) > >> endpoints. > >>> +This protocol wraps the IPFS native protocols (ipfs:// and > ipns://) > >> to > >> be send > >>> +to such a gateway. Users can (and should) host their own node > which > >> means this > >>> +protocol will use your local machine gateway to access files on > the > >> IPFS network. > >>> + > >>> +If a user doesn't have a node of their own then the public gateway > >> dweb.link is > >>> +used by default. > >>> + > >>> +You can use this protocol in 2 ways. 
Using IPFS: > >>> +@example > >>> +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T > >>> +@end example > >>> + > >>> +Or the IPNS protocol (IPNS is mutable IPFS): > >>> +@example > >>> +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T > >>> +@end example > >>> + > >>> +You can also change the gateway to be used: > >>> + > >>> +@table @option > >>> + > >>> +@item gateway > >>> +Defines the gateway to use. When nothing is provided the protocol > >> will > >> first try > >>> +your local gateway. If that fails dweb.link will be used. > >>> + > >>> +@end table > >>> > >>> @c man end PROTOCOLS > >>> diff --git a/libavformat/Makefile b/libavformat/Makefile > >>> index d7182d6bd8..e3233fd7ac 100644 > >>> --- a/libavformat/Makefile > >>> +++ b/libavformat/Makefile > >>> @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += > >> srtpproto.o srtp.o > >>> OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o > >>> OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o > tee_common.o > >>> OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o > >>> +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o > >>> +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o > >>> TLS-OBJS-$(CONFIG_GNUTLS)+= tls_gnutls.o > >>> TLS-OBJS-$(CONFIG_LIBTLS)+= tls_libtls.o > >>>
Re: [FFmpeg-devel] [PATCH v5 2/2] lavf/mpegenc: fix termination on error conditions
Nicolas Gaullier:
> Avoid an infinite 'retry' loop in output_packet when flushing.
>
> Signed-off-by: Nicolas Gaullier
> ---
>  libavformat/mpegenc.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/mpegenc.c b/libavformat/mpegenc.c
> index e0955a7d33..e113a42867 100644
> --- a/libavformat/mpegenc.c
> +++ b/libavformat/mpegenc.c
> @@ -1002,7 +1002,7 @@ static int output_packet(AVFormatContext *ctx, int flush)
>      MpegMuxContext *s = ctx->priv_data;
>      AVStream *st;
>      StreamInfo *stream;
> -    int i, avail_space = 0, es_size, trailer_size;
> +    int i, has_avail_data = 0, avail_space = 0, es_size, trailer_size;
>      int best_i = -1;
>      int best_score = INT_MIN;
>      int ignore_constraints = 0;
> @@ -1028,6 +1028,7 @@ retry:
>          if (avail_data == 0)
>              continue;
>          av_assert0(avail_data > 0);
> +        has_avail_data = 1;
>
>          if (space < s->packet_size && !ignore_constraints)
>              continue;
> @@ -1048,6 +1049,8 @@ retry:
>          int64_t best_dts = INT64_MAX;
>          int has_premux = 0;
>
> +        if (!has_avail_data)
> +            return 0;
>          for (i = 0; i < ctx->nb_streams; i++) {
>              AVStream *st = ctx->streams[i];
>              StreamInfo *stream = st->priv_data;

in case of errors, the context is left in an inconsistent state: The
PacketDesc linked-list claims that there is data in the FIFO although this is
wrong. I always prefer avoiding such scenarios over fixing them later on. In
this case, fixing them would mean growing the FIFO before allocating the new
PacketDesc (if the FIFO needs growing at all). Shall I do this or do you want
to? (In any case, thanks for reporting this issue.)

- Andreas
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
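The ordering suggested in the reply above (grow the FIFO first, allocate the descriptor second, and only then commit) can be sketched with a toy queue. This is an illustration under stated assumptions, not mpegenc.c: Desc, Queue, queue_init, queue_push and queue_free are hypothetical stand-ins for the real PacketDesc list and FIFO.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the mpegenc.c structures. */
typedef struct Desc {
    struct Desc *next;
    size_t size;            /* bytes this descriptor claims in the FIFO */
} Desc;

typedef struct Queue {
    unsigned char *fifo;    /* byte buffer */
    size_t cap, used;
    Desc *head, **tail;     /* descriptor list, one entry per push */
} Queue;

static void queue_init(Queue *q)
{
    q->fifo = NULL;
    q->cap  = q->used = 0;
    q->head = NULL;
    q->tail = &q->head;
}

static int queue_push(Queue *q, const void *data, size_t size)
{
    Desc *d;

    assert(q && q->tail && data && size > 0);

    /* 1. Grow the byte FIFO first, and only if it is too small. */
    if (q->cap - q->used < size) {
        size_t new_cap = q->used + size;
        unsigned char *p = realloc(q->fifo, new_cap);
        if (!p)
            return -1;      /* the queue is untouched */
        q->fifo = p;
        q->cap  = new_cap;
    }

    /* 2. Allocate the descriptor. If this fails, the FIFO may have spare
     *    capacity, but no descriptor claims data that is not there. */
    d = malloc(sizeof(*d));
    if (!d)
        return -1;

    /* 3. Nothing below can fail: commit data and descriptor together. */
    memcpy(q->fifo + q->used, data, size);
    q->used += size;
    d->size  = size;
    d->next  = NULL;
    *q->tail = d;
    q->tail  = &d->next;
    return 0;
}

static void queue_free(Queue *q)
{
    while (q->head) {
        Desc *next = q->head->next;
        free(q->head);
        q->head = next;
    }
    free(q->fifo);
    queue_init(q);
}
```

The point of the ordering is that every early return leaves the descriptor list and the byte count in agreement, so a later flush never sees a descriptor without matching data.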
Re: [FFmpeg-devel] [PATCH v10 1/1] avformat: Add IPFS protocol support.
Mark Gaiser: > On Thu, Mar 31, 2022 at 11:44 PM Andreas Rheinhardt < > andreas.rheinha...@outlook.com> wrote: > >> Mark Gaiser: >>> On Wed, Mar 30, 2022 at 3:57 PM Andreas Rheinhardt < >>> andreas.rheinha...@outlook.com> wrote: >>> Mark Gaiser: > On Wed, Mar 30, 2022 at 2:21 PM Andreas Rheinhardt < > andreas.rheinha...@outlook.com> wrote: > >> Mark Gaiser: >>> This patch adds support for: >>> - ffplay ipfs:// >>> - ffplay ipns:// >>> >>> IPFS data can be played from so called "ipfs gateways". >>> A gateway is essentially a webserver that gives access to the >>> distributed IPFS network. >>> >>> This protocol support (ipfs and ipns) therefore translates >>> ipfs:// and ipns:// to a http:// url. This resulting url is >>> then handled by the http protocol. It could also be https >>> depending on the gateway provided. >>> >>> To use this protocol, a gateway must be provided. >>> If you do nothing it will try to find it in your >>> $HOME/.ipfs/gateway file. The ways to set it manually are: >>> 1. Define a -gateway to the gateway. >>> 2. Define $IPFS_GATEWAY with the full http link to the gateway. >>> 3. Define $IPFS_PATH and point it to the IPFS data path. >>> 4. Have IPFS running in your local user folder (under $HOME/.ipfs). 
>>> >>> Signed-off-by: Mark Gaiser >>> --- >>> configure | 2 + >>> doc/protocols.texi| 30 >>> libavformat/Makefile | 2 + >>> libavformat/ipfsgateway.c | 309 >> ++ >>> libavformat/protocols.c | 2 + >>> 5 files changed, 345 insertions(+) >>> create mode 100644 libavformat/ipfsgateway.c >>> >>> diff --git a/configure b/configure >>> index e4d36aa639..55af90957a 100755 >>> --- a/configure >>> +++ b/configure >>> @@ -3579,6 +3579,8 @@ udp_protocol_select="network" >>> udplite_protocol_select="network" >>> unix_protocol_deps="sys_un_h" >>> unix_protocol_select="network" >>> +ipfs_protocol_select="https_protocol" >>> +ipns_protocol_select="https_protocol" >>> >>> # external library protocols >>> libamqp_protocol_deps="librabbitmq" >>> diff --git a/doc/protocols.texi b/doc/protocols.texi >>> index d207df0b52..7c9c0a4808 100644 >>> --- a/doc/protocols.texi >>> +++ b/doc/protocols.texi >>> @@ -2025,5 +2025,35 @@ decoding errors. >>> >>> @end table >>> >>> +@section ipfs >>> + >>> +InterPlanetary File System (IPFS) protocol support. One can access >> files stored >>> +on the IPFS network through so called gateways. Those are http(s) >> endpoints. >>> +This protocol wraps the IPFS native protocols (ipfs:// and ipns://) >> to >> be send >>> +to such a gateway. Users can (and should) host their own node which >> means this >>> +protocol will use your local machine gateway to access files on the >> IPFS network. >>> + >>> +If a user doesn't have a node of their own then the public gateway >> dweb.link is >>> +used by default. >>> + >>> +You can use this protocol in 2 ways. Using IPFS: >>> +@example >>> +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T >>> +@end example >>> + >>> +Or the IPNS protocol (IPNS is mutable IPFS): >>> +@example >>> +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T >>> +@end example >>> + >>> +You can also change the gateway to be used: >>> + >>> +@table @option >>> + >>> +@item gateway >>> +Defines the gateway to use. 
When nothing is provided the protocol >> will >> first try >>> +your local gateway. If that fails dweb.link will be used. >>> + >>> +@end table >>> >>> @c man end PROTOCOLS >>> diff --git a/libavformat/Makefile b/libavformat/Makefile >>> index d7182d6bd8..e3233fd7ac 100644 >>> --- a/libavformat/Makefile >>> +++ b/libavformat/Makefile >>> @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += >> srtpproto.o srtp.o >>> OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o >>> OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o >>> OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o >>> +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o >>> +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o >>> TLS-OBJS-$(CONFIG_GNUTLS)+= tls_gnutls.o >>> TLS-OBJS-$(CONFIG_LIBTLS)+= tls_libtls.o >>> TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o >>> diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c >>> new file mode 100644 >>> index 00..1a039589c0 >>> --- /dev/null >>> +++ b/libavformat/ipfsgateway.c >>> @@ -0,0 +1,309 @@ >>> +/* >>> + * IPFS and IPNS protocol support through IPFS Gateway.
Re: [FFmpeg-devel] [PATCH v10 1/1] avformat: Add IPFS protocol support.
On Fri, Apr 1, 2022 at 12:17 AM Mark Gaiser wrote: > On Thu, Mar 31, 2022 at 11:44 PM Andreas Rheinhardt < > andreas.rheinha...@outlook.com> wrote: > >> Mark Gaiser: >> > On Wed, Mar 30, 2022 at 3:57 PM Andreas Rheinhardt < >> > andreas.rheinha...@outlook.com> wrote: >> > >> >> Mark Gaiser: >> >>> On Wed, Mar 30, 2022 at 2:21 PM Andreas Rheinhardt < >> >>> andreas.rheinha...@outlook.com> wrote: >> >>> >> Mark Gaiser: >> > This patch adds support for: >> > - ffplay ipfs:// >> > - ffplay ipns:// >> > >> > IPFS data can be played from so called "ipfs gateways". >> > A gateway is essentially a webserver that gives access to the >> > distributed IPFS network. >> > >> > This protocol support (ipfs and ipns) therefore translates >> > ipfs:// and ipns:// to a http:// url. This resulting url is >> > then handled by the http protocol. It could also be https >> > depending on the gateway provided. >> > >> > To use this protocol, a gateway must be provided. >> > If you do nothing it will try to find it in your >> > $HOME/.ipfs/gateway file. The ways to set it manually are: >> > 1. Define a -gateway to the gateway. >> > 2. Define $IPFS_GATEWAY with the full http link to the gateway. >> > 3. Define $IPFS_PATH and point it to the IPFS data path. >> > 4. Have IPFS running in your local user folder (under $HOME/.ipfs). 
>> > >> > Signed-off-by: Mark Gaiser >> > --- >> > configure | 2 + >> > doc/protocols.texi| 30 >> > libavformat/Makefile | 2 + >> > libavformat/ipfsgateway.c | 309 >> ++ >> > libavformat/protocols.c | 2 + >> > 5 files changed, 345 insertions(+) >> > create mode 100644 libavformat/ipfsgateway.c >> > >> > diff --git a/configure b/configure >> > index e4d36aa639..55af90957a 100755 >> > --- a/configure >> > +++ b/configure >> > @@ -3579,6 +3579,8 @@ udp_protocol_select="network" >> > udplite_protocol_select="network" >> > unix_protocol_deps="sys_un_h" >> > unix_protocol_select="network" >> > +ipfs_protocol_select="https_protocol" >> > +ipns_protocol_select="https_protocol" >> > >> > # external library protocols >> > libamqp_protocol_deps="librabbitmq" >> > diff --git a/doc/protocols.texi b/doc/protocols.texi >> > index d207df0b52..7c9c0a4808 100644 >> > --- a/doc/protocols.texi >> > +++ b/doc/protocols.texi >> > @@ -2025,5 +2025,35 @@ decoding errors. >> > >> > @end table >> > >> > +@section ipfs >> > + >> > +InterPlanetary File System (IPFS) protocol support. One can access >> files stored >> > +on the IPFS network through so called gateways. Those are http(s) >> endpoints. >> > +This protocol wraps the IPFS native protocols (ipfs:// and >> ipns://) to >> be send >> > +to such a gateway. Users can (and should) host their own node which >> means this >> > +protocol will use your local machine gateway to access files on the >> IPFS network. >> > + >> > +If a user doesn't have a node of their own then the public gateway >> dweb.link is >> > +used by default. >> > + >> > +You can use this protocol in 2 ways. 
Using IPFS: >> > +@example >> > +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T >> > +@end example >> > + >> > +Or the IPNS protocol (IPNS is mutable IPFS): >> > +@example >> > +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T >> > +@end example >> > + >> > +You can also change the gateway to be used: >> > + >> > +@table @option >> > + >> > +@item gateway >> > +Defines the gateway to use. When nothing is provided the protocol >> will >> first try >> > +your local gateway. If that fails dweb.link will be used. >> > + >> > +@end table >> > >> > @c man end PROTOCOLS >> > diff --git a/libavformat/Makefile b/libavformat/Makefile >> > index d7182d6bd8..e3233fd7ac 100644 >> > --- a/libavformat/Makefile >> > +++ b/libavformat/Makefile >> > @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += >> srtpproto.o srtp.o >> > OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o >> > OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o >> > OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o >> > +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o >> > +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o >> > TLS-OBJS-$(CONFIG_GNUTLS)+= tls_gnutls.o >> > TLS-OBJS-$(CONFIG_LIBTLS)+= tls_libtls.o >> > TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o >> > diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c >> > new file mode 100644 >> > index 00..1a039589c0 >> >
Re: [FFmpeg-devel] [PATCH v10 1/1] avformat: Add IPFS protocol support.
On Thu, Mar 31, 2022 at 11:44 PM Andreas Rheinhardt < andreas.rheinha...@outlook.com> wrote: > Mark Gaiser: > > On Wed, Mar 30, 2022 at 3:57 PM Andreas Rheinhardt < > > andreas.rheinha...@outlook.com> wrote: > > > >> Mark Gaiser: > >>> On Wed, Mar 30, 2022 at 2:21 PM Andreas Rheinhardt < > >>> andreas.rheinha...@outlook.com> wrote: > >>> > Mark Gaiser: > > This patch adds support for: > > - ffplay ipfs:// > > - ffplay ipns:// > > > > IPFS data can be played from so called "ipfs gateways". > > A gateway is essentially a webserver that gives access to the > > distributed IPFS network. > > > > This protocol support (ipfs and ipns) therefore translates > > ipfs:// and ipns:// to a http:// url. This resulting url is > > then handled by the http protocol. It could also be https > > depending on the gateway provided. > > > > To use this protocol, a gateway must be provided. > > If you do nothing it will try to find it in your > > $HOME/.ipfs/gateway file. The ways to set it manually are: > > 1. Define a -gateway to the gateway. > > 2. Define $IPFS_GATEWAY with the full http link to the gateway. > > 3. Define $IPFS_PATH and point it to the IPFS data path. > > 4. Have IPFS running in your local user folder (under $HOME/.ipfs). 
> > > > Signed-off-by: Mark Gaiser > > --- > > configure | 2 + > > doc/protocols.texi| 30 > > libavformat/Makefile | 2 + > > libavformat/ipfsgateway.c | 309 > ++ > > libavformat/protocols.c | 2 + > > 5 files changed, 345 insertions(+) > > create mode 100644 libavformat/ipfsgateway.c > > > > diff --git a/configure b/configure > > index e4d36aa639..55af90957a 100755 > > --- a/configure > > +++ b/configure > > @@ -3579,6 +3579,8 @@ udp_protocol_select="network" > > udplite_protocol_select="network" > > unix_protocol_deps="sys_un_h" > > unix_protocol_select="network" > > +ipfs_protocol_select="https_protocol" > > +ipns_protocol_select="https_protocol" > > > > # external library protocols > > libamqp_protocol_deps="librabbitmq" > > diff --git a/doc/protocols.texi b/doc/protocols.texi > > index d207df0b52..7c9c0a4808 100644 > > --- a/doc/protocols.texi > > +++ b/doc/protocols.texi > > @@ -2025,5 +2025,35 @@ decoding errors. > > > > @end table > > > > +@section ipfs > > + > > +InterPlanetary File System (IPFS) protocol support. One can access > files stored > > +on the IPFS network through so called gateways. Those are http(s) > endpoints. > > +This protocol wraps the IPFS native protocols (ipfs:// and ipns://) > to > be send > > +to such a gateway. Users can (and should) host their own node which > means this > > +protocol will use your local machine gateway to access files on the > IPFS network. > > + > > +If a user doesn't have a node of their own then the public gateway > dweb.link is > > +used by default. > > + > > +You can use this protocol in 2 ways. Using IPFS: > > +@example > > +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T > > +@end example > > + > > +Or the IPNS protocol (IPNS is mutable IPFS): > > +@example > > +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T > > +@end example > > + > > +You can also change the gateway to be used: > > + > > +@table @option > > + > > +@item gateway > > +Defines the gateway to use. 
When nothing is provided the protocol > will > first try > > +your local gateway. If that fails dweb.link will be used. > > + > > +@end table > > > > @c man end PROTOCOLS > > diff --git a/libavformat/Makefile b/libavformat/Makefile > > index d7182d6bd8..e3233fd7ac 100644 > > --- a/libavformat/Makefile > > +++ b/libavformat/Makefile > > @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += > srtpproto.o srtp.o > > OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o > > OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o > > OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o > > +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o > > +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o > > TLS-OBJS-$(CONFIG_GNUTLS)+= tls_gnutls.o > > TLS-OBJS-$(CONFIG_LIBTLS)+= tls_libtls.o > > TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o > > diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c > > new file mode 100644 > > index 00..1a039589c0 > > --- /dev/null > > +++ b/libavformat/ipfsgateway.c > > @@ -0,0 +1,309 @@ > > +/* > > + * IPFS and IPNS protocol support through IPFS Gateway. > > + * Copyright
Re: [FFmpeg-devel] [PATCH v3 00/10] avcodec/vc1: Arm optimisations
On Thu, 31 Mar 2022, Ben Avison wrote:

The VC1 decoder was missing lots of important fast paths for Arm, especially
for 64-bit Arm. This submission fills in implementations for all functions
where a fast path already existed and the fallback C implementation was
taking 1% or more of the runtime, and adds a new fast path to permit
vc1_unescape_buffer() to be overridden.

I've measured the playback speed on a 1.5 GHz Cortex-A72 (Raspberry Pi 4)
using `ffmpeg -i -f null -` for a couple of example streams:

Architecture:  AArch32  AArch32  AArch64  AArch64
Stream:        1        2        1        2
Before speed:  1.22x    0.82x    1.00x    0.67x
After speed:   1.31x    0.98x    1.39x    1.06x
Improvement:   7.4%     20%      39%      58%

`make fate` passes on both AArch32 and AArch64.

Changes in v2:
* Refactor checkasm tests to convert some macros into functions.
* Remove cast-to-void of checked_call.
* Limit 16-bit values in idctdsp checkasm test to +/-0x100.
* Reinstate ff_add_pixels_clamped_arm.
* Adapt vc1 deblocking filters to specify stride as ptrdiff_t.
* Add align specifiers to a few VLD/VST instructions for AArch32 deblocking
  filter, and adapt checkasm test not to test with tighter alignment than is
  encountered in normal use.
* Correct unescape buffer memcmp length.
* Update benchmarks for AArch64 idctdsp.

Thanks! From a quick readthrough, this version of the patchset seems good to
me! I'll run it through some more testing, and push it if everything seems to
work fine (tomorrow or so).

// Martin
Re: [FFmpeg-devel] [PATCH v10 1/1] avformat: Add IPFS protocol support.
Mark Gaiser: > On Wed, Mar 30, 2022 at 3:57 PM Andreas Rheinhardt < > andreas.rheinha...@outlook.com> wrote: > >> Mark Gaiser: >>> On Wed, Mar 30, 2022 at 2:21 PM Andreas Rheinhardt < >>> andreas.rheinha...@outlook.com> wrote: >>> Mark Gaiser: > This patch adds support for: > - ffplay ipfs:// > - ffplay ipns:// > > IPFS data can be played from so called "ipfs gateways". > A gateway is essentially a webserver that gives access to the > distributed IPFS network. > > This protocol support (ipfs and ipns) therefore translates > ipfs:// and ipns:// to a http:// url. This resulting url is > then handled by the http protocol. It could also be https > depending on the gateway provided. > > To use this protocol, a gateway must be provided. > If you do nothing it will try to find it in your > $HOME/.ipfs/gateway file. The ways to set it manually are: > 1. Define a -gateway to the gateway. > 2. Define $IPFS_GATEWAY with the full http link to the gateway. > 3. Define $IPFS_PATH and point it to the IPFS data path. > 4. Have IPFS running in your local user folder (under $HOME/.ipfs). > > Signed-off-by: Mark Gaiser > --- > configure | 2 + > doc/protocols.texi| 30 > libavformat/Makefile | 2 + > libavformat/ipfsgateway.c | 309 ++ > libavformat/protocols.c | 2 + > 5 files changed, 345 insertions(+) > create mode 100644 libavformat/ipfsgateway.c > > diff --git a/configure b/configure > index e4d36aa639..55af90957a 100755 > --- a/configure > +++ b/configure > @@ -3579,6 +3579,8 @@ udp_protocol_select="network" > udplite_protocol_select="network" > unix_protocol_deps="sys_un_h" > unix_protocol_select="network" > +ipfs_protocol_select="https_protocol" > +ipns_protocol_select="https_protocol" > > # external library protocols > libamqp_protocol_deps="librabbitmq" > diff --git a/doc/protocols.texi b/doc/protocols.texi > index d207df0b52..7c9c0a4808 100644 > --- a/doc/protocols.texi > +++ b/doc/protocols.texi > @@ -2025,5 +2025,35 @@ decoding errors. 
> > @end table > > +@section ipfs > + > +InterPlanetary File System (IPFS) protocol support. One can access files stored > +on the IPFS network through so called gateways. Those are http(s) endpoints. > +This protocol wraps the IPFS native protocols (ipfs:// and ipns://) to be send > +to such a gateway. Users can (and should) host their own node which means this > +protocol will use your local machine gateway to access files on the IPFS network. > + > +If a user doesn't have a node of their own then the public gateway dweb.link is > +used by default. > + > +You can use this protocol in 2 ways. Using IPFS: > +@example > +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T > +@end example > + > +Or the IPNS protocol (IPNS is mutable IPFS): > +@example > +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T > +@end example > + > +You can also change the gateway to be used: > + > +@table @option > + > +@item gateway > +Defines the gateway to use. When nothing is provided the protocol will first try > +your local gateway. If that fails dweb.link will be used. > + > +@end table > > @c man end PROTOCOLS > diff --git a/libavformat/Makefile b/libavformat/Makefile > index d7182d6bd8..e3233fd7ac 100644 > --- a/libavformat/Makefile > +++ b/libavformat/Makefile > @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += srtpproto.o srtp.o > OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o > OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o > OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o > +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o > +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o > TLS-OBJS-$(CONFIG_GNUTLS)+= tls_gnutls.o > TLS-OBJS-$(CONFIG_LIBTLS)+= tls_libtls.o > TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o > diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c > new file mode 100644 > index 00..1a039589c0 > --- /dev/null > +++ b/libavformat/ipfsgateway.c > @@ -0,0 +1,309 @@ > +/* > + * IPFS and IPNS protocol support through IPFS Gateway. 
> + * Copyright (c) 2022 Mark Gaiser > + * > + * This file is part of FFmpeg. > + * > + * FFmpeg is free software; you can redistribute it and/or > + * modify it under the terms of the GNU Lesser General Public > + * License as published by the Free Software Foundation; either > + * version 2.1 of the License, or (at your option)
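The commit message quoted above says the protocol translates ipfs:// and ipns:// URLs into an http(s) URL on a gateway. A minimal sketch of that rewrite follows. It is not the code from the patch: the helper name, the `<gateway>/ipfs/<cid>` path layout, and the example gateway addresses are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: rewrite "ipfs://<cid>" or
 * "ipns://<name>" into "<gateway>/ipfs/<cid>" or "<gateway>/ipns/<name>".
 * Returns 0 on success, -1 on an unsupported scheme or a too-small buffer. */
static int ipfs_to_gateway_url(const char *uri, const char *gateway,
                               char *out, size_t out_size)
{
    const char *kind, *rest;
    size_t glen;
    int n;

    assert(uri && gateway && out);

    if (!strncmp(uri, "ipfs://", 7))
        kind = "ipfs";
    else if (!strncmp(uri, "ipns://", 7))
        kind = "ipns";
    else
        return -1;
    rest = uri + 7;

    /* Drop trailing slashes so the result never contains "//ipfs/". */
    glen = strlen(gateway);
    while (glen > 0 && gateway[glen - 1] == '/')
        glen--;

    n = snprintf(out, out_size, "%.*s/%s/%s", (int)glen, gateway, kind, rest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string would then be handed to the existing http protocol, which is the design the commit message describes; gateway discovery ($IPFS_GATEWAY, $IPFS_PATH, $HOME/.ipfs) is left out of this sketch.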
Re: [FFmpeg-devel] [PATCH 08/10] avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths
On Thu, 31 Mar 2022, Ben Avison wrote: On 30/03/2022 15:14, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: +// Clamp 16-bit signed block coefficients to signed 8-bit (biased by 128) +// On entry: +// x0 -> array of 64x 16-bit coefficients +// x1 -> 8-bit results +// x2 = row stride for results, bytes +function ff_put_signed_pixels_clamped_neon, export=1 + ld1 {v0.16b, v1.16b, v2.16b, v3.16b}, [x0], #64 + movi v4.8b, #128 + ld1 {v16.16b, v17.16b, v18.16b, v19.16b}, [x0] + sqxtn v0.8b, v0.8h + sqxtn v1.8b, v1.8h + sqxtn v2.8b, v2.8h + sqxtn v3.8b, v3.8h + sqxtn v5.8b, v16.8h + add v0.8b, v0.8b, v4.8b Here you could save 4 add instructions with sqxtn2 and adding .16b vectors, but I'm not sure if it's worthwhile. (It reduces the checkasm numbers by 0.7 for Cortex A72, by 0.3 for A73, but increases the runtime by 1.0 on A53.) Strangely enough, I get much smaller numbers on my A72 than you got. That's weird. As you say, it should be independent of clock-frequency. FWIW, I'm benchmarking on a Raspberry Pi 4; I'd assume all its board variants' Cortex-A72 cores are of identical revision. Now I run it again, I'm getting these figures: idctdsp.add_pixels_clamped_c: 313.3 idctdsp.add_pixels_clamped_neon: 24.3 idctdsp.put_pixels_clamped_c: 220.3 idctdsp.put_pixels_clamped_neon: 15.5 idctdsp.put_signed_pixels_clamped_c: 210.5 idctdsp.put_signed_pixels_clamped_neon: 19.5 which is more in line with what you see! I am getting a lot of variability between runs though - from a small sample, I'm seeing add_pixels_clamped_neon coming out as anything from 21 to 30, which is well above the sort of differences you're seeing between alternate implementations. That's indeed weird. I don't have a Raspberry Pi 4 myself though, but for functions in this size range on the devboards I test on, I get essentially perfectly stable numbers each time - which is great for empirically testing different implementation strategies.
This sort of case is always going to be difficult to schedule optimally for multiple cores - factors like how much dual-issuing is possible, latency before values can be used, load speed and the granularity of scoreboarding parts of vectors, all vary widely. Yup, indeed. In most cases, an implementation that is good for one core is usually decent for the other ones as well, but sometimes it ends up a compromise, where optimizing for one makes things worse for another one. As long as the chosen implementation isn't very suboptimal for some common cores, it probably doesn't matter much though. // Martin ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
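For readers following the sqxtn/add discussion, here is a scalar sketch of what the quoted function computes per element: saturate a 16-bit coefficient to the signed 8-bit range, then bias it by 128. The function names are made up for this illustration, and the loop is written from the comment block in the quoted patch rather than copied from FFmpeg's C reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturate to [-128, 127] (the SQXTN step), then add 128 so the value is
 * stored as an unsigned pixel (the ADD of the #128 vector). */
static uint8_t clamp_signed_biased(int16_t coeff)
{
    int v = coeff;
    if (v < -128)
        v = -128;
    if (v > 127)
        v = 127;
    return (uint8_t)(v + 128);
}

/* Scalar sketch of an 8x8 "put signed pixels clamped" operation.
 * block:  64 coefficients, row after row
 * pixels: destination, `stride` bytes between rows */
static void put_signed_pixels_clamped_sketch(const int16_t *block,
                                             uint8_t *pixels,
                                             ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            pixels[x] = clamp_signed_biased(block[y * 8 + x]);
        pixels += stride;
    }
}
```

The sqxtn2 variant suggested in the review would narrow two rows into one 128-bit register before the add, which halves the number of add instructions without changing the per-element result shown here.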
Re: [FFmpeg-devel] [PATCH 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform fast paths
On Thu, 31 Mar 2022, Ben Avison wrote: On 30/03/2022 14:49, Martin Storsjö wrote: Looks generally reasonable. Is it possible to factorize out the individual transforms (so that you'd e.g. invoke the same macro twice in the 8x8 and 4x4 functions) without too much loss? There is a close analogy here with the vertical/horizontal deblocking filters, because while there are similarities between the two matrix multiplications within a transform, one of them follows a series of loads and the other follows a matrix transposition. If you look for example at ff_vc1_inv_trans_8x8_neon, you'll see I was able to do a fair amount of overlap between sections of the function - particularly between the transpose and the second matrix multiplication, but to a lesser extent between the loads and the first matrix multiplication and between the second multiplication and the stores. This sort of overlapping is tricky to maintain when using macros. Also, it means the order of operations within each matrix multiply ended up quite different. At first sight, you might think that the multiplies from the 8x8 function (which you might also view as a kind of 8-tap filter) would be re-usable for the size-8 multiplies in the 8x4 or 4x8 function. Yes, the instructions are similar, save for using .4h elements rather than .8h elements, but that has significant impacts on scheduling. For example, the Cortex-A72, which is my primary target, can only do NEON bit-shifts in one pipeline at once, irrespective of whether the vectors are 64-bit or 128-bit long, while other instructions don't have such restrictions. So while in theory you could factor some of this code out more, I suspect any attempt to do so would have a detrimental effect on performance. Ok, fair enough. Yes, it's always a trade off between code simplicity and getting the optimal interleaving. As you've spent the effort on making it efficient with respect to that, let's go with that then!
(FWIW, for future endeavours, having the checkasm tests in place while developing/tuning the implementation does allow getting good empirical data on how much you gain from different alternative scheduling choices. I usually don't follow the optimization guides for any specific core, but track the benchmark numbers for a couple different cores and try to pick a scheduling that is a decent compromise for all of them.) Also, for future work - if you have checkasm tests in place while working on the assembly, I usually amend the test with debug printouts that visualize the output of the reference and the tested function, and a map showing which elements differ - which makes tracking down issues a whole lot easier. I don't think any of the checkasm tests in ffmpeg have such printouts though, but within e.g. the dav1d project, the checkasm tool is extended with helpers for comparing and printing such debug aids. // Martin
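The kind of debug aid described above could look roughly like the sketch below. The helper is hypothetical: it is not part of FFmpeg's checkasm and is not dav1d's actual helper, only an illustration of printing a map of which elements differ between the reference output and the output under test.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a w x h map of two coefficient blocks,
 * '.' where they agree and 'x' where they differ.
 * Returns the number of mismatching elements. */
static int print_diff_map(FILE *out, const int16_t *ref, const int16_t *tst,
                          int w, int h)
{
    int mismatches = 0;

    assert(out && ref && tst && w > 0 && h > 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int differ = ref[y * w + x] != tst[y * w + x];
            mismatches += differ;
            fputc(differ ? 'x' : '.', out);
        }
        fputc('\n', out);
    }
    return mismatches;
}
```

In a transform test, a column or row of 'x' marks would point at one pass of the transform, while a scattered pattern would more likely indicate a rounding or saturation difference.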
[FFmpeg-devel] [PATCH v2 1/8] fate/filter-refcmp-*: make refcmp_metadata fail on empty or truncated input
On empty input the awk script was always successful which caused the filter-refcmp tests to always succeed. Also fix the command lines for refcmp_metadata compare function because it needs auto conversion filters, and update reference of test filter-refcmp-psnr-rgb because it was missed in a7fc78c1a638a32c3695c06f727774c740d675c2 but was never noticed due to the original issue... Signed-off-by: Marton Balint --- tests/fate-run.sh | 2 +- tests/ref/fate/filter-refcmp-psnr-rgb | 80 +-- tests/refcmp-metadata.awk | 5 +- 3 files changed, 45 insertions(+), 42 deletions(-) diff --git a/tests/fate-run.sh b/tests/fate-run.sh index fbfc0a925d..5e8d607d88 100755 --- a/tests/fate-run.sh +++ b/tests/fate-run.sh @@ -377,7 +377,7 @@ refcmp_metadata(){ refcmp=$1 pixfmt=$2 fuzz=${3:-0.001} -ffmpeg $FLAGS $ENC_OPTS \ +ffmpeg -auto_conversion_filters $FLAGS $ENC_OPTS \ -lavfi "testsrc2=size=300x200:rate=1:duration=5,format=${pixfmt},split[ref][tmp];[tmp]avgblur=4[enc];[enc][ref]${refcmp},metadata=print:file=-" \ -f null /dev/null | awk -v ref=${ref} -v fuzz=${fuzz} -f ${base}/refcmp-metadata.awk - } diff --git a/tests/ref/fate/filter-refcmp-psnr-rgb b/tests/ref/fate/filter-refcmp-psnr-rgb index f06db575ac..20abd3dc5a 100644 --- a/tests/ref/fate/filter-refcmp-psnr-rgb +++ b/tests/ref/fate/filter-refcmp-psnr-rgb @@ -1,45 +1,45 @@ frame:0pts:0 pts_time:0 -lavfi.psnr.mse.r=1381.80 -lavfi.psnr.psnr.r=16.73 -lavfi.psnr.mse.g=896.00 -lavfi.psnr.psnr.g=18.61 -lavfi.psnr.mse.b=277.38 -lavfi.psnr.psnr.b=23.70 -lavfi.psnr.mse_avg=851.73 -lavfi.psnr.psnr_avg=18.83 +lavfi.psnr.mse.r=1367.642090 +lavfi.psnr.psnr.r=16.771078 +lavfi.psnr.mse.g=885.804382 +lavfi.psnr.psnr.g=18.657425 +lavfi.psnr.mse.b=274.825073 +lavfi.psnr.psnr.b=23.740240 +lavfi.psnr.mse_avg=842.757202 +lavfi.psnr.psnr_avg=18.873779 frame:1pts:1 pts_time:1 -lavfi.psnr.mse.r=1380.37 -lavfi.psnr.psnr.r=16.73 -lavfi.psnr.mse.g=975.91 -lavfi.psnr.psnr.g=18.24 -lavfi.psnr.mse.b=435.72 -lavfi.psnr.psnr.b=21.74 -lavfi.psnr.mse_avg=930.67 
-lavfi.psnr.psnr_avg=18.44 +lavfi.psnr.mse.r=1356.681152 +lavfi.psnr.psnr.r=16.806026 +lavfi.psnr.mse.g=958.161560 +lavfi.psnr.psnr.g=18.316416 +lavfi.psnr.mse.b=428.238312 +lavfi.psnr.psnr.b=21.813948 +lavfi.psnr.mse_avg=914.360352 +lavfi.psnr.psnr_avg=18.519630 frame:2pts:2 pts_time:2 -lavfi.psnr.mse.r=1403.20 -lavfi.psnr.psnr.r=16.66 -lavfi.psnr.mse.g=954.05 -lavfi.psnr.psnr.g=18.34 -lavfi.psnr.mse.b=494.22 -lavfi.psnr.psnr.b=21.19 -lavfi.psnr.mse_avg=950.49 -lavfi.psnr.psnr_avg=18.35 +lavfi.psnr.mse.r=1387.254883 +lavfi.psnr.psnr.r=16.709242 +lavfi.psnr.mse.g=939.230957 +lavfi.psnr.psnr.g=18.403080 +lavfi.psnr.mse.b=493.913757 +lavfi.psnr.psnr.b=21.194292 +lavfi.psnr.mse_avg=940.133179 +lavfi.psnr.psnr_avg=18.398911 frame:3pts:3 pts_time:3 -lavfi.psnr.mse.r=1452.80 -lavfi.psnr.psnr.r=16.51 -lavfi.psnr.mse.g=1001.02 -lavfi.psnr.psnr.g=18.13 -lavfi.psnr.mse.b=557.39 -lavfi.psnr.psnr.b=20.67 -lavfi.psnr.mse_avg=1003.74 -lavfi.psnr.psnr_avg=18.11 +lavfi.psnr.mse.r=1433.291260 +lavfi.psnr.psnr.r=16.567459 +lavfi.psnr.mse.g=990.005859 +lavfi.psnr.psnr.g=18.174425 +lavfi.psnr.mse.b=550.512329 +lavfi.psnr.psnr.b=20.723133 +lavfi.psnr.mse_avg=991.269836 +lavfi.psnr.psnr_avg=18.168884 frame:4pts:4 pts_time:4 -lavfi.psnr.mse.r=1401.25 -lavfi.psnr.psnr.r=16.67 -lavfi.psnr.mse.g=1009.80 -lavfi.psnr.psnr.g=18.09 -lavfi.psnr.mse.b=602.42 -lavfi.psnr.psnr.b=20.33 -lavfi.psnr.mse_avg=1004.49 -lavfi.psnr.psnr_avg=18.11 +lavfi.psnr.mse.r=1385.949341 +lavfi.psnr.psnr.r=16.713329 +lavfi.psnr.mse.g=997.065796 +lavfi.psnr.psnr.g=18.143566 +lavfi.psnr.mse.b=601.962952 +lavfi.psnr.psnr.b=20.335106 +lavfi.psnr.mse_avg=994.992676 +lavfi.psnr.psnr_avg=18.152605 diff --git a/tests/refcmp-metadata.awk b/tests/refcmp-metadata.awk index fa21aad0e0..850aaac5a3 100644 --- a/tests/refcmp-metadata.awk +++ b/tests/refcmp-metadata.awk @@ -50,13 +50,16 @@ BEGIN { } END { +result = result && (NR == ref_nr); if (result) { for (i = 1; i <= ref_nr; i++) print ref_lines[i]; } else { for (i = 1; i <= NR; 
i++) print cmp_lines[i]; -if (NR != ref_nr) +if (NR == 0) +print "[refcmp] no input" > "/dev/stderr"; +else if (NR != ref_nr) print "[refcmp] lines: " NR " != " ref_nr > "/dev/stderr"; if (delta_max >= fuzz) print "[refcmp] delta_max: " delta_max " >= " fuzz > "/dev/stderr"; -- 2.31.1
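The rule this patch adds to the awk script, expressed as a small C sketch for clarity: a comparison only passes if the number of compared entries equals the number of reference entries and every difference stays below the fuzz threshold. The function name and the array-based interface are assumptions made for this illustration; the real check operates on lines of metadata text.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if cmp matches ref within fuzz, 0 otherwise.
 * A count mismatch (including empty or truncated input) is a failure,
 * mirroring "result = result && (NR == ref_nr)" in the patch. */
static int refcmp_values_match(const double *ref, size_t ref_n,
                               const double *cmp, size_t cmp_n,
                               double fuzz)
{
    assert(ref && cmp);

    if (cmp_n != ref_n)
        return 0;

    for (size_t i = 0; i < ref_n; i++) {
        double delta = cmp[i] - ref[i];
        if (delta < 0)
            delta = -delta;
        /* Same comparison direction as the script: delta >= fuzz fails. */
        if (delta >= fuzz)
            return 0;
    }
    return 1;
}
```

Before the patch, the equivalent of the count check was missing, so zero input lines produced zero differences and the test passed.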
Re: [FFmpeg-devel] [PATCH 05/10] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths
On Thu, 31 Mar 2022, Ben Avison wrote: On 30/03/2022 13:35, Martin Storsjö wrote: Overall, the code looks sensible to me. Would it make sense to share the core of the filter between the horizontal/vertical cases with e.g. a macro? (I didn't check in detail if there's much differences in the core of the filter. At most some differences in condition registers for partial writeout in the horizontal forms?) Well, looking at the comments at the right-hand side of the source, which give the logical meaning of the results of each instruction, I admit there's a resemblance in the middle of the 8-pixel-pair function. Actually, I didn't try to follow/compare it to that level, I just assumed them to be similar. However, the physical register assignments are quite different, and attempting to reassign the registers in one to match the other isn't a trivial task. It's hard enough when you start register assignment from the top of a function and work your way down, as I have done here. In the 16-pixel-pair case, the fact that the input values arrive in a different order as the result of them, in one case, being loaded in regularly-increasing address order, and in the other, falling out of a matrix transposition, has resulted in even the logical order of instructions being quite different in the two cases. In the 4-pixel-pair case, the values are packed differently into registers in the two cases, because in the v case, we're loading 4 pixels between row-strides, which means it's easy to place each row in its own vector, whereas in the h case we load 4 rows of 8 pixels each and transpose, which leaves the values in 4 vectors rather than 8. Some of the filtering steps can be performed with the data packed in this way (calculating a1 and a2) while waiting for it to be restructured in order to calculate the other metrics, but it's not worth packing the data together in this way in the v case given that it starts off already separated. 
So the two implementations end up quite different in the operations they perform, not just the scheduling of instructions and in register assignment terms. Some background: as you may have guessed, I didn't start out writing these functions as they currently appear. Prototype versions didn't care much for scheduling or keeping to a small number of registers. They were primarily for checking the correctness of the mathematics, and they'd use all available vectors, sometimes shuffling values between registers or to the stack to make room. Once I'd verified correctness, I then reworked them to keep to a minimal number of registers and to minimise stalls as far as possible. I'm targeting the Cortex-A72, since that's what the Raspberry Pi 4 uses and it's on the cusp of having enough power to decode VC-1 BluRay streams, so I deliberately didn't take too much consideration of the requirements of earlier cores. Yes, it's an out-of-order core, but I reckoned there are probably limits to how wisely it can select instructions to execute (there have got to be limits to instruction queue lengths, for example). So based on the pipeline structure documented in Arm's Cortex-A72 software optimization guide, I arranged the instructions to best keep all pipelines busy as much as possible, then assigned registers to keep the instructions in this order. For the most part, I was able to keep the number of vectors used low enough that no callee-saving was required - or failing that, at least avoiding having to spill values to the stack mid-function. But it came pretty close at times - witness for example the peculiar order in which vectors had to be loaded in the AArch32 version of ff_vc1_h_loop_filter16_neon. There's reason behind that! In short, I'd really rather not tamper with these larger assembly functions any more unless I really have to. Ok, fair enough.
FWIW, my point of view was from implementing the loop filters for VP9 and AV1, where I did the core filter as one shared implementation for both variants, and where the frontend functions just load (and transpose) data into the registers used as input for the common core filter, and vice versa. But I presume that a custom implementation for each of them can be more optimal, at the cost of more code to maintain (but if there are no bugs, it usually doesn't need maintenance either). Thus - fair enough, this code probably is ok then. // Martin
[FFmpeg-devel] [PATCH 2/2] avcodec/vp9_raw_reorder_bsf: Merge close and flush
Also mark the function as av_cold while at it. Signed-off-by: Andreas Rheinhardt --- libavcodec/vp9_raw_reorder_bsf.c | 16 +++- 1 file changed, 3 insertions(+), 13 deletions(-) diff --git a/libavcodec/vp9_raw_reorder_bsf.c b/libavcodec/vp9_raw_reorder_bsf.c index 368dcb26c2..d36093316c 100644 --- a/libavcodec/vp9_raw_reorder_bsf.c +++ b/libavcodec/vp9_raw_reorder_bsf.c @@ -390,7 +390,7 @@ fail: return err; } -static void vp9_raw_reorder_flush(AVBSFContext *bsf) +static av_cold void vp9_raw_reorder_flush_close(AVBSFContext *bsf) { VP9RawReorderContext *ctx = bsf->priv_data; @@ -400,16 +400,6 @@ static void vp9_raw_reorder_flush(AVBSFContext *bsf) ctx->sequence = 0; } -static void vp9_raw_reorder_close(AVBSFContext *bsf) -{ -VP9RawReorderContext *ctx = bsf->priv_data; -int s; - -for (s = 0; s < FRAME_SLOTS; s++) -vp9_raw_reorder_clear_slot(ctx, s); -vp9_raw_reorder_frame_free(&ctx->next_frame); -} - static const enum AVCodecID vp9_raw_reorder_codec_ids[] = { AV_CODEC_ID_VP9, AV_CODEC_ID_NONE, }; @@ -418,7 +408,7 @@ const FFBitStreamFilter ff_vp9_raw_reorder_bsf = { .p.name = "vp9_raw_reorder", .p.codec_ids= vp9_raw_reorder_codec_ids, .priv_data_size = sizeof(VP9RawReorderContext), -.close = &vp9_raw_reorder_close, -.flush = &vp9_raw_reorder_flush, .filter = &vp9_raw_reorder_filter, +.flush = &vp9_raw_reorder_flush_close, +.close = &vp9_raw_reorder_flush_close, }; -- 2.32.0
[FFmpeg-devel] [PATCH 1/2] avcodec/vp9_raw_reorder_bsf: Fix leak of cached packet
In case the BSF has not been drained before flushing/closing, the context's next_frame might be set; yet it is not freed in flush or close. The former only zeroes it (which automatically causes a leak in case it was set). So do this when closing and flushing. Signed-off-by: Andreas Rheinhardt --- libavcodec/vp9_raw_reorder_bsf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/vp9_raw_reorder_bsf.c b/libavcodec/vp9_raw_reorder_bsf.c index e7d301cb85..368dcb26c2 100644 --- a/libavcodec/vp9_raw_reorder_bsf.c +++ b/libavcodec/vp9_raw_reorder_bsf.c @@ -396,7 +396,7 @@ static void vp9_raw_reorder_flush(AVBSFContext *bsf) for (int s = 0; s < FRAME_SLOTS; s++) vp9_raw_reorder_clear_slot(ctx, s); -ctx->next_frame = NULL; +vp9_raw_reorder_frame_free(&ctx->next_frame); ctx->sequence = 0; } @@ -407,6 +407,7 @@ static void vp9_raw_reorder_close(AVBSFContext *bsf) for (s = 0; s < FRAME_SLOTS; s++) vp9_raw_reorder_clear_slot(ctx, s); +vp9_raw_reorder_frame_free(&ctx->next_frame); } static const enum AVCodecID vp9_raw_reorder_codec_ids[] = { -- 2.32.0
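The two vp9_raw_reorder patches boil down to one pattern: release a cached object through a free function that also resets the pointer, and let flush and close share that code. The sketch below uses stand-in types and names (Frame, Context, frame_free, flush_close) that are assumptions for illustration only; they are not the real VP9RawReorder structures or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in types for illustration; not the real bitstream filter context. */
typedef struct Frame {
    int payload;
} Frame;

typedef struct Context {
    Frame   *next_frame;  /* cached frame that may still be set on flush/close */
    unsigned sequence;
} Context;

/* Free *frame and reset the pointer, so calling this twice is harmless. */
static void frame_free(Frame **frame)
{
    assert(frame);
    free(*frame);
    *frame = NULL;
}

/* One function used for both flush and close. Writing
 * "ctx->next_frame = NULL" here instead would drop the only reference
 * to the cached frame and leak it. */
static void flush_close(Context *ctx)
{
    assert(ctx);
    frame_free(&ctx->next_frame);
    ctx->sequence = 0;
}
```

Because the free helper leaves the pointer NULL, a flush followed by a close does not double-free, which is what makes sharing one function between the two callbacks safe in this sketch.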
Re: [FFmpeg-devel] [PATCH] avcodec/libvpxenc: enable dynamic max quantizer parameter reconfiguration
On Wed, Mar 30, 2022 at 11:25 AM Danil Chapovalov wrote: > > --- > doc/encoders.texi | 3 +++ > libavcodec/libvpxenc.c | 6 ++ > 2 files changed, 9 insertions(+) > lgtm. I'll submit this with a patch version bump soon if there aren't any further comments.
[FFmpeg-devel] [PATCH v3 10/10] avcodec/vc1: Arm 32-bit NEON unescape fast path
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_unescape_buffer_c: 918624.7 vc1dsp.vc1_unescape_buffer_neon: 142958.0 Signed-off-by: Ben Avison --- libavcodec/arm/vc1dsp_init_neon.c | 61 +++ libavcodec/arm/vc1dsp_neon.S | 118 ++ 2 files changed, 179 insertions(+) diff --git a/libavcodec/arm/vc1dsp_init_neon.c b/libavcodec/arm/vc1dsp_init_neon.c index f5f5c702d7..48cb816b70 100644 --- a/libavcodec/arm/vc1dsp_init_neon.c +++ b/libavcodec/arm/vc1dsp_init_neon.c @@ -19,6 +19,7 @@ #include #include "libavutil/attributes.h" +#include "libavutil/intreadwrite.h" #include "libavcodec/vc1dsp.h" #include "vc1dsp.h" @@ -84,6 +85,64 @@ void ff_put_vc1_chroma_mc4_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride, void ff_avg_vc1_chroma_mc4_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride, int h, int x, int y); +int ff_vc1_unescape_buffer_helper_neon(const uint8_t *src, int size, uint8_t *dst); + +static int vc1_unescape_buffer_neon(const uint8_t *src, int size, uint8_t *dst) +{ +/* Dealing with starting and stopping, and removing escape bytes, are + * comparatively less time-sensitive, so are more clearly expressed using + * a C wrapper around the assembly inner loop. Note that we assume a + * little-endian machine that supports unaligned loads. 
 */ +int dsize = 0; +while (size >= 4) +{ +int found = 0; +while (!found && (((uintptr_t) dst) & 7) && size >= 4) +{ +found = (AV_RL32(src) &~ 0x03000000) == 0x00030000; +if (!found) +{ +*dst++ = *src++; +--size; +++dsize; +} +} +if (!found) +{ +int skip = size - ff_vc1_unescape_buffer_helper_neon(src, size, dst); +dst += skip; +src += skip; +size -= skip; +dsize += skip; +while (!found && size >= 4) +{ +found = (AV_RL32(src) &~ 0x03000000) == 0x00030000; +if (!found) +{ +*dst++ = *src++; +--size; +++dsize; +} +} +} +if (found) +{ +*dst++ = *src++; +*dst++ = *src++; +++src; +size -= 3; +dsize += 2; +} +} +while (size > 0) +{ +*dst++ = *src++; +--size; +++dsize; +} +return dsize; +} + #define FN_ASSIGN(X, Y) \ dsp->put_vc1_mspel_pixels_tab[0][X+4*Y] = ff_put_vc1_mspel_mc##X##Y##_16_neon; \ dsp->put_vc1_mspel_pixels_tab[1][X+4*Y] = ff_put_vc1_mspel_mc##X##Y##_neon @@ -130,4 +189,6 @@ av_cold void ff_vc1dsp_init_neon(VC1DSPContext *dsp) dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_neon; dsp->put_no_rnd_vc1_chroma_pixels_tab[1] = ff_put_vc1_chroma_mc4_neon; dsp->avg_no_rnd_vc1_chroma_pixels_tab[1] = ff_avg_vc1_chroma_mc4_neon; + +dsp->vc1_unescape_buffer = vc1_unescape_buffer_neon; } diff --git a/libavcodec/arm/vc1dsp_neon.S b/libavcodec/arm/vc1dsp_neon.S index ba54221ef6..96014fbebc 100644 --- a/libavcodec/arm/vc1dsp_neon.S +++ b/libavcodec/arm/vc1dsp_neon.S @@ -1804,3 +1804,121 @@ function ff_vc1_h_loop_filter16_neon, export=1 4: vpop {d8-d15} pop {r4-r6,pc} endfunc + +@ Copy at most the specified number of bytes from source to destination buffer, +@ stopping at a multiple of 16 bytes, none of which are the start of an escape sequence +@ On entry: +@ r0 -> source buffer +@ r1 = max number of bytes to copy +@ r2 -> destination buffer, optimally 8-byte aligned +@ On exit: +@ r0 = number of bytes not copied +function ff_vc1_unescape_buffer_helper_neon, export=1 +@ Offset by 48 to screen out cases that are too short for us to handle, +@ and also make it easy to test
for loop termination, or to determine +@ whether we need an odd number of half-iterations of the loop. +subs r1, r1, #48 +bmi 90f + +@ Set up useful constants +vmov.i32 q0, #0x3000000 +vmov.i32 q1, #0x30000 + +tst r1, #16 +bne 1f + + vld1.8 {q8, q9}, [r0]! + vbic q12, q8, q0 + vext.8 q13, q8, q9, #1 + vext.8 q14, q8, q9, #2 + vext.8 q15, q8, q9, #3 + veor q12, q12, q1 + vbic q13, q13, q0 + vbic q14, q14, q0 + vbic q15, q15, q0 + vceq.i32 q12, q12, #0 + veor q13, q13, q1 + veor q14, q14, q1 + veor q15, q15, q1 + vceq.i32 q13, q13, #0 + vceq.i32 q14, q14, #0 + vceq.i32 q15,
[FFmpeg-devel] [PATCH v3 09/10] avcodec/vc1: Arm 64-bit NEON unescape fast path
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. vc1dsp.vc1_unescape_buffer_c: 655617.7 vc1dsp.vc1_unescape_buffer_neon: 118237.0 Signed-off-by: Ben Avison --- libavcodec/aarch64/vc1dsp_init_aarch64.c | 61 libavcodec/aarch64/vc1dsp_neon.S | 176 +++ 2 files changed, 237 insertions(+) diff --git a/libavcodec/aarch64/vc1dsp_init_aarch64.c b/libavcodec/aarch64/vc1dsp_init_aarch64.c index e0eb52dd63..a7976fd596 100644 --- a/libavcodec/aarch64/vc1dsp_init_aarch64.c +++ b/libavcodec/aarch64/vc1dsp_init_aarch64.c @@ -21,6 +21,7 @@ #include "libavutil/attributes.h" #include "libavutil/cpu.h" #include "libavutil/aarch64/cpu.h" +#include "libavutil/intreadwrite.h" #include "libavcodec/vc1dsp.h" #include "config.h" @@ -51,6 +52,64 @@ void ff_put_vc1_chroma_mc4_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride, void ff_avg_vc1_chroma_mc4_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride, int h, int x, int y); +int ff_vc1_unescape_buffer_helper_neon(const uint8_t *src, int size, uint8_t *dst); + +static int vc1_unescape_buffer_neon(const uint8_t *src, int size, uint8_t *dst) +{ +/* Dealing with starting and stopping, and removing escape bytes, are + * comparatively less time-sensitive, so are more clearly expressed using + * a C wrapper around the assembly inner loop. Note that we assume a + * little-endian machine that supports unaligned loads. 
 */ +int dsize = 0; +while (size >= 4) +{ +int found = 0; +while (!found && (((uintptr_t) dst) & 7) && size >= 4) +{ +found = (AV_RL32(src) &~ 0x03000000) == 0x00030000; +if (!found) +{ +*dst++ = *src++; +--size; +++dsize; +} +} +if (!found) +{ +int skip = size - ff_vc1_unescape_buffer_helper_neon(src, size, dst); +dst += skip; +src += skip; +size -= skip; +dsize += skip; +while (!found && size >= 4) +{ +found = (AV_RL32(src) &~ 0x03000000) == 0x00030000; +if (!found) +{ +*dst++ = *src++; +--size; +++dsize; +} +} +} +if (found) +{ +*dst++ = *src++; +*dst++ = *src++; +++src; +size -= 3; +dsize += 2; +} +} +while (size > 0) +{ +*dst++ = *src++; +--size; +++dsize; +} +return dsize; +} + av_cold void ff_vc1dsp_init_aarch64(VC1DSPContext *dsp) { int cpu_flags = av_get_cpu_flags(); @@ -76,5 +135,7 @@ av_cold void ff_vc1dsp_init_aarch64(VC1DSPContext *dsp) dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_neon; dsp->put_no_rnd_vc1_chroma_pixels_tab[1] = ff_put_vc1_chroma_mc4_neon; dsp->avg_no_rnd_vc1_chroma_pixels_tab[1] = ff_avg_vc1_chroma_mc4_neon; + +dsp->vc1_unescape_buffer = vc1_unescape_buffer_neon; } } diff --git a/libavcodec/aarch64/vc1dsp_neon.S b/libavcodec/aarch64/vc1dsp_neon.S index 0201db4f78..9a96c2523c 100644 --- a/libavcodec/aarch64/vc1dsp_neon.S +++ b/libavcodec/aarch64/vc1dsp_neon.S @@ -1368,3 +1368,179 @@ function ff_vc1_h_loop_filter16_neon, export=1 st2 {v2.b, v3.b}[7], [x6] 4: ret endfunc + +// Copy at most the specified number of bytes from source to destination buffer, +// stopping at a multiple of 32 bytes, none of which are the start of an escape sequence +// On entry: +// x0 -> source buffer +// w1 = max number of bytes to copy +// x2 -> destination buffer, optimally 8-byte aligned +// On exit: +// w0 = number of bytes not copied +function ff_vc1_unescape_buffer_helper_neon, export=1 +// Offset by 80 to screen out cases that are too short for us to handle, +// and also make it easy to test for loop termination, or to determine +// whether we need
an odd number of half-iterations of the loop. +subs w1, w1, #80 +b.mi 90f + +// Set up useful constants +movi v20.4s, #3, lsl #24 +movi v21.4s, #3, lsl #16 + +tst w1, #32 +b.ne 1f + + ld1 {v0.16b, v1.16b, v2.16b}, [x0], #48 + ext v25.16b, v0.16b, v1.16b, #1 + ext v26.16b, v0.16b, v1.16b, #2 + ext v27.16b, v0.16b, v1.16b, #3 + ext v29.16b, v1.16b, v2.16b, #1 + ext v30.16b, v1.16b, v2.16b, #2 + ext v31.16b, v1.16b, v2.16b, #3 + bic v24.16b, v0.16b, v20.16b + bic v25.16b, v25.16b, v20.16b + bic v26.16b, v26.16b, v20.16b + bic
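As a reading aid for the C wrapper in this patch, the sketch below shows the escape test on its own, without the NEON helper or the alignment loop. An escape sequence is the byte pattern 00 00 03 followed by a byte no greater than 3; the 03 is dropped on output. The function names are made up for this illustration, and the masks 0x03000000 and 0x00030000 correspond to the "#3, lsl #24" and "#3, lsl #16" constants set up in the assembly helper.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian 32-bit read assembled from bytes, so it is safe for
 * unaligned addresses and independent of host endianness. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scalar sketch of the unescape loop. Returns the number of bytes
 * written to dst, which must have room for `size` bytes. */
static int unescape_sketch(const uint8_t *src, int size, uint8_t *dst)
{
    int dsize = 0;

    assert(src && dst && size >= 0);
    while (size >= 4) {
        /* Clearing bits 24-25 ignores the low two bits of the 4th byte,
         * so this matches 00 00 03 xx for xx in 0..3. */
        if ((read_le32(src) & ~UINT32_C(0x03000000)) == UINT32_C(0x00030000)) {
            *dst++ = *src++;   /* first 00 */
            *dst++ = *src++;   /* second 00 */
            ++src;             /* drop the 03 escape byte */
            size  -= 3;
            dsize += 2;
        } else {
            *dst++ = *src++;
            --size;
            ++dsize;
        }
    }
    /* Fewer than 4 bytes left: no complete escape sequence is possible. */
    while (size > 0) {
        *dst++ = *src++;
        --size;
        ++dsize;
    }
    return dsize;
}
```

The patch's wrapper follows the same logic but hands long escape-free runs to the assembly helper, which is where the benchmarked speedup comes from.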
[FFmpeg-devel] [PATCH v3 08/10] avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. idctdsp.add_pixels_clamped_c: 313.3 idctdsp.add_pixels_clamped_neon: 24.3 idctdsp.put_pixels_clamped_c: 220.3 idctdsp.put_pixels_clamped_neon: 15.5 idctdsp.put_signed_pixels_clamped_c: 210.5 idctdsp.put_signed_pixels_clamped_neon: 19.5 Signed-off-by: Ben Avison --- libavcodec/aarch64/Makefile | 3 +- libavcodec/aarch64/idctdsp_init_aarch64.c | 26 +++-- libavcodec/aarch64/idctdsp_neon.S | 130 ++ 3 files changed, 150 insertions(+), 9 deletions(-) create mode 100644 libavcodec/aarch64/idctdsp_neon.S diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile index 5b25e4dfb9..c8935f205e 100644 --- a/libavcodec/aarch64/Makefile +++ b/libavcodec/aarch64/Makefile @@ -44,7 +44,8 @@ NEON-OBJS-$(CONFIG_H264PRED)+= aarch64/h264pred_neon.o NEON-OBJS-$(CONFIG_H264QPEL)+= aarch64/h264qpel_neon.o \ aarch64/hpeldsp_neon.o NEON-OBJS-$(CONFIG_HPELDSP) += aarch64/hpeldsp_neon.o -NEON-OBJS-$(CONFIG_IDCTDSP) += aarch64/simple_idct_neon.o +NEON-OBJS-$(CONFIG_IDCTDSP) += aarch64/idctdsp_neon.o \ + aarch64/simple_idct_neon.o NEON-OBJS-$(CONFIG_MDCT)+= aarch64/mdct_neon.o NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o NEON-OBJS-$(CONFIG_PIXBLOCKDSP) += aarch64/pixblockdsp_neon.o diff --git a/libavcodec/aarch64/idctdsp_init_aarch64.c b/libavcodec/aarch64/idctdsp_init_aarch64.c index 742a3372e3..eec21aa5a2 100644 --- a/libavcodec/aarch64/idctdsp_init_aarch64.c +++ b/libavcodec/aarch64/idctdsp_init_aarch64.c @@ -27,19 +27,29 @@ #include "libavcodec/idctdsp.h" #include "idct.h" +void ff_put_pixels_clamped_neon(const int16_t *, uint8_t *, ptrdiff_t); +void ff_put_signed_pixels_clamped_neon(const int16_t *, uint8_t *, ptrdiff_t); +void ff_add_pixels_clamped_neon(const int16_t *, uint8_t *, ptrdiff_t); + av_cold void ff_idctdsp_init_aarch64(IDCTDSPContext *c, AVCodecContext *avctx, unsigned high_bit_depth) { int cpu_flags = av_get_cpu_flags(); -if (have_neon(cpu_flags) && !avctx->lowres && !high_bit_depth) { -if 
(avctx->idct_algo == FF_IDCT_AUTO || -avctx->idct_algo == FF_IDCT_SIMPLEAUTO || -avctx->idct_algo == FF_IDCT_SIMPLENEON) { -c->idct_put = ff_simple_idct_put_neon; -c->idct_add = ff_simple_idct_add_neon; -c->idct = ff_simple_idct_neon; -c->perm_type = FF_IDCT_PERM_PARTTRANS; +if (have_neon(cpu_flags)) { +if (!avctx->lowres && !high_bit_depth) { +if (avctx->idct_algo == FF_IDCT_AUTO || +avctx->idct_algo == FF_IDCT_SIMPLEAUTO || +avctx->idct_algo == FF_IDCT_SIMPLENEON) { +c->idct_put = ff_simple_idct_put_neon; +c->idct_add = ff_simple_idct_add_neon; +c->idct = ff_simple_idct_neon; +c->perm_type = FF_IDCT_PERM_PARTTRANS; +} } + +c->add_pixels_clamped= ff_add_pixels_clamped_neon; +c->put_pixels_clamped= ff_put_pixels_clamped_neon; +c->put_signed_pixels_clamped = ff_put_signed_pixels_clamped_neon; } } diff --git a/libavcodec/aarch64/idctdsp_neon.S b/libavcodec/aarch64/idctdsp_neon.S new file mode 100644 index 00..7f47611206 --- /dev/null +++ b/libavcodec/aarch64/idctdsp_neon.S @@ -0,0 +1,130 @@ +/* + * IDCT AArch64 NEON optimisations + * + * Copyright (c) 2022 Ben Avison + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include "libavutil/aarch64/asm.S" + +// Clamp 16-bit signed block coefficients to unsigned 8-bit +// On entry: +// x0 -> array of 64x 16-bit coefficients +// x1 -> 8-bit results +// x2 = row stride for results, bytes +function ff_put_pixels_clamped_neon, export=1 +ld1 {v0.16b, v1.16b, v2.16b, v3.16b}, [x0], #64 +ld1 {v4.16b, v5.16b, v6.16b, v7.16b}, [x0] +sqxtun v0.8b, v0.8h +sqxtun v1.8b, v1.8h +sqxtun v2.8b, v2.8h +
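For comparison with the assembly, here is a scalar sketch of the two unsigned operations this patch accelerates, written from the comment block above ("Clamp 16-bit signed block coefficients to unsigned 8-bit"). The function names are hypothetical and the code is not copied from FFmpeg's C implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an integer to the 0..255 pixel range. */
static uint8_t clip_uint8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* "put": overwrite an 8x8 block of pixels with clamped coefficients
 * (per element, what SQXTUN does in the NEON version). */
static void put_pixels_clamped_sketch(const int16_t *block, uint8_t *pixels,
                                      ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            pixels[x] = clip_uint8(block[y * 8 + x]);
        pixels += stride;
    }
}

/* "add": add the coefficients to the existing pixels, then clamp. */
static void add_pixels_clamped_sketch(const int16_t *block, uint8_t *pixels,
                                      ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            pixels[x] = clip_uint8(pixels[x] + block[y * 8 + x]);
        pixels += stride;
    }
}
```

Each call touches only 64 elements, which is consistent with the small cycle counts in the checkasm figures and with the run-to-run variability discussed in the review thread.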
[FFmpeg-devel] [PATCH v3 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform fast paths
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows.

vc1dsp.vc1_inv_trans_4x4_c: 158.2
vc1dsp.vc1_inv_trans_4x4_neon: 65.7
vc1dsp.vc1_inv_trans_4x4_dc_c: 86.5
vc1dsp.vc1_inv_trans_4x4_dc_neon: 26.5
vc1dsp.vc1_inv_trans_4x8_c: 335.2
vc1dsp.vc1_inv_trans_4x8_neon: 106.2
vc1dsp.vc1_inv_trans_4x8_dc_c: 151.2
vc1dsp.vc1_inv_trans_4x8_dc_neon: 25.5
vc1dsp.vc1_inv_trans_8x4_c: 365.7
vc1dsp.vc1_inv_trans_8x4_neon: 97.2
vc1dsp.vc1_inv_trans_8x4_dc_c: 139.7
vc1dsp.vc1_inv_trans_8x4_dc_neon: 16.5
vc1dsp.vc1_inv_trans_8x8_c: 547.7
vc1dsp.vc1_inv_trans_8x8_neon: 137.0
vc1dsp.vc1_inv_trans_8x8_dc_c: 268.2
vc1dsp.vc1_inv_trans_8x8_dc_neon: 30.5

Signed-off-by: Ben Avison
---
 libavcodec/aarch64/vc1dsp_init_aarch64.c |  19 +
 libavcodec/aarch64/vc1dsp_neon.S         | 678 +++
 2 files changed, 697 insertions(+)

diff --git a/libavcodec/aarch64/vc1dsp_init_aarch64.c b/libavcodec/aarch64/vc1dsp_init_aarch64.c
index 8f96e4802d..e0eb52dd63 100644
--- a/libavcodec/aarch64/vc1dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vc1dsp_init_aarch64.c
@@ -25,6 +25,16 @@
 #include "config.h"
 
+void ff_vc1_inv_trans_8x8_neon(int16_t *block);
+void ff_vc1_inv_trans_8x4_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_4x8_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_4x4_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+
+void ff_vc1_inv_trans_8x8_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_8x4_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_4x8_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+void ff_vc1_inv_trans_4x4_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
+
 void ff_vc1_v_loop_filter4_neon(uint8_t *src, ptrdiff_t stride, int pq);
 void ff_vc1_h_loop_filter4_neon(uint8_t *src, ptrdiff_t stride, int pq);
 void ff_vc1_v_loop_filter8_neon(uint8_t *src, ptrdiff_t stride, int pq);
@@ -46,6 +56,15 @@ av_cold void ff_vc1dsp_init_aarch64(VC1DSPContext *dsp)
     int cpu_flags = av_get_cpu_flags();
 
     if (have_neon(cpu_flags)) {
+        dsp->vc1_inv_trans_8x8 = ff_vc1_inv_trans_8x8_neon;
+        dsp->vc1_inv_trans_8x4 = ff_vc1_inv_trans_8x4_neon;
+        dsp->vc1_inv_trans_4x8 = ff_vc1_inv_trans_4x8_neon;
+        dsp->vc1_inv_trans_4x4 = ff_vc1_inv_trans_4x4_neon;
+        dsp->vc1_inv_trans_8x8_dc = ff_vc1_inv_trans_8x8_dc_neon;
+        dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_neon;
+        dsp->vc1_inv_trans_4x8_dc = ff_vc1_inv_trans_4x8_dc_neon;
+        dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_neon;
+
         dsp->vc1_v_loop_filter4 = ff_vc1_v_loop_filter4_neon;
         dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_neon;
         dsp->vc1_v_loop_filter8 = ff_vc1_v_loop_filter8_neon;
diff --git a/libavcodec/aarch64/vc1dsp_neon.S b/libavcodec/aarch64/vc1dsp_neon.S
index 1ea9fa75ff..0201db4f78 100644
--- a/libavcodec/aarch64/vc1dsp_neon.S
+++ b/libavcodec/aarch64/vc1dsp_neon.S
@@ -22,7 +22,685 @@
 #include "libavutil/aarch64/asm.S"
 
+// VC-1 8x8 inverse transform
+// On entry:
+//   x0 -> array of 16-bit inverse transform coefficients, in column-major order
+// On exit:
+//   array at x0 updated to hold transformed block; also now held in row-major order
+function ff_vc1_inv_trans_8x8_neon, export=1
+        ld1             {v1.16b, v2.16b}, [x0], #32
+        ld1             {v3.16b, v4.16b}, [x0], #32
+        ld1             {v5.16b, v6.16b}, [x0], #32
+        shl             v1.8h, v1.8h, #2        // 8/2 * src[0]
+        sub             x1, x0, #3*32
+        ld1             {v16.16b, v17.16b}, [x0]
+        shl             v7.8h, v2.8h, #4        // 16 * src[8]
+        shl             v18.8h, v2.8h, #2       // 4 * src[8]
+        shl             v19.8h, v4.8h, #4       // 16 * src[24]
+        ldr             d0, .Lcoeffs_it8
+        shl             v5.8h, v5.8h, #2        // 8/2 * src[32]
+        shl             v20.8h, v6.8h, #4       // 16 * src[40]
+        shl             v21.8h, v6.8h, #2       // 4 * src[40]
+        shl             v22.8h, v17.8h, #4      // 16 * src[56]
+        ssra            v20.8h, v19.8h, #2      // 4 * src[24] + 16 * src[40]
+        mul             v23.8h, v3.8h, v0.h[0]  // 6/2 * src[16]
+        sub             v19.8h, v19.8h, v21.8h  // 16 * src[24] - 4 * src[40]
+        ssra            v7.8h, v22.8h, #2       // 16 * src[8] + 4 * src[56]
+        sub             v18.8h, v22.8h, v18.8h  // - 4 * src[8] + 16 * src[56]
+        shl             v3.8h, v3.8h, #3        //
[FFmpeg-devel] [PATCH v3 04/10] avcodec/vc1: Introduce fast path for unescaping bitstream buffer
Includes a checkasm test.

Signed-off-by: Ben Avison
---
 libavcodec/vc1dec.c     | 20 ++--
 libavcodec/vc1dsp.c     |  2 ++
 libavcodec/vc1dsp.h     |  3 ++
 tests/checkasm/vc1dsp.c | 67 +
 4 files changed, 82 insertions(+), 10 deletions(-)

diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
index e279ffd1c1..0426e8a752 100644
--- a/libavcodec/vc1dec.c
+++ b/libavcodec/vc1dec.c
@@ -491,7 +491,7 @@ static av_cold int vc1_decode_init(AVCodecContext *avctx)
             size = next - start - 4;
             if (size <= 0)
                 continue;
-            buf2_size = vc1_unescape_buffer(start + 4, size, buf2);
+            buf2_size = v->vc1dsp.vc1_unescape_buffer(start + 4, size, buf2);
             init_get_bits(, buf2, buf2_size * 8);
             switch (AV_RB32(start)) {
             case VC1_CODE_SEQHDR:
@@ -681,7 +681,7 @@ static int vc1_decode_frame(AVCodecContext *avctx, void *data,
                 case VC1_CODE_FRAME:
                     if (avctx->hwaccel)
                         buf_start = start;
-                    buf_size2 = vc1_unescape_buffer(start + 4, size, buf2);
+                    buf_size2 = v->vc1dsp.vc1_unescape_buffer(start + 4, size, buf2);
                     break;
                 case VC1_CODE_FIELD: {
                     int buf_size3;
@@ -698,8 +698,8 @@ static int vc1_decode_frame(AVCodecContext *avctx, void *data,
                         ret = AVERROR(ENOMEM);
                         goto err;
                     }
-                    buf_size3 = vc1_unescape_buffer(start + 4, size,
-                                                    slices[n_slices].buf);
+                    buf_size3 = v->vc1dsp.vc1_unescape_buffer(start + 4, size,
+                                                              slices[n_slices].buf);
                     init_get_bits([n_slices].gb, slices[n_slices].buf,
                                   buf_size3 << 3);
                     slices[n_slices].mby_start = avctx->coded_height + 31 >> 5;
@@ -710,7 +710,7 @@ static int vc1_decode_frame(AVCodecContext *avctx, void *data,
                     break;
                 }
                 case VC1_CODE_ENTRYPOINT: /* it should be before frame data */
-                    buf_size2 = vc1_unescape_buffer(start + 4, size, buf2);
+                    buf_size2 = v->vc1dsp.vc1_unescape_buffer(start + 4, size, buf2);
                     init_get_bits(>gb, buf2, buf_size2 * 8);
                     ff_vc1_decode_entry_point(avctx, v, >gb);
                     break;
@@ -727,8 +727,8 @@ static int vc1_decode_frame(AVCodecContext *avctx, void *data,
                         ret = AVERROR(ENOMEM);
                         goto err;
                     }
-                    buf_size3 = vc1_unescape_buffer(start + 4, size,
-                                                    slices[n_slices].buf);
+                    buf_size3 = v->vc1dsp.vc1_unescape_buffer(start + 4, size,
+                                                              slices[n_slices].buf);
                     init_get_bits([n_slices].gb, slices[n_slices].buf,
                                   buf_size3 << 3);
                     slices[n_slices].mby_start = get_bits([n_slices].gb, 9);
@@ -762,7 +762,7 @@ static int vc1_decode_frame(AVCodecContext *avctx, void *data,
                     ret = AVERROR(ENOMEM);
                     goto err;
                 }
-                buf_size3 = vc1_unescape_buffer(divider + 4, buf + buf_size - divider - 4, slices[n_slices].buf);
+                buf_size3 = v->vc1dsp.vc1_unescape_buffer(divider + 4, buf + buf_size - divider - 4, slices[n_slices].buf);
                 init_get_bits([n_slices].gb, slices[n_slices].buf,
                               buf_size3 << 3);
                 slices[n_slices].mby_start = s->mb_height + 1 >> 1;
@@ -771,9 +771,9 @@ static int vc1_decode_frame(AVCodecContext *avctx, void *data,
                 n_slices1 = n_slices - 1;
                 n_slices++;
             }
-            buf_size2 = vc1_unescape_buffer(buf, divider - buf, buf2);
+            buf_size2 = v->vc1dsp.vc1_unescape_buffer(buf, divider - buf, buf2);
         } else {
-            buf_size2 = vc1_unescape_buffer(buf, buf_size, buf2);
+            buf_size2 = v->vc1dsp.vc1_unescape_buffer(buf, buf_size, buf2);
         }
         init_get_bits(>gb, buf2, buf_size2*8);
     } else{
diff --git a/libavcodec/vc1dsp.c b/libavcodec/vc1dsp.c
index f651d7d461..f1b7bb2397 100644
--- a/libavcodec/vc1dsp.c
+++ b/libavcodec/vc1dsp.c
@@ -34,6 +34,7 @@
 #include "rnd_avg.h"
 #include "vc1dsp.h"
 #include "startcode.h"
+#include "vc1_common.h"
 
 /* Apply overlap transform to horizontal edge */
 static void vc1_v_overlap_c(uint8_t *src, ptrdiff_t stride)
@@ -1030,6 +1031,7 @@ av_cold void ff_vc1dsp_init(VC1DSPContext *dsp)
 #endif /* CONFIG_WMV3IMAGE_DECODER || CONFIG_VC1IMAGE_DECODER */
[FFmpeg-devel] [PATCH v3 06/10] avcodec/vc1: Arm 32-bit NEON deblocking filter fast paths
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the time, the worst case happens about 40% of the
time, and the complexity of the remaining cases falls somewhere in between.
Therefore, taking the average of the best and worst case timings is probably
a conservative estimate of the degree by which the NEON code improves
performance.

vc1dsp.vc1_h_loop_filter4_bestcase_c: 19.0
vc1dsp.vc1_h_loop_filter4_bestcase_neon: 48.5
vc1dsp.vc1_h_loop_filter4_worstcase_c: 144.7
vc1dsp.vc1_h_loop_filter4_worstcase_neon: 76.2
vc1dsp.vc1_h_loop_filter8_bestcase_c: 41.0
vc1dsp.vc1_h_loop_filter8_bestcase_neon: 75.0
vc1dsp.vc1_h_loop_filter8_worstcase_c: 294.0
vc1dsp.vc1_h_loop_filter8_worstcase_neon: 102.7
vc1dsp.vc1_h_loop_filter16_bestcase_c: 54.7
vc1dsp.vc1_h_loop_filter16_bestcase_neon: 130.0
vc1dsp.vc1_h_loop_filter16_worstcase_c: 569.7
vc1dsp.vc1_h_loop_filter16_worstcase_neon: 186.7
vc1dsp.vc1_v_loop_filter4_bestcase_c: 20.2
vc1dsp.vc1_v_loop_filter4_bestcase_neon: 47.2
vc1dsp.vc1_v_loop_filter4_worstcase_c: 164.2
vc1dsp.vc1_v_loop_filter4_worstcase_neon: 68.5
vc1dsp.vc1_v_loop_filter8_bestcase_c: 43.5
vc1dsp.vc1_v_loop_filter8_bestcase_neon: 55.2
vc1dsp.vc1_v_loop_filter8_worstcase_c: 316.2
vc1dsp.vc1_v_loop_filter8_worstcase_neon: 72.7
vc1dsp.vc1_v_loop_filter16_bestcase_c: 62.2
vc1dsp.vc1_v_loop_filter16_bestcase_neon: 103.7
vc1dsp.vc1_v_loop_filter16_worstcase_c: 646.5
vc1dsp.vc1_v_loop_filter16_worstcase_neon: 110.7

Signed-off-by: Ben Avison
---
 libavcodec/arm/vc1dsp_init_neon.c |  14 +
 libavcodec/arm/vc1dsp_neon.S      | 643 ++
 2 files changed, 657 insertions(+)

diff --git a/libavcodec/arm/vc1dsp_init_neon.c b/libavcodec/arm/vc1dsp_init_neon.c
index 2cca784f5a..f5f5c702d7 100644
--- a/libavcodec/arm/vc1dsp_init_neon.c
+++ b/libavcodec/arm/vc1dsp_init_neon.c
@@ -32,6 +32,13 @@ void ff_vc1_inv_trans_4x8_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *bloc
 void ff_vc1_inv_trans_8x4_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
 void ff_vc1_inv_trans_4x4_dc_neon(uint8_t *dest, ptrdiff_t stride, int16_t *block);
 
+void ff_vc1_v_loop_filter4_neon(uint8_t *src, int stride, int pq);
+void ff_vc1_h_loop_filter4_neon(uint8_t *src, int stride, int pq);
+void ff_vc1_v_loop_filter8_neon(uint8_t *src, int stride, int pq);
+void ff_vc1_h_loop_filter8_neon(uint8_t *src, int stride, int pq);
+void ff_vc1_v_loop_filter16_neon(uint8_t *src, int stride, int pq);
+void ff_vc1_h_loop_filter16_neon(uint8_t *src, int stride, int pq);
+
 void ff_put_pixels8x8_neon(uint8_t *block, const uint8_t *pixels,
                            ptrdiff_t line_size, int rnd);
 
@@ -92,6 +99,13 @@ av_cold void ff_vc1dsp_init_neon(VC1DSPContext *dsp)
     dsp->vc1_inv_trans_8x4_dc = ff_vc1_inv_trans_8x4_dc_neon;
     dsp->vc1_inv_trans_4x4_dc = ff_vc1_inv_trans_4x4_dc_neon;
 
+    dsp->vc1_v_loop_filter4 = ff_vc1_v_loop_filter4_neon;
+    dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_neon;
+    dsp->vc1_v_loop_filter8 = ff_vc1_v_loop_filter8_neon;
+    dsp->vc1_h_loop_filter8 = ff_vc1_h_loop_filter8_neon;
+    dsp->vc1_v_loop_filter16 = ff_vc1_v_loop_filter16_neon;
+    dsp->vc1_h_loop_filter16 = ff_vc1_h_loop_filter16_neon;
+
     dsp->put_vc1_mspel_pixels_tab[1][ 0] = ff_put_pixels8x8_neon;
     FN_ASSIGN(1, 0);
     FN_ASSIGN(2, 0);
diff --git a/libavcodec/arm/vc1dsp_neon.S b/libavcodec/arm/vc1dsp_neon.S
index 93f043bf08..ba54221ef6 100644
--- a/libavcodec/arm/vc1dsp_neon.S
+++ b/libavcodec/arm/vc1dsp_neon.S
@@ -1161,3 +1161,646 @@ function ff_vc1_inv_trans_4x4_dc_neon, export=1
         vst1.32         {d1[1]}, [r0,:32]
         bx              lr
 endfunc
+
+@ VC-1 in-loop deblocking filter for 4 pixel pairs at boundary of vertically-neighbouring blocks
+@ On entry:
+@   r0 -> top-left pel of lower block
+@   r1 = row stride, bytes
+@   r2 = PQUANT bitstream parameter
+function ff_vc1_v_loop_filter4_neon, export=1
+        sub             r3, r0, r1, lsl #2
+        vldr            d0, .Lcoeffs
+        vld1.32         {d1[0]}, [r0], r1       @ P5
+        vld1.32         {d2[0]}, [r3], r1       @ P1
+        vld1.32         {d3[0]}, [r3], r1       @ P2
+        vld1.32         {d4[0]}, [r0], r1       @ P6
+        vld1.32         {d5[0]}, [r3], r1       @ P3
+        vld1.32         {d6[0]}, [r0], r1       @ P7
+        vld1.32         {d7[0]}, [r3]           @ P4
+        vld1.32         {d16[0]}, [r0]          @ P8
+        vshll.u8        q9, d1, #1              @ 2*P5
+        vdup.16         d17, r2                 @ pq
+        vshll.u8        q10, d2, #1             @ 2*P1
+        vmovl.u8        q11, d3                 @ P2
+        vmovl.u8        q1, d4                  @ P6
+        vmovl.u8        q12, d5                 @ P3
+        vmls.i16        d20, d22, d0[1]
[FFmpeg-devel] [PATCH v3 03/10] checkasm: Add idctdsp add/put-pixels-clamped tests
Signed-off-by: Ben Avison
---
 tests/checkasm/Makefile   |  1 +
 tests/checkasm/checkasm.c |  3 ++
 tests/checkasm/checkasm.h |  1 +
 tests/checkasm/idctdsp.c  | 98 +++
 tests/fate/checkasm.mak   |  1 +
 5 files changed, 104 insertions(+)
 create mode 100644 tests/checkasm/idctdsp.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 7133a6ee66..f6b1008855 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -9,6 +9,7 @@ AVCODECOBJS-$(CONFIG_G722DSP) += g722dsp.o
 AVCODECOBJS-$(CONFIG_H264DSP) += h264dsp.o
 AVCODECOBJS-$(CONFIG_H264PRED) += h264pred.o
 AVCODECOBJS-$(CONFIG_H264QPEL) += h264qpel.o
+AVCODECOBJS-$(CONFIG_IDCTDSP) += idctdsp.o
 AVCODECOBJS-$(CONFIG_LLVIDDSP) += llviddsp.o
 AVCODECOBJS-$(CONFIG_LLVIDENCDSP) += llviddspenc.o
 AVCODECOBJS-$(CONFIG_VC1DSP)+= vc1dsp.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index c2efd81b6d..57134f96ea 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -123,6 +123,9 @@ static const struct {
     #if CONFIG_HUFFYUV_DECODER
         { "huffyuvdsp", checkasm_check_huffyuvdsp },
     #endif
+    #if CONFIG_IDCTDSP
+        { "idctdsp", checkasm_check_idctdsp },
+    #endif
     #if CONFIG_JPEG2000_DECODER
         { "jpeg2000dsp", checkasm_check_jpeg2000dsp },
     #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 52ab18a5b1..a86db140e3 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -64,6 +64,7 @@ void checkasm_check_hevc_idct(void);
 void checkasm_check_hevc_pel(void);
 void checkasm_check_hevc_sao(void);
 void checkasm_check_huffyuvdsp(void);
+void checkasm_check_idctdsp(void);
 void checkasm_check_jpeg2000dsp(void);
 void checkasm_check_llviddsp(void);
 void checkasm_check_llviddspenc(void);
diff --git a/tests/checkasm/idctdsp.c b/tests/checkasm/idctdsp.c
new file mode 100644
index 00..02724536a7
--- /dev/null
+++ b/tests/checkasm/idctdsp.c
@@ -0,0 +1,98 @@
+/*
+ * Copyright (c) 2022 Ben Avison
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include
+
+#include "checkasm.h"
+
+#include "libavcodec/idctdsp.h"
+
+#include "libavutil/common.h"
+#include "libavutil/internal.h"
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#define IDCTDSP_TEST(func) { #func, offsetof(IDCTDSPContext, func) },
+
+typedef struct {
+    const char *name;
+    size_t offset;
+} test;
+
+#define RANDOMIZE_BUFFER16(name, size)          \
+    do {                                        \
+        int i;                                  \
+        for (i = 0; i < size; ++i) {            \
+            uint16_t r = rnd() % 0x201 - 0x100; \
+            AV_WN16A(name##0 + i, r);           \
+            AV_WN16A(name##1 + i, r);           \
+        }                                       \
+    } while (0)
+
+#define RANDOMIZE_BUFFER8(name, size)         \
+    do {                                      \
+        int i;                                \
+        for (i = 0; i < size; ++i) {          \
+            uint8_t r = rnd();                \
+            name##0[i] = r;                   \
+            name##1[i] = r;                   \
+        }                                     \
+    } while (0)
+
+static void check_add_put_clamped(void)
+{
+    /* Source buffers are only as big as needed, since any over-read won't affect results */
+    LOCAL_ALIGNED_16(int16_t, src0, [64]);
+    LOCAL_ALIGNED_16(int16_t, src1, [64]);
+    /* Destination buffers have borders of one row above/below and 8 columns left/right to catch overflows */
+    LOCAL_ALIGNED_8(uint8_t, dst0, [10 * 24]);
+    LOCAL_ALIGNED_8(uint8_t, dst1, [10 * 24]);
+
+    AVCodecContext avctx = { 0 };
+    IDCTDSPContext h;
+
+    const test tests[] = {
+        IDCTDSP_TEST(add_pixels_clamped)
+        IDCTDSP_TEST(put_pixels_clamped)
+        IDCTDSP_TEST(put_signed_pixels_clamped)
+    };
+
+    ff_idctdsp_init(, );
+
+    for (size_t t = 0; t < FF_ARRAY_ELEMS(tests); ++t) {
+        void (*func)(const int16_t *, uint8_t *, ptrdiff_t) = *(void **)((intptr_t) + tests[t].offset);
+        if (check_func(func, "idctdsp.%s",
[FFmpeg-devel] [PATCH v3 05/10] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths
checkasm benchmarks on 1.5 GHz Cortex-A72 are as follows. Note that the C
version can still outperform the NEON version in specific cases. The balance
between different code paths is stream-dependent, but in practice the best
case happens about 5% of the time, the worst case happens about 40% of the
time, and the complexity of the remaining cases falls somewhere in between.
Therefore, taking the average of the best and worst case timings is probably
a conservative estimate of the degree by which the NEON code improves
performance.

vc1dsp.vc1_h_loop_filter4_bestcase_c: 10.7
vc1dsp.vc1_h_loop_filter4_bestcase_neon: 43.5
vc1dsp.vc1_h_loop_filter4_worstcase_c: 184.5
vc1dsp.vc1_h_loop_filter4_worstcase_neon: 73.7
vc1dsp.vc1_h_loop_filter8_bestcase_c: 31.2
vc1dsp.vc1_h_loop_filter8_bestcase_neon: 62.2
vc1dsp.vc1_h_loop_filter8_worstcase_c: 358.2
vc1dsp.vc1_h_loop_filter8_worstcase_neon: 88.2
vc1dsp.vc1_h_loop_filter16_bestcase_c: 51.0
vc1dsp.vc1_h_loop_filter16_bestcase_neon: 107.7
vc1dsp.vc1_h_loop_filter16_worstcase_c: 722.7
vc1dsp.vc1_h_loop_filter16_worstcase_neon: 140.5
vc1dsp.vc1_v_loop_filter4_bestcase_c: 9.7
vc1dsp.vc1_v_loop_filter4_bestcase_neon: 43.0
vc1dsp.vc1_v_loop_filter4_worstcase_c: 178.7
vc1dsp.vc1_v_loop_filter4_worstcase_neon: 69.0
vc1dsp.vc1_v_loop_filter8_bestcase_c: 30.2
vc1dsp.vc1_v_loop_filter8_bestcase_neon: 50.7
vc1dsp.vc1_v_loop_filter8_worstcase_c: 353.0
vc1dsp.vc1_v_loop_filter8_worstcase_neon: 69.2
vc1dsp.vc1_v_loop_filter16_bestcase_c: 60.0
vc1dsp.vc1_v_loop_filter16_bestcase_neon: 90.0
vc1dsp.vc1_v_loop_filter16_worstcase_c: 714.2
vc1dsp.vc1_v_loop_filter16_worstcase_neon: 97.2

Signed-off-by: Ben Avison
---
 libavcodec/aarch64/Makefile              |   1 +
 libavcodec/aarch64/vc1dsp_init_aarch64.c |  14 +
 libavcodec/aarch64/vc1dsp_neon.S         | 692 +++
 3 files changed, 707 insertions(+)
 create mode 100644 libavcodec/aarch64/vc1dsp_neon.S

diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 954461f81d..5b25e4dfb9 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -48,6 +48,7 @@ NEON-OBJS-$(CONFIG_IDCTDSP) += aarch64/simple_idct_neon.o
 NEON-OBJS-$(CONFIG_MDCT)+= aarch64/mdct_neon.o
 NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o
 NEON-OBJS-$(CONFIG_PIXBLOCKDSP) += aarch64/pixblockdsp_neon.o
+NEON-OBJS-$(CONFIG_VC1DSP) += aarch64/vc1dsp_neon.o
 NEON-OBJS-$(CONFIG_VP8DSP) += aarch64/vp8dsp_neon.o
 
 # decoders/encoders
diff --git a/libavcodec/aarch64/vc1dsp_init_aarch64.c b/libavcodec/aarch64/vc1dsp_init_aarch64.c
index 13dfd74940..8f96e4802d 100644
--- a/libavcodec/aarch64/vc1dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vc1dsp_init_aarch64.c
@@ -25,6 +25,13 @@
 #include "config.h"
 
+void ff_vc1_v_loop_filter4_neon(uint8_t *src, ptrdiff_t stride, int pq);
+void ff_vc1_h_loop_filter4_neon(uint8_t *src, ptrdiff_t stride, int pq);
+void ff_vc1_v_loop_filter8_neon(uint8_t *src, ptrdiff_t stride, int pq);
+void ff_vc1_h_loop_filter8_neon(uint8_t *src, ptrdiff_t stride, int pq);
+void ff_vc1_v_loop_filter16_neon(uint8_t *src, ptrdiff_t stride, int pq);
+void ff_vc1_h_loop_filter16_neon(uint8_t *src, ptrdiff_t stride, int pq);
+
 void ff_put_vc1_chroma_mc8_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride,
                                 int h, int x, int y);
 void ff_avg_vc1_chroma_mc8_neon(uint8_t *dst, uint8_t *src, ptrdiff_t stride,
@@ -39,6 +46,13 @@ av_cold void ff_vc1dsp_init_aarch64(VC1DSPContext *dsp)
     int cpu_flags = av_get_cpu_flags();
 
     if (have_neon(cpu_flags)) {
+        dsp->vc1_v_loop_filter4 = ff_vc1_v_loop_filter4_neon;
+        dsp->vc1_h_loop_filter4 = ff_vc1_h_loop_filter4_neon;
+        dsp->vc1_v_loop_filter8 = ff_vc1_v_loop_filter8_neon;
+        dsp->vc1_h_loop_filter8 = ff_vc1_h_loop_filter8_neon;
+        dsp->vc1_v_loop_filter16 = ff_vc1_v_loop_filter16_neon;
+        dsp->vc1_h_loop_filter16 = ff_vc1_h_loop_filter16_neon;
+
         dsp->put_no_rnd_vc1_chroma_pixels_tab[0] = ff_put_vc1_chroma_mc8_neon;
         dsp->avg_no_rnd_vc1_chroma_pixels_tab[0] = ff_avg_vc1_chroma_mc8_neon;
         dsp->put_no_rnd_vc1_chroma_pixels_tab[1] = ff_put_vc1_chroma_mc4_neon;
diff --git a/libavcodec/aarch64/vc1dsp_neon.S b/libavcodec/aarch64/vc1dsp_neon.S
new file mode 100644
index 00..1ea9fa75ff
--- /dev/null
+++ b/libavcodec/aarch64/vc1dsp_neon.S
@@ -0,0 +1,692 @@
+/*
+ * VC1 AArch64 NEON optimisations
+ *
+ * Copyright (c) 2022 Ben Avison
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY
[FFmpeg-devel] [PATCH v3 02/10] checkasm: Add vc1dsp inverse transform tests
This test deliberately doesn't exercise the full range of inputs described in
the committee draft VC-1 standard. It says:

    input coefficients in frequency domain, D, satisfy    -2048 <= D < 2047
    intermediate coefficients, E, satisfy                  -4096 <= E < 4095
    fully inverse-transformed coefficients, R, satisfy      -512 <= R < 511

For one thing, the inequalities look odd. Did they mean them to go the other
way round? That would make more sense because the equations generally both
add and subtract coefficients multiplied by constants, including powers of 2.
Requiring the most-negative values to be valid extends the number of bits to
represent the intermediate values just for the sake of that one case!

For another thing, the extreme values don't look to occur in real streams -
both in my experience and supported by the following comment in the AArch32
decoder:

    tNhalf is half of the value of tN (as described in vc1_inv_trans_8x8_c).
    This is done because sometimes files have input that causes tN + tM to
    overflow. To avoid this overflow, we compute tNhalf, then compute
    tNhalf + tM (which doesn't overflow), and then we use vhadd to compute
    (tNhalf + (tNhalf + tM)) >> 1 which does not overflow because it is
    one instruction.

My AArch64 decoder goes further than this. It calculates tNhalf and tM
then does an SRA (essentially a fused halve and add) to compute
(tN + tM) >> 1 without ever having to hold (tNhalf + tM) in a 16-bit element
without overflowing. It only encounters difficulties if either tNhalf or
tM overflows in isolation.

I haven't had sight of the final standard, so it's possible that these
issues were dealt with during finalisation, which could explain the lack
of usage of extreme inputs in real streams. Or a preponderance of decoders
that only support 16-bit intermediate values in their inverse transforms
might have caused encoders to steer clear of such cases.
I have effectively followed this approach in the test, and limited the
scale of the coefficients sufficiently that both the existing AArch32
decoder and my new AArch64 decoder pass.

Signed-off-by: Ben Avison
---
 tests/checkasm/vc1dsp.c | 283
 1 file changed, 283 insertions(+)

diff --git a/tests/checkasm/vc1dsp.c b/tests/checkasm/vc1dsp.c
index 2fd6c74d6c..7d4457306f 100644
--- a/tests/checkasm/vc1dsp.c
+++ b/tests/checkasm/vc1dsp.c
@@ -30,12 +30,208 @@
 #include "libavutil/mem_internal.h"
 
 #define VC1DSP_TEST(func) { #func, offsetof(VC1DSPContext, func) },
+#define VC1DSP_SIZED_TEST(func, width, height) { #func, offsetof(VC1DSPContext, func), width, height },
 
 typedef struct {
     const char *name;
     size_t offset;
+    int width;
+    int height;
 } test;
 
+typedef struct matrix {
+    size_t width;
+    size_t height;
+    float d[];
+} matrix;
+
+static const matrix T8 = { 8, 8, {
+        12,  12,  12,  12,  12,  12,  12,  12,
+        16,  15,   9,   4,  -4,  -9, -15, -16,
+        16,   6,  -6, -16, -16,  -6,   6,  16,
+        15,  -4, -16,  -9,   9,  16,   4, -15,
+        12, -12, -12,  12,  12, -12, -12,  12,
+         9, -16,   4,  15, -15,  -4,  16,  -9,
+         6, -16,  16,  -6,  -6,  16, -16,   6,
+         4,  -9,  15, -16,  16, -15,   9,  -4
+} };
+
+static const matrix T4 = { 4, 4, {
+        17,  17,  17,  17,
+        22,  10, -10, -22,
+        17, -17, -17,  17,
+        10, -22,  22, -10
+} };
+
+static const matrix T8t = { 8, 8, {
+        12,  16,  16,  15,  12,   9,   6,   4,
+        12,  15,   6,  -4, -12, -16, -16,  -9,
+        12,   9,  -6, -16, -12,   4,  16,  15,
+        12,   4, -16,  -9,  12,  15,  -6, -16,
+        12,  -4, -16,   9,  12, -15,  -6,  16,
+        12,  -9,  -6,  16, -12,  -4,  16, -15,
+        12, -15,   6,   4, -12,  16, -16,   9,
+        12, -16,  16, -15,  12,  -9,   6,  -4
+} };
+
+static const matrix T4t = { 4, 4, {
+        17,  22,  17,  10,
+        17,  10, -17, -22,
+        17, -10, -17,  22,
+        17, -22,  17, -10
+} };
+
+static matrix *new_matrix(size_t width, size_t height)
+{
+    matrix *out = av_mallocz(sizeof (matrix) + height * width * sizeof (float));
+    if (out == NULL) {
+        fprintf(stderr, "Memory allocation failure\n");
+        exit(EXIT_FAILURE);
+    }
+    out->width = width;
+    out->height = height;
+    return out;
+}
+
+static matrix *multiply(const matrix *a, const matrix *b)
+{
+    matrix *out;
+    if (a->width != b->height) {
+        fprintf(stderr, "Incompatible multiplication\n");
+        exit(EXIT_FAILURE);
+    }
+    out = new_matrix(b->width, a->height);
+    for (int j = 0; j < out->height; ++j)
+        for (int i = 0; i < out->width; ++i) {
+            float sum = 0;
+            for (int k = 0; k < a->width; ++k)
+                sum += a->d[j * a->width + k] * b->d[k * b->width + i];
+            out->d[j * out->width + i] = sum;
+        }
+    return out;
+}
+
+static void normalise(matrix *a)
+{
+    for (int j = 0; j < a->height; ++j)
[FFmpeg-devel] [PATCH v3 01/10] checkasm: Add vc1dsp in-loop deblocking filter tests
Note that the benchmarking results for these functions are highly dependent
upon the input data. Therefore, each function is benchmarked twice,
corresponding to the best and worst case complexity of the reference C
implementation. The performance of a real stream decode will fall somewhere
between these two extremes.

Signed-off-by: Ben Avison
---
 tests/checkasm/Makefile   |   1 +
 tests/checkasm/checkasm.c |   3 ++
 tests/checkasm/checkasm.h |   1 +
 tests/checkasm/vc1dsp.c   | 102 ++
 tests/fate/checkasm.mak   |   1 +
 5 files changed, 108 insertions(+)
 create mode 100644 tests/checkasm/vc1dsp.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index f768b1144e..7133a6ee66 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -11,6 +11,7 @@ AVCODECOBJS-$(CONFIG_H264PRED) += h264pred.o
 AVCODECOBJS-$(CONFIG_H264QPEL) += h264qpel.o
 AVCODECOBJS-$(CONFIG_LLVIDDSP) += llviddsp.o
 AVCODECOBJS-$(CONFIG_LLVIDENCDSP) += llviddspenc.o
+AVCODECOBJS-$(CONFIG_VC1DSP)+= vc1dsp.o
 AVCODECOBJS-$(CONFIG_VP8DSP)+= vp8dsp.o
 
 AVCODECOBJS-$(CONFIG_VIDEODSP) += videodsp.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 748d6a9f3a..c2efd81b6d 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -147,6 +147,9 @@ static const struct {
     #if CONFIG_V210_ENCODER
         { "v210enc", checkasm_check_v210enc },
     #endif
+    #if CONFIG_VC1DSP
+        { "vc1dsp", checkasm_check_vc1dsp },
+    #endif
     #if CONFIG_VP8DSP
         { "vp8dsp", checkasm_check_vp8dsp },
     #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index c3192d8c23..52ab18a5b1 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -78,6 +78,7 @@ void checkasm_check_sw_scale(void);
 void checkasm_check_utvideodsp(void);
 void checkasm_check_v210dec(void);
 void checkasm_check_v210enc(void);
+void checkasm_check_vc1dsp(void);
 void checkasm_check_vf_eq(void);
 void checkasm_check_vf_gblur(void);
 void checkasm_check_vf_hflip(void);
diff --git a/tests/checkasm/vc1dsp.c b/tests/checkasm/vc1dsp.c
new file mode 100644
index 00..2fd6c74d6c
--- /dev/null
+++ b/tests/checkasm/vc1dsp.c
@@ -0,0 +1,102 @@
+/*
+ * Copyright (c) 2022 Ben Avison
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include
+
+#include "checkasm.h"
+
+#include "libavcodec/vc1dsp.h"
+
+#include "libavutil/common.h"
+#include "libavutil/internal.h"
+#include "libavutil/intreadwrite.h"
+#include "libavutil/mem_internal.h"
+
+#define VC1DSP_TEST(func) { #func, offsetof(VC1DSPContext, func) },
+
+typedef struct {
+    const char *name;
+    size_t offset;
+} test;
+
+#define RANDOMIZE_BUFFER8_MID_WEIGHTED(name, size)  \
+    do {                                            \
+        uint8_t *p##0 = name##0, *p##1 = name##1;   \
+        int i = (size);                             \
+        while (i-- > 0) {                           \
+            int x = 0x80 | (rnd() & 0x7F);          \
+            x >>= rnd() % 9;                        \
+            if (rnd() & 1)                          \
+                x = -x;                             \
+            *p##1++ = *p##0++ = 0x80 + x;           \
+        }                                           \
+    } while (0)
+
+static void check_loop_filter(void)
+{
+    /* Deblocking filter buffers are big enough to hold a 16x16 block,
+     * plus 16 columns left and 4 rows above to hold filter inputs
+     * (depending on whether v or h neighbouring block edge, oversized
+     * horizontally to maintain 16-byte alignment) plus 16 columns and
+     * 4 rows below to catch write overflows */
+    LOCAL_ALIGNED_16(uint8_t, filter_buf0, [24 * 48]);
+    LOCAL_ALIGNED_16(uint8_t, filter_buf1, [24 * 48]);
+
+    VC1DSPContext h;
+
+    const test tests[] = {
+        VC1DSP_TEST(vc1_v_loop_filter4)
+        VC1DSP_TEST(vc1_h_loop_filter4)
+        VC1DSP_TEST(vc1_v_loop_filter8)
+        VC1DSP_TEST(vc1_h_loop_filter8)
+        VC1DSP_TEST(vc1_v_loop_filter16)
+        VC1DSP_TEST(vc1_h_loop_filter16)
+    };
+
+    ff_vc1dsp_init();
+
+    for (size_t t = 0; t < FF_ARRAY_ELEMS(tests); ++t) {
+        void
[FFmpeg-devel] [PATCH v3 00/10] avcodec/vc1: Arm optimisations
The VC1 decoder was missing lots of important fast paths for Arm, especially
for 64-bit Arm. This submission fills in implementations for all functions
where a fast path already existed and the fallback C implementation was
taking 1% or more of the runtime, and adds a new fast path to permit
vc1_unescape_buffer() to be overridden.

I've measured the playback speed on a 1.5 GHz Cortex-A72 (Raspberry Pi 4)
using `ffmpeg -i -f null -` for a couple of example streams:

Architecture:  AArch32  AArch32  AArch64  AArch64
Stream:        1        2        1        2
Before speed:  1.22x    0.82x    1.00x    0.67x
After speed:   1.31x    0.98x    1.39x    1.06x
Improvement:   7.4%     20%      39%      58%

`make fate` passes on both AArch32 and AArch64.

Changes in v2:

* Refactor checkasm tests to convert some macros into functions.
* Remove cast-to-void of checked_call.
* Limit 16-bit values in idctdsp checkasm test to +/-0x100.
* Reinstate ff_add_pixels_clamped_arm.
* Adapt vc1 deblocking filters to specify stride as ptrdiff_t.
* Add align specifiers to a few VLD/VST instructions for AArch32 deblocking
  filter, and adapt checkasm test not to test with tighter alignment than
  is encountered in normal use.
* Correct unescape buffer memcmp length.
* Update benchmarks for AArch64 idctdsp.
Ben Avison (10):
  checkasm: Add vc1dsp in-loop deblocking filter tests
  checkasm: Add vc1dsp inverse transform tests
  checkasm: Add idctdsp add/put-pixels-clamped tests
  avcodec/vc1: Introduce fast path for unescaping bitstream buffer
  avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths
  avcodec/vc1: Arm 32-bit NEON deblocking filter fast paths
  avcodec/vc1: Arm 64-bit NEON inverse transform fast paths
  avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths
  avcodec/vc1: Arm 64-bit NEON unescape fast path
  avcodec/vc1: Arm 32-bit NEON unescape fast path

 libavcodec/aarch64/Makefile               |    4 +-
 libavcodec/aarch64/idctdsp_init_aarch64.c |   26 +-
 libavcodec/aarch64/idctdsp_neon.S         |  130 ++
 libavcodec/aarch64/vc1dsp_init_aarch64.c  |   94 ++
 libavcodec/aarch64/vc1dsp_neon.S          | 1546 +
 libavcodec/arm/vc1dsp_init_neon.c         |   75 +
 libavcodec/arm/vc1dsp_neon.S              |  761 ++
 libavcodec/vc1dec.c                       |   20 +-
 libavcodec/vc1dsp.c                       |    2 +
 libavcodec/vc1dsp.h                       |    3 +
 tests/checkasm/Makefile                   |    2 +
 tests/checkasm/checkasm.c                 |    6 +
 tests/checkasm/checkasm.h                 |    2 +
 tests/checkasm/idctdsp.c                  |   98 ++
 tests/checkasm/vc1dsp.c                   |  452 ++
 tests/fate/checkasm.mak                   |    2 +
 16 files changed, 3204 insertions(+), 19 deletions(-)
 create mode 100644 libavcodec/aarch64/idctdsp_neon.S
 create mode 100644 libavcodec/aarch64/vc1dsp_neon.S
 create mode 100644 tests/checkasm/idctdsp.c
 create mode 100644 tests/checkasm/vc1dsp.c

--
2.25.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
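
For readers unfamiliar with how a routine such as vc1_unescape_buffer() can
be made overridable, the following is a minimal sketch in C and not the code
from this series. All names in it (DemoDSPContext, demo_dsp_init,
demo_unescape_c, have_fast_path) are hypothetical. The escape rule in the
fallback (drop a 0x03 byte that follows two zero bytes and precedes a byte
below 4) is an assumption about the VC-1 start-code emulation prevention
scheme, so treat it as illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DSP context: one function pointer per overridable routine. */
typedef struct DemoDSPContext {
    int (*unescape_buffer)(const uint8_t *src, int size, uint8_t *dst);
} DemoDSPContext;

/* Portable fallback. Assumed rule: a 0x03 byte that follows two zero bytes
 * and precedes a byte below 4 is an escape byte and is dropped.
 * Returns the number of bytes written to dst. */
static int demo_unescape_c(const uint8_t *src, int size, uint8_t *dst)
{
    int dsize = 0;

    for (int i = 0; i < size; i++) {
        if (src[i] == 3 && i >= 2 && !src[i - 1] && !src[i - 2] &&
            i < size - 1 && src[i + 1] < 4)
            continue; /* skip the escape byte; the next byte is copied as usual */
        dst[dsize++] = src[i];
    }
    return dsize;
}

/* Install the fallback first, then let a platform-specific routine replace
 * it. `have_fast_path` stands in for a run-time CPU feature check. */
static void demo_dsp_init(DemoDSPContext *dsp, int have_fast_path,
                          int (*fast)(const uint8_t *, int, uint8_t *))
{
    dsp->unescape_buffer = demo_unescape_c;
    if (have_fast_path && fast)
        dsp->unescape_buffer = fast;
}
```

Callers then go through the pointer (dsp.unescape_buffer(src, size, dst))
rather than naming the C function directly, which is the kind of call-site
change patch 04/10 makes in vc1dec.c.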
Re: [FFmpeg-devel] [PATCH 08/10] avcodec/idctdsp: Arm 64-bit NEON block add and clamp fast paths
On 30/03/2022 15:14, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: +// Clamp 16-bit signed block coefficients to signed 8-bit (biased by 128) +// On entry: +// x0 -> array of 64x 16-bit coefficients +// x1 -> 8-bit results +// x2 = row stride for results, bytes +function ff_put_signed_pixels_clamped_neon, export=1 + ld1 {v0.16b, v1.16b, v2.16b, v3.16b}, [x0], #64 + movi v4.8b, #128 + ld1 {v16.16b, v17.16b, v18.16b, v19.16b}, [x0] + sqxtn v0.8b, v0.8h + sqxtn v1.8b, v1.8h + sqxtn v2.8b, v2.8h + sqxtn v3.8b, v3.8h + sqxtn v5.8b, v16.8h + add v0.8b, v0.8b, v4.8b Here you could save 4 add instructions with sqxtn2 and adding .16b vectors, but I'm not sure if it's wortwhile. (It reduces the checkasm numbers by 0.7 for Cortex A72, by 0.3 for A73, but increases the runtime by 1.0 on A53.) Stranegely enough, I get much smaller numbers on my A72 than you got. That's weird. As you say, it should be independent of clock-frequency. FWIW, I'm benchmarking on a Raspberry Pi 4; I'd assume all its board variants' Cortex-A72 cores are of identical revision. Now I run it again, I'm getting these figures: idctdsp.add_pixels_clamped_c: 313.3 idctdsp.add_pixels_clamped_neon: 24.3 idctdsp.put_pixels_clamped_c: 220.3 idctdsp.put_pixels_clamped_neon: 15.5 idctdsp.put_signed_pixels_clamped_c: 210.5 idctdsp.put_signed_pixels_clamped_neon: 19.5 which is more in line with what you see! I am getting a lot of variability between runs though - from a small sample, I'm seeing add_pixels_clamped_neon coming out as anything from 21 to 30, which is well above the sort of differences you're seeing between alternate implementations. This sort of case is always going to be difficult to schedule optimally for multiple core - factors like how much dual-issuing is possible, latency before values can be used, load speed and the granularity of scoreboarding parts of vectors, all vary widely. 
In the case of the Cortex-A72, the critical path goes
ld1 of first 16 bytes -> sqxtn: 5 cycles
sqxtn -> add: 4 cycles
add -> st1 of first 8 bytes: 3 cycles
It then bangs out one store per cycle, a total of 8. Everything else can largely be fitted in around this - so for example, other than I-cache usage, there shouldn't be a disadvantage to the adds being non-Q-form as they should dual-issue with the sqxtns and st2s - you'll notice I have them alternating. I'd have expected anything interfering with this (such as by updating half the vector input required by any Q-form add) to slow things down.
Ben
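For reference while reading the NEON discussion, here is a minimal scalar sketch of the operation described in the quoted comment block: saturate each of the 64 coefficients to the signed 8-bit range, then bias by 128. The function name put_signed_pixels_clamped_ref and the test values are assumptions made for illustration; this is not claimed to be FFmpeg's own C fallback.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (illustration only).
 * block  -> 64 signed 16-bit coefficients, row-major 8x8
 * pixels -> destination for 8 rows of 8 bytes
 * stride =  distance in bytes between destination rows */
static void put_signed_pixels_clamped_ref(const int16_t *block,
                                          uint8_t *pixels,
                                          ptrdiff_t stride)
{
    for (int row = 0; row < 8; row++) {
        for (int col = 0; col < 8; col++) {
            int v = block[row * 8 + col];
            if (v < -128)
                v = -128;                     /* saturating narrow, low side  */
            if (v > 127)
                v = 127;                      /* saturating narrow, high side */
            pixels[col] = (uint8_t)(v + 128); /* bias into 0..255             */
        }
        pixels += stride;
    }
}
```

In the NEON version the two clamps correspond to the sqxtn instructions and the bias to the add of the #128 vector; the sketch says nothing about scheduling, which is the actual subject of the thread.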
Re: [FFmpeg-devel] [PATCH v10 1/1] avformat: Add IPFS protocol support.
On Wed, Mar 30, 2022 at 5:16 PM Mark Gaiser wrote: > > > On Wed, Mar 30, 2022 at 3:57 PM Andreas Rheinhardt < > andreas.rheinha...@outlook.com> wrote: > >> Mark Gaiser: >> > On Wed, Mar 30, 2022 at 2:21 PM Andreas Rheinhardt < >> > andreas.rheinha...@outlook.com> wrote: >> > >> >> Mark Gaiser: >> >>> This patch adds support for: >> >>> - ffplay ipfs:// >> >>> - ffplay ipns:// >> >>> >> >>> IPFS data can be played from so called "ipfs gateways". >> >>> A gateway is essentially a webserver that gives access to the >> >>> distributed IPFS network. >> >>> >> >>> This protocol support (ipfs and ipns) therefore translates >> >>> ipfs:// and ipns:// to a http:// url. This resulting url is >> >>> then handled by the http protocol. It could also be https >> >>> depending on the gateway provided. >> >>> >> >>> To use this protocol, a gateway must be provided. >> >>> If you do nothing it will try to find it in your >> >>> $HOME/.ipfs/gateway file. The ways to set it manually are: >> >>> 1. Define a -gateway to the gateway. >> >>> 2. Define $IPFS_GATEWAY with the full http link to the gateway. >> >>> 3. Define $IPFS_PATH and point it to the IPFS data path. >> >>> 4. Have IPFS running in your local user folder (under $HOME/.ipfs). 
>> >>> >> >>> Signed-off-by: Mark Gaiser >> >>> --- >> >>> configure | 2 + >> >>> doc/protocols.texi| 30 >> >>> libavformat/Makefile | 2 + >> >>> libavformat/ipfsgateway.c | 309 >> ++ >> >>> libavformat/protocols.c | 2 + >> >>> 5 files changed, 345 insertions(+) >> >>> create mode 100644 libavformat/ipfsgateway.c >> >>> >> >>> diff --git a/configure b/configure >> >>> index e4d36aa639..55af90957a 100755 >> >>> --- a/configure >> >>> +++ b/configure >> >>> @@ -3579,6 +3579,8 @@ udp_protocol_select="network" >> >>> udplite_protocol_select="network" >> >>> unix_protocol_deps="sys_un_h" >> >>> unix_protocol_select="network" >> >>> +ipfs_protocol_select="https_protocol" >> >>> +ipns_protocol_select="https_protocol" >> >>> >> >>> # external library protocols >> >>> libamqp_protocol_deps="librabbitmq" >> >>> diff --git a/doc/protocols.texi b/doc/protocols.texi >> >>> index d207df0b52..7c9c0a4808 100644 >> >>> --- a/doc/protocols.texi >> >>> +++ b/doc/protocols.texi >> >>> @@ -2025,5 +2025,35 @@ decoding errors. >> >>> >> >>> @end table >> >>> >> >>> +@section ipfs >> >>> + >> >>> +InterPlanetary File System (IPFS) protocol support. One can access >> >> files stored >> >>> +on the IPFS network through so called gateways. Those are http(s) >> >> endpoints. >> >>> +This protocol wraps the IPFS native protocols (ipfs:// and ipns://) >> to >> >> be send >> >>> +to such a gateway. Users can (and should) host their own node which >> >> means this >> >>> +protocol will use your local machine gateway to access files on the >> >> IPFS network. >> >>> + >> >>> +If a user doesn't have a node of their own then the public gateway >> >> dweb.link is >> >>> +used by default. >> >>> + >> >>> +You can use this protocol in 2 ways. 
Using IPFS: >> >>> +@example >> >>> +ffplay ipfs://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T >> >>> +@end example >> >>> + >> >>> +Or the IPNS protocol (IPNS is mutable IPFS): >> >>> +@example >> >>> +ffplay ipns://QmbGtJg23skhvFmu9mJiePVByhfzu5rwo74MEkVDYAmF5T >> >>> +@end example >> >>> + >> >>> +You can also change the gateway to be used: >> >>> + >> >>> +@table @option >> >>> + >> >>> +@item gateway >> >>> +Defines the gateway to use. When nothing is provided the protocol >> will >> >> first try >> >>> +your local gateway. If that fails dweb.link will be used. >> >>> + >> >>> +@end table >> >>> >> >>> @c man end PROTOCOLS >> >>> diff --git a/libavformat/Makefile b/libavformat/Makefile >> >>> index d7182d6bd8..e3233fd7ac 100644 >> >>> --- a/libavformat/Makefile >> >>> +++ b/libavformat/Makefile >> >>> @@ -660,6 +660,8 @@ OBJS-$(CONFIG_SRTP_PROTOCOL) += >> >> srtpproto.o srtp.o >> >>> OBJS-$(CONFIG_SUBFILE_PROTOCOL) += subfile.o >> >>> OBJS-$(CONFIG_TEE_PROTOCOL) += teeproto.o tee_common.o >> >>> OBJS-$(CONFIG_TCP_PROTOCOL) += tcp.o >> >>> +OBJS-$(CONFIG_IPFS_PROTOCOL) += ipfsgateway.o >> >>> +OBJS-$(CONFIG_IPNS_PROTOCOL) += ipfsgateway.o >> >>> TLS-OBJS-$(CONFIG_GNUTLS)+= tls_gnutls.o >> >>> TLS-OBJS-$(CONFIG_LIBTLS)+= tls_libtls.o >> >>> TLS-OBJS-$(CONFIG_MBEDTLS) += tls_mbedtls.o >> >>> diff --git a/libavformat/ipfsgateway.c b/libavformat/ipfsgateway.c >> >>> new file mode 100644 >> >>> index 00..1a039589c0 >> >>> --- /dev/null >> >>> +++ b/libavformat/ipfsgateway.c >> >>> @@ -0,0 +1,309 @@ >> >>> +/* >> >>> + * IPFS and IPNS protocol support through IPFS Gateway. >> >>> + * Copyright (c) 2022 Mark Gaiser >> >>> + * >> >>> + * This file is part of FFmpeg. >> >>> + * >> >>> + * FFmpeg is free software; you can redistribute it and/or >> >>> + * modify it
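To make the gateway translation described in the commit message concrete, here is a minimal sketch in C. It is not the patch's code: the helper name ipfs_to_gateway_url is hypothetical, the lookup is reduced to an explicit gateway argument, then $IPFS_GATEWAY, then a dweb.link fallback (the $IPFS_PATH and $HOME/.ipfs/gateway file lookups are left out), and the https:// scheme on the fallback is an assumption. The /ipfs/ and /ipns/ path prefixes follow the usual gateway path convention.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): rewrite ipfs://<id> or
 * ipns://<id> into <gateway>/ipfs/<id> or <gateway>/ipns/<id>.
 * Returns 0 on success, -1 on bad input or if out is too small. */
static int ipfs_to_gateway_url(const char *uri, const char *gateway_opt,
                               char *out, size_t out_size)
{
    const char *gateway = gateway_opt;
    const char *kind;
    const char *rest;
    size_t glen;
    int n;

    if (!strncmp(uri, "ipfs://", 7))
        kind = "ipfs";
    else if (!strncmp(uri, "ipns://", 7))
        kind = "ipns";
    else
        return -1;

    rest = uri + 7;
    if (!*rest)
        return -1;

    /* Simplified lookup order: option, environment, public fallback. */
    if (!gateway || !*gateway)
        gateway = getenv("IPFS_GATEWAY");
    if (!gateway || !*gateway)
        gateway = "https://dweb.link"; /* scheme is an assumption */

    /* Drop trailing slashes so the result has exactly one separator. */
    glen = strlen(gateway);
    while (glen > 0 && gateway[glen - 1] == '/')
        glen--;

    n = snprintf(out, out_size, "%.*s/%s/%s", (int)glen, gateway, kind, rest);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The resulting string is what would then be handed to the http or https protocol handler, as the commit message describes.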
Re: [FFmpeg-devel] [PATCH 07/10] avcodec/vc1: Arm 64-bit NEON inverse transform fast paths
On 30/03/2022 14:49, Martin Storsjö wrote: Looks generally reasonable. Is it possible to factorize out the individual transforms (so that you'd e.g. invoke the same macro twice in the 8x8 and 4x4 functions) without too much loss?
There is a close analogy here with the vertical/horizontal deblocking filters, because while there are similarities between the two matrix multiplications within a transform, one of them follows a series of loads and the other follows a matrix transposition. If you look for example at ff_vc1_inv_trans_8x8_neon, you'll see I was able to do a fair amount of overlap between sections of the function - particularly between the transpose and the second matrix multiplication, but to a lesser extent between the loads and the first matrix multiplication and between the second multiplication and the stores. This sort of overlapping is tricky to maintain when using macros. Also, it means the order of operations within each matrix multiply ended up quite different. At first sight, you might think that the multiplies from the 8x8 function (which you might also view as a kind of 8-tap filter) would be re-usable for the size-8 multiplies in the 8x4 or 4x8 function. Yes, the instructions are similar, save for using .4h elements rather than .8h elements, but that has significant impacts on scheduling. For example, the Cortex-A72, which is my primary target, can only do NEON bit-shifts in one pipeline at once, irrespective of whether the vectors are 64-bit or 128-bit long, while other instructions don't have such restrictions. So while in theory you could factor some of this code out more, I suspect any attempt to do so would have a detrimental effect on performance.
Ben
Re: [FFmpeg-devel] [PATCH] libavutil/hwcontext_vaapi: Re-enable support for libva v1
On Thu, 2022-03-31 at 14:58 +, Xiang, Haihao wrote: > On Tue, 2022-03-29 at 14:37 +, Xiang, Haihao wrote: > > On Fri, 2022-03-11 at 13:24 +0100, Ingo Brückl wrote: > > > Commit e050959103f375e6494937fa28ef2c4d2d15c9ef implemented passing in > > > modifiers by using the PRIME_2 memory type, which only exists in v2 of > > > the library. > > > > > > To still support v1 of the library, conditionally compile using > > > VA_CHECK_VERSION() for both the new code and the old code before > > > the commit. > > > --- > > > libavutil/hwcontext_vaapi.c | 57 - > > > 1 file changed, 56 insertions(+), 1 deletion(-) > > > > > > diff --git a/libavutil/hwcontext_vaapi.c b/libavutil/hwcontext_vaapi.c > > > index 994b744e4d..799490442e 100644 > > > --- a/libavutil/hwcontext_vaapi.c > > > +++ b/libavutil/hwcontext_vaapi.c > > > @@ -1026,7 +1026,12 @@ static void vaapi_unmap_from_drm(AVHWFramesContext > > > *dst_fc, > > > static int vaapi_map_from_drm(AVHWFramesContext *src_fc, AVFrame *dst, > > >const AVFrame *src, int flags) > > > { > > > +#if VA_CHECK_VERSION(2, 0, 0) > > > VAAPIFramesContext *src_vafc = src_fc->internal->priv; > > > +int use_prime2; > > > +#else > > > +int k; > > > +#endif > > > AVHWFramesContext *dst_fc = > > > (AVHWFramesContext*)dst->hw_frames_ctx->data; > > > AVVAAPIDeviceContext *dst_dev = dst_fc->device_ctx->hwctx; > > > @@ -1034,10 +1039,28 @@ static int vaapi_map_from_drm(AVHWFramesContext > > > *src_fc, AVFrame *dst, > > > const VAAPIFormatDescriptor *format_desc; > > > VASurfaceID surface_id; > > > VAStatus vas = VA_STATUS_SUCCESS; > > > -int use_prime2; > > > uint32_t va_fourcc; > > > int err, i, j; > > > > > > +#if !VA_CHECK_VERSION(2, 0, 0) > > > +unsigned long buffer_handle; > > > +VASurfaceAttribExternalBuffers buffer_desc; > > > +VASurfaceAttrib attrs[2] = { > > > +{ > > > +.type = VASurfaceAttribMemoryType, > > > +.flags = VA_SURFACE_ATTRIB_SETTABLE, > > > +.value.type= VAGenericValueTypeInteger, > > > +.value.value.i = 
VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME, > > > +}, > > > +{ > > > +.type = VASurfaceAttribExternalBufferDescriptor, > > > +.flags = VA_SURFACE_ATTRIB_SETTABLE, > > > +.value.type= VAGenericValueTypePointer, > > > +.value.value.p = &buffer_desc, > > > +} > > > +}; > > > +#endif > > > + > > > desc = (AVDRMFrameDescriptor*)src->data[0]; > > > > > > if (desc->nb_objects != 1) { > > > @@ -1072,6 +1095,7 @@ static int vaapi_map_from_drm(AVHWFramesContext > > > *src_fc, > > > AVFrame *dst, > > > format_desc = vaapi_format_from_fourcc(va_fourcc); > > > av_assert0(format_desc); > > > > > > +#if VA_CHECK_VERSION(2, 0, 0) > > > use_prime2 = !src_vafc->prime_2_import_unsupported && > > > desc->objects[0].format_modifier != > > > DRM_FORMAT_MOD_INVALID; > > > if (use_prime2) { > > > @@ -1183,6 +1207,37 @@ static int vaapi_map_from_drm(AVHWFramesContext > > > *src_fc, AVFrame *dst, > > > &surface_id, 1, > > > buffer_attrs, > > > FF_ARRAY_ELEMS(buffer_attrs)); > > > } > > > +#else > > > +buffer_handle = desc->objects[0].fd; > > > +buffer_desc.pixel_format = va_fourcc; > > > +buffer_desc.width= src_fc->width; > > > +buffer_desc.height = src_fc->height; > > > +buffer_desc.data_size= desc->objects[0].size; > > > +buffer_desc.buffers = &buffer_handle; > > > +buffer_desc.num_buffers = 1; > > > +buffer_desc.flags= 0; > > > + > > > +k = 0; > > > +for (i = 0; i < desc->nb_layers; i++) { > > > +for (j = 0; j < desc->layers[i].nb_planes; j++) { > > > +buffer_desc.pitches[k] = desc->layers[i].planes[j].pitch; > > > +buffer_desc.offsets[k] = desc->layers[i].planes[j].offset; > > > +++k; > > > +} > > > +} > > > +buffer_desc.num_planes = k; > > > + > > > +if (format_desc->chroma_planes_swapped && > > > +buffer_desc.num_planes == 3) { > > > +FFSWAP(uint32_t, buffer_desc.pitches[1], buffer_desc.pitches[2]); > > > +FFSWAP(uint32_t, buffer_desc.offsets[1], buffer_desc.offsets[2]); > > > +} > > > + > > > +vas = vaCreateSurfaces(dst_dev->display, format_desc->rt_format, > > > + src->width, src->height, > > > + &surface_id, 1, > > > 
+ attrs, FF_ARRAY_ELEMS(attrs)); > > > +#endif > > > if (vas != VA_STATUS_SUCCESS) { > > > av_log(dst_fc, AV_LOG_ERROR, "Failed to create surface from DRM " > > > "object: %d (%s).\n", vas, vaErrorStr(vas)); > > > > LGTM, will apply > > I'm sorry I didn't notice you are using `VA_CHECK_VERSION(2, 0, 0)`. PRIME_2 > is >
Re: [FFmpeg-devel] [PATCH 05/10] avcodec/vc1: Arm 64-bit NEON deblocking filter fast paths
On 30/03/2022 13:35, Martin Storsjö wrote: Overall, the code looks sensible to me. Would it make sense to share the core of the filter between the horizontal/vertical cases with e.g. a macro? (I didn't check in detail if there's much differences in the core of the filter. At most some differences in condition registers for partial writeout in the horizontal forms?) Well, looking at the comments at the right-hand side of the source, which give the logical meaning of the results of each instruction, I admit there's a resemblance in the middle of the 8-pixel-pair function. However, the physical register assignments are quite different, and attempting to reassign the registers in one to match the other isn't a trivial task. It's hard enough when you start register assignment from the top of a function and work your way down, as I have done here. In the 16-pixel-pair case, the fact that the input values arrive in a different order as the result of them, in one case, being loaded in regularly-increasing address order, and in the other, falling out of a matrix transposition, has resulted in even the logical order of instructions being quite different in the two cases. In the 4-pixel-pair case, the values are packed differently into registers in the two cases, because in the v case, we're loading 4 pixels between row-strides, which means it's easy to place each row in its own vector, whereas in the h case we load 4 rows of 8 pixels each and transpose, which leaves the values in 4 vectors rather than 8. Some of the filtering steps can be performed with the data packed in this way (calculating a1 and a2) while waiting for it to be restructured in order to calculate the other metrics, but it's not worth packing the data together in this way in the v case given that it starts off already separated. So the two implementations end up quite different in the operations they perform, not just the scheduling of instructions and in register assignment terms. 
Some background: as you may have guessed, I didn't start out writing these functions as they currently appear. Prototype versions didn't care much for scheduling or keeping to a small number of registers. They were primarily for checking the correctness of the mathematics, and they'd use all available vectors, sometimes shuffling values between registers or to the stack to make room. Once I'd verified correctness, I then reworked them to keep to a minimal number of registers and to minimise stalls as far as possible. I'm targeting the Cortex-A72, since that's what the Raspberry Pi 4 uses and it's on the cusp of having enough power to decode VC-1 BluRay streams, so I deliberately didn't take too much consideration of the requirements of earlier cores. Yes, it's an out-of-order core, but I reckoned there are probably limits to how wisely it can select instructions to execute (there have got to be limits to instruction queue lengths, for example). So based on the pipeline structure documented in Arm's Cortex-A72 software optimization guide, I arranged the instructions to best keep all pipelines busy as much as possible, then assigned registers to keep the instructions in this order. For the most part, I was able to keep the number of vectors used low enough that no callee-saving was required - or failing that, at least avoiding having to spill values to the stack mid-function. But it came pretty close at times - witness for example the peculiar order in which vectors had to be loaded in the AArch32 version of ff_vc1_h_loop_filter16_neon. There's reason behind that! In short, I'd really rather not tamper with these larger assembly functions any more unless I really have to.
Ben
Re: [FFmpeg-devel] [PATCH] libavutil/hwcontext_vaapi: Re-enable support for libva v1
On Tue, 2022-03-29 at 14:37 +, Xiang, Haihao wrote: > On Fri, 2022-03-11 at 13:24 +0100, Ingo Brückl wrote: > > Commit e050959103f375e6494937fa28ef2c4d2d15c9ef implemented passing in > > modifiers by using the PRIME_2 memory type, which only exists in v2 of > > the library. > > > > To still support v1 of the library, conditionally compile using > > VA_CHECK_VERSION() for both the new code and the old code before > > the commit. > > --- > > libavutil/hwcontext_vaapi.c | 57 - > > 1 file changed, 56 insertions(+), 1 deletion(-) > > > > diff --git a/libavutil/hwcontext_vaapi.c b/libavutil/hwcontext_vaapi.c > > index 994b744e4d..799490442e 100644 > > --- a/libavutil/hwcontext_vaapi.c > > +++ b/libavutil/hwcontext_vaapi.c > > @@ -1026,7 +1026,12 @@ static void vaapi_unmap_from_drm(AVHWFramesContext > > *dst_fc, > > static int vaapi_map_from_drm(AVHWFramesContext *src_fc, AVFrame *dst, > >const AVFrame *src, int flags) > > { > > +#if VA_CHECK_VERSION(2, 0, 0) > > VAAPIFramesContext *src_vafc = src_fc->internal->priv; > > +int use_prime2; > > +#else > > +int k; > > +#endif > > AVHWFramesContext *dst_fc = > > (AVHWFramesContext*)dst->hw_frames_ctx->data; > > AVVAAPIDeviceContext *dst_dev = dst_fc->device_ctx->hwctx; > > @@ -1034,10 +1039,28 @@ static int vaapi_map_from_drm(AVHWFramesContext > > *src_fc, AVFrame *dst, > > const VAAPIFormatDescriptor *format_desc; > > VASurfaceID surface_id; > > VAStatus vas = VA_STATUS_SUCCESS; > > -int use_prime2; > > uint32_t va_fourcc; > > int err, i, j; > > > > +#if !VA_CHECK_VERSION(2, 0, 0) > > +unsigned long buffer_handle; > > +VASurfaceAttribExternalBuffers buffer_desc; > > +VASurfaceAttrib attrs[2] = { > > +{ > > +.type = VASurfaceAttribMemoryType, > > +.flags = VA_SURFACE_ATTRIB_SETTABLE, > > +.value.type= VAGenericValueTypeInteger, > > +.value.value.i = VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME, > > +}, > > +{ > > +.type = VASurfaceAttribExternalBufferDescriptor, > > +.flags = VA_SURFACE_ATTRIB_SETTABLE, > > +.value.type= 
VAGenericValueTypePointer, > > +.value.value.p = &buffer_desc, > > +} > > +}; > > +#endif > > + > > desc = (AVDRMFrameDescriptor*)src->data[0]; > > > > if (desc->nb_objects != 1) { > > @@ -1072,6 +1095,7 @@ static int vaapi_map_from_drm(AVHWFramesContext > > *src_fc, > > AVFrame *dst, > > format_desc = vaapi_format_from_fourcc(va_fourcc); > > av_assert0(format_desc); > > > > +#if VA_CHECK_VERSION(2, 0, 0) > > use_prime2 = !src_vafc->prime_2_import_unsupported && > > desc->objects[0].format_modifier != > > DRM_FORMAT_MOD_INVALID; > > if (use_prime2) { > > @@ -1183,6 +1207,37 @@ static int vaapi_map_from_drm(AVHWFramesContext > > *src_fc, AVFrame *dst, > > &surface_id, 1, > > buffer_attrs, FF_ARRAY_ELEMS(buffer_attrs)); > > } > > +#else > > +buffer_handle = desc->objects[0].fd; > > +buffer_desc.pixel_format = va_fourcc; > > +buffer_desc.width= src_fc->width; > > +buffer_desc.height = src_fc->height; > > +buffer_desc.data_size= desc->objects[0].size; > > +buffer_desc.buffers = &buffer_handle; > > +buffer_desc.num_buffers = 1; > > +buffer_desc.flags= 0; > > + > > +k = 0; > > +for (i = 0; i < desc->nb_layers; i++) { > > +for (j = 0; j < desc->layers[i].nb_planes; j++) { > > +buffer_desc.pitches[k] = desc->layers[i].planes[j].pitch; > > +buffer_desc.offsets[k] = desc->layers[i].planes[j].offset; > > +++k; > > +} > > +} > > +buffer_desc.num_planes = k; > > + > > +if (format_desc->chroma_planes_swapped && > > +buffer_desc.num_planes == 3) { > > +FFSWAP(uint32_t, buffer_desc.pitches[1], buffer_desc.pitches[2]); > > +FFSWAP(uint32_t, buffer_desc.offsets[1], buffer_desc.offsets[2]); > > +} > > + > > +vas = vaCreateSurfaces(dst_dev->display, format_desc->rt_format, > > + src->width, src->height, > > + &surface_id, 1, > > + attrs, FF_ARRAY_ELEMS(attrs)); > > +#endif > > if (vas != VA_STATUS_SUCCESS) { > > av_log(dst_fc, AV_LOG_ERROR, "Failed to create surface from DRM " > > "object: %d (%s).\n", vas, vaErrorStr(vas)); > > LGTM, will apply I'm sorry I didn't notice you are using `VA_CHECK_VERSION(2, 0, 0)`. 
PRIME_2 is available since VA-API 1.0, so you should use `VA_CHECK_VERSION(1, 0, 0)`. Could you please update your patch? Thanks Haihao
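The effect of picking the wrong threshold in a compile-time version check can be shown with a small self-contained sketch. The DEMO_* macros below are hypothetical stand-ins modelled on the usual major/minor/micro comparison; they are not copied from libva's headers, and the 1.4.0 version number is purely illustrative.

```c
/* Hypothetical stand-ins for values a library's version header provides. */
#define DEMO_MAJOR_VERSION 1
#define DEMO_MINOR_VERSION 4
#define DEMO_MICRO_VERSION 0

/* True when the headers compiled against are at least major.minor.micro. */
#define DEMO_CHECK_VERSION(major, minor, micro)                            \
    (DEMO_MAJOR_VERSION > (major) ||                                       \
     (DEMO_MAJOR_VERSION == (major) && DEMO_MINOR_VERSION > (minor)) ||    \
     (DEMO_MAJOR_VERSION == (major) && DEMO_MINOR_VERSION == (minor) &&    \
      DEMO_MICRO_VERSION >= (micro)))

/* Reports which branch the preprocessor kept for a 1.0.0 threshold. */
static const char *import_path_with_1_0_0(void)
{
#if DEMO_CHECK_VERSION(1, 0, 0)
    return "new";      /* newer import path is compiled in */
#else
    return "fallback"; /* older fallback path is compiled in */
#endif
}

/* Same check with a 2.0.0 threshold: with an API version of 1.4.0 this
 * selects the fallback even though the newer feature would be available. */
static const char *import_path_with_2_0_0(void)
{
#if DEMO_CHECK_VERSION(2, 0, 0)
    return "new";
#else
    return "fallback";
#endif
}
```

With these illustrative numbers the first function reports the new path and the second the fallback, which is the kind of mismatch the review comment points at.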
Re: [FFmpeg-devel] [PATCH 04/10] avcodec/vc1: Introduce fast path for unescaping bitstream buffer
On Thu, 31 Mar 2022, Ben Avison wrote: On 29/03/2022 21:37, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: As with the rest of the checkasm tests - please unmacro most things where possible (except for the RANDOMIZE_* macros, those are ok to keep macroed if you want to). In the case of TEST_UNESCAPE, I think it has to remain as a macro, otherwise the next function up ends up with a declare_func_emms() and a bench_new() but no call_ref() or call_new(), which means some builds end up with an unused function warning. Oh, right - yes, call_ref and call_new need to be in the same scope as declare_func, yes. I can, however, split all the unescape tests out of checkasm_check_vc1dsp into a separate function (and separate functions for inverse-transform and deblocking tests). Awesome, thanks! // Martin
Re: [FFmpeg-devel] [PATCH 04/10] avcodec/vc1: Introduce fast path for unescaping bitstream buffer
On 29/03/2022 21:37, Martin Storsjö wrote: On Fri, 25 Mar 2022, Ben Avison wrote: +#define TEST_UNESCAPE \ + do { \ + for (int count = 100; count > 0; --count) { \ + escaped_offset = rnd() & 7; \ + unescaped_offset = rnd() & 7; \ + escaped_len = (1u << (rnd() % 8) + 3) - (rnd() & 7); \ + RANDOMIZE_BUFFER8(unescaped, UNESCAPE_BUF_SIZE); \ The output buffer will be overwritten in the end, but I guess this initialization is useful for making sure that the test doesn't accidentally rely on the output from the previous iteration, right? The main idea was to catch examples of writing to the buffer beyond the length reported (and less likely, writes before the start of the buffer). I suppose it's possible that someone might want to deliberately overwrite in specific conditions, but the test could always be loosened up at that point once those conditions become clearer. + len0 = call_ref(escaped0 + escaped_offset, escaped_len, unescaped0 + unescaped_offset); \ + len1 = call_new(escaped1 + escaped_offset, escaped_len, unescaped1 + unescaped_offset); \ + if (len0 != len1 || memcmp(unescaped0, unescaped1, len0)) \ Don't you need to include unescaped_offset here too? Otherwise you're just checking areas of the buffer that wasn't necessarily written. I realise I should have made the memcmp length UNESCAPE_BUF_SIZE here to achieve what I intended. Testing len0 bytes from the start of the buffer neither checks all the written bytes nor checks the byte after those written :-$ As with the rest of the checkasm tests - please unmacro most things where possible (except for the RANDOMIZE_* macros, those are ok to keep macroed if you want to). In the case of TEST_UNESCAPE, I think it has to remain as a macro, otherwise the next function up ends up with a declare_func_emms() and a bench_new() but no call_ref() or call_new(), which means some builds end up with an unused function warning. 
I can, however, split all the unescape tests out of checkasm_check_vc1dsp into a separate function (and separate functions for inverse-transform and deblocking tests). Ben
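The point about the memcmp length can be illustrated with a small sketch. It assumes, as in the quoted test, that both output buffers are filled with identical data before the functions under test run; DEMO_BUF_SIZE and both helper names are hypothetical and stand in for UNESCAPE_BUF_SIZE and the checkasm comparison.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_BUF_SIZE 32 /* hypothetical stand-in for UNESCAPE_BUF_SIZE */

/* Compares only the first len bytes from the start of each buffer.
 * When the output was written at an offset, this covers neither all of
 * the written bytes nor the byte just past them. */
static int match_prefix_only(const uint8_t *ref, const uint8_t *cand, int len)
{
    return memcmp(ref, cand, (size_t)len) == 0;
}

/* Compares the whole buffers. Because both started out identical, any
 * write outside the reported length shows up as a mismatch. */
static int match_whole_buffer(const uint8_t *ref, const uint8_t *cand)
{
    return memcmp(ref, cand, DEMO_BUF_SIZE) == 0;
}
```

If a candidate implementation writes one byte past the reported length, the prefix comparison can still report a match while the whole-buffer comparison does not.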
Re: [FFmpeg-devel] [PATCH 2/4 v2] ffmpeg: ensure a keyframe was not seen before skipping packets
On 3/31/2022 8:47 AM, Anton Khirnov wrote: Quoting James Almer (2022-02-23 16:03:53) A keyframe could be buffered in the bsf and not be output until more packets had been fed to it. Signed-off-by: James Almer --- Changed the check from pkt to !eof, since a packet is always provided. fftools/ffmpeg.c | 4 +++- fftools/ffmpeg.h | 1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c index 44043ef203..2b61c0d5aa 100644 --- a/fftools/ffmpeg.c +++ b/fftools/ffmpeg.c @@ -890,6 +890,8 @@ static void output_packet(OutputFile *of, AVPacket *pkt, /* apply the output bitstream filters */ if (ost->bsf_ctx) { +if (!eof && pkt->flags & AV_PKT_FLAG_KEY) +ost->seen_kf = 1; Shouldn't this also be set when no bsfs are used? Afaict only in streamcopy with bsfs scenarios can packets be temporarily withheld, so it's not necessary.
Re: [FFmpeg-devel] [PATCH 2/4 v2] ffmpeg: ensure a keyframe was not seen before skipping packets
Quoting James Almer (2022-02-23 16:03:53) > A keyframe could be buffered in the bsf and not be output until more packets > had been fed to it. > > Signed-off-by: James Almer > --- > Changed the check from pkt to !eof, since a packet is always provided. > > fftools/ffmpeg.c | 4 +++- > fftools/ffmpeg.h | 1 + > 2 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/fftools/ffmpeg.c b/fftools/ffmpeg.c > index 44043ef203..2b61c0d5aa 100644 > --- a/fftools/ffmpeg.c > +++ b/fftools/ffmpeg.c > @@ -890,6 +890,8 @@ static void output_packet(OutputFile *of, AVPacket *pkt, > > /* apply the output bitstream filters */ > if (ost->bsf_ctx) { > +if (!eof && pkt->flags & AV_PKT_FLAG_KEY) > +ost->seen_kf = 1; Shouldn't this also be set when no bsfs are used? -- Anton Khirnov
Re: [FFmpeg-devel] [PATCH 6/7] avcodec: Make avcodec_decoder_subtitles2 accept a const AVPacket*
Quoting Andreas Rheinhardt (2022-03-31 00:49:57) > From: Andreas Rheinhardt > > Signed-off-by: Andreas Rheinhardt > --- > doc/APIchanges| 3 +++ > fftools/ffmpeg.c | 4 ++-- > fftools/ffprobe.c | 2 +- > libavcodec/avcodec.h | 3 +-- > libavcodec/decode.c | 9 - > libavcodec/version.h | 2 +- > tools/target_dec_fuzzer.c | 4 ++-- > 7 files changed, 14 insertions(+), 13 deletions(-) > > diff --git a/doc/APIchanges b/doc/APIchanges > index 1a9f0a303e..326a3c721c 100644 > --- a/doc/APIchanges > +++ b/doc/APIchanges > @@ -14,6 +14,9 @@ libavutil: 2021-04-27 > > API changes, most recent first: > > +2022-03-30 - xx - lavc 59.26.100 - avcodec.h > + avcodec_decode_subtitle2() now accepts const AVPacket*. I vaguely recall C++ having a problem with such changes. Anybody remembers the details? Do we care? -- Anton Khirnov
[FFmpeg-devel] [PATCH v2] doc/filters: document vf_libplacebo
From: Niklas Haas Signed-off-by: Niklas Haas --- Changes in v2: - expand documentation of tone mapping curves - slight rewording of some sections - add more examples --- doc/filters.texi | 494 +++ 1 file changed, 494 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index 1d56d24819..a6f2f1397e 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -14793,6 +14793,500 @@ ffmpeg -i input.mov -vf lensfun=make=Canon:model="Canon EOS 100D":lens_model="Ca @end itemize +@section libplacebo + +Flexible GPU-accelerated processing filter based on libplacebo +(@url{https://code.videolan.org/videolan/libplacebo}). Note that this filter +currently only accepts Vulkan input frames. + +@subsection Options + +The options for this filter are divided into the following sections: + +@subsubsection Output mode +These options control the overall output mode. By default, libplacebo will try +to preserve the source colorimetry and size as best as it can, but it will +apply any embedded film grain, dolby vision metadata or anamorphic SAR present +in source frames. +@table @option +@item w +@item h +Set the output video dimension expression. Default value is the input dimension. + +Allows for the same expressions as the @ref{scale} filter. + +@item format +Set the output format override. If unset (the default), frames will be output +in the same format as the respective input frames. Otherwise, format conversion +will be performed. + +@item force_original_aspect_ratio +@item force_divisible_by +Work the same as the identical @ref{scale} filter options. + +@item normalize_sar +If enabled (the default), output frames will always have a pixel aspect ratio +of 1:1. If disabled, any aspect ratio mismatches, including those from e.g. +anamorphic video sources, are forwarded to the output pixel aspect ratio. 
+ +@item pad_crop_ratio +Specifies a ratio (between @code{0.0} and @code{1.0}) between padding and +cropping when the input aspect ratio does not match the output aspect ratio and +@option{normalize_sar} is in effect. The default of @code{0.0} always pads the +content with black borders, while a value of @code{1.0} always crops off parts +of the content. Intermediate values are possible, leading to a mix of the two +approaches. + +@item colorspace +@item color_primaries +@item color_trc +@item range +Configure the colorspace that output frames will be delivered in. The default +value of @code{auto} outputs frames in the same format as the input frames, +leading to no change. For any other value, conversion will be performed. + +See the @ref{setparams} filter for a list of possible values. + +@item apply_filmgrain +Apply film grain (e.g. AV1 or H.274) if present in source frames, and strip +it from the output. Enabled by default. + +@item apply_dolbyvision +Apply Dolby Vision RPU metadata if present in source frames, and strip it from +the output. Enabled by default. Note that Dolby Vision will always output +BT.2020+PQ, overriding the usual input frame metadata. These will also be +picked as the values of @code{auto} for the respective frame output options. +@end table + +@subsubsection Scaling +The options in this section control how libplacebo performs upscaling and (if +necessary) downscaling. Note that libplacebo will always internally operate on +4:4:4 content, so any sub-sampled chroma formats such as @code{yuv420p} will +necessarily be upsampled and downsampled as part of the rendering process. That +means scaling might be in effect even if the source and destination resolution +are the same. +@table @option +@item upscaler +@item downscaler +Configure the filter kernel used for upscaling and downscaling. The respective +defaults are @code{spline36} and @code{mitchell}. For a full list of possible +values, pass @code{help} to these options. 
The most important values are: +@table @samp + +@item none +Forces the use of built-in GPU texture sampling (typically bilinear). Extremely +fast but poor quality, especially when downscaling. + +@item bilinear +Bilinear interpolation. Can generally be done for free on GPUs, except when +doing so would lead to aliasing. Fast and low quality. + +@item nearest +Nearest-neighbour interpolation. Sharp but highly aliasing. + +@item oversample +Algorithm that looks visually similar to nearest-neighbour interpolation but +tries to preserve pixel aspect ratio. Good for pixel art, since it results in +minimal distortion of the artistic appearance. + +@item lanczos +Standard sinc-sinc interpolation kernel. + +@item spline36 +Cubic spline approximation of lanczos. No difference in performance, but has +very slightly less ringing. + +@item ewa_lanczos +Elliptically weighted average version of lanczos, based on a jinc-sinc kernel. +This is also popularly referred to as just "Jinc scaling". Slow but very high +quality. + +@item gaussian +Gaussian kernel. Has certain ideal mathematical properties, but subjectively +very blurry. +
Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_libplacebo: update for new tone mapping API
Applied as e301a24fa191ad19574289b765ff1946b23c03f3 On Fri, 25 Mar 2022 16:11:19 +0100 Niklas Haas wrote: > From: Niklas Haas > > Upstream gained a new tone-mapping API, which we never switched to. We > don't need a version bump for this because it was included as part of > the v4.192 release we currently already depend on. > > Some of the old options can be moderately approximated with the new API, > but specifically "desaturation_base" and "max_boost" cannot. Remove > these entirely, rather than deprecating them. They have actually been > non-functional for a while as a result of the upstream deprecation. > > Signed-off-by: Niklas Haas > --- > Changes in v2: > - Avoid use of strings in favor of replicating the enum values > - Fix two wrong enum option value ranges > - Simplify the option setting code again slightly > --- > libavfilter/vf_libplacebo.c | 112 > 1 file changed, 89 insertions(+), 23 deletions(-) > > diff --git a/libavfilter/vf_libplacebo.c b/libavfilter/vf_libplacebo.c > index 31ae28ac38..8ce6462c66 100644 > --- a/libavfilter/vf_libplacebo.c > +++ b/libavfilter/vf_libplacebo.c > @@ -26,6 +26,33 @@ > #include > #include > > +enum { > +TONE_MAP_AUTO, > +TONE_MAP_CLIP, > +TONE_MAP_BT2390, > +TONE_MAP_BT2446A, > +TONE_MAP_SPLINE, > +TONE_MAP_REINHARD, > +TONE_MAP_MOBIUS, > +TONE_MAP_HABLE, > +TONE_MAP_GAMMA, > +TONE_MAP_LINEAR, > +TONE_MAP_COUNT, > +}; > + > +static const struct pl_tone_map_function * const > tonemapping_funcs[TONE_MAP_COUNT] = { > +[TONE_MAP_AUTO] = &pl_tone_map_auto, > +[TONE_MAP_CLIP] = &pl_tone_map_clip, > +[TONE_MAP_BT2390] = &pl_tone_map_bt2390, > +[TONE_MAP_BT2446A] = &pl_tone_map_bt2446a, > +[TONE_MAP_SPLINE] = &pl_tone_map_spline, > +[TONE_MAP_REINHARD] = &pl_tone_map_reinhard, > +[TONE_MAP_MOBIUS] = &pl_tone_map_mobius, > +[TONE_MAP_HABLE]= &pl_tone_map_hable, > +[TONE_MAP_GAMMA]= &pl_tone_map_gamma, > +[TONE_MAP_LINEAR] = &pl_tone_map_linear, > +}; > + > typedef struct LibplaceboContext { > /* lavfi vulkan*/ > FFVulkanContext vkctx; > @@ -91,12 +118,16 @@ 
typedef struct LibplaceboContext { > > /* pl_color_map_params */ > int intent; > +int gamut_mode; > int tonemapping; > float tonemapping_param; > +int tonemapping_mode; > +int inverse_tonemapping; > +float crosstalk; > +int tonemapping_lut_size; > +/* for backwards compatibility */ > float desat_str; > float desat_exp; > -float desat_base; > -float max_boost; > int gamut_warning; > int gamut_clipping; > > @@ -281,6 +312,8 @@ static int process_frames(AVFilterContext *avctx, AVFrame > *out, AVFrame *in) > int err = 0, ok; > LibplaceboContext *s = avctx->priv; > struct pl_render_params params; > +enum pl_tone_map_mode tonemapping_mode = s->tonemapping_mode; > +enum pl_gamut_mode gamut_mode = s->gamut_mode; > struct pl_frame image, target; > ok = pl_map_avframe_ex(s->gpu, , pl_avframe_params( > .frame= in, > @@ -305,6 +338,24 @@ static int process_frames(AVFilterContext *avctx, > AVFrame *out, AVFrame *in) > pl_rect2df_aspect_set(, aspect, s->pad_crop_ratio); > } > > +/* backwards compatibility with older API */ > +if (!tonemapping_mode && (s->desat_str >= 0.0f || s->desat_exp >= 0.0f)) > { > +float str = s->desat_str < 0.0f ? 0.9f : s->desat_str; > +float exp = s->desat_exp < 0.0f ? 
0.2f : s->desat_exp; > +if (str >= 0.9f && exp <= 0.1f) { > +tonemapping_mode = PL_TONE_MAP_RGB; > +} else if (str > 0.1f) { > +tonemapping_mode = PL_TONE_MAP_HYBRID; > +} else { > +tonemapping_mode = PL_TONE_MAP_LUMA; > +} > +} > + > +if (s->gamut_warning) > +gamut_mode = PL_GAMUT_WARN; > +if (s->gamut_clipping) > +gamut_mode = PL_GAMUT_DESATURATE; > + > /* Update render params */ > params = (struct pl_render_params) { > PL_RENDER_DEFAULTS > @@ -338,14 +389,13 @@ static int process_frames(AVFilterContext *avctx, > AVFrame *out, AVFrame *in) > > .color_map_params = pl_color_map_params( > .intent = s->intent, > -.tone_mapping_algo = s->tonemapping, > +.gamut_mode = gamut_mode, > +.tone_mapping_function = tonemapping_funcs[s->tonemapping], > .tone_mapping_param = s->tonemapping_param, > -.desaturation_strength = s->desat_str, > -.desaturation_exponent = s->desat_exp, > -.desaturation_base = s->desat_base, > -.max_boost = s->max_boost, > -.gamut_warning = s->gamut_warning, > -.gamut_clipping = s->gamut_clipping, > +.tone_mapping_mode = tonemapping_mode, > +.inverse_tone_mapping = s->inverse_tonemapping, > +.tone_mapping_crosstalk = s->crosstalk, > +
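For readers following the backwards-compatibility branch in the patch above, here is a minimal standalone sketch of the same decision logic. The enum names below are hypothetical stand-ins for libplacebo's pl_tone_map_mode values, and the reading that a negative desat_str/desat_exp means "option not set" is an assumption inferred from the fallback values in the patch, not something the mail states.

```c
/* Hypothetical stand-ins for libplacebo's tone-mapping modes;
 * the real enum lives in the libplacebo headers. */
enum demo_tone_map_mode {
    DEMO_MODE_AUTO = 0,
    DEMO_MODE_RGB,
    DEMO_MODE_LUMA,
    DEMO_MODE_HYBRID,
};

/* Pick a tone-mapping mode from the legacy desaturation options.
 * Assumption: a negative value means the user did not set the option. */
static enum demo_tone_map_mode
demo_legacy_mode(enum demo_tone_map_mode mode, float desat_str, float desat_exp)
{
    float str, expo;

    /* An explicit mode wins; with no legacy option set, nothing changes. */
    if (mode != DEMO_MODE_AUTO || (desat_str < 0.0f && desat_exp < 0.0f))
        return mode;

    /* Fall back to the same values the patch uses for an unset option. */
    str  = desat_str < 0.0f ? 0.9f : desat_str;
    expo = desat_exp < 0.0f ? 0.2f : desat_exp;

    if (str >= 0.9f && expo <= 0.1f)
        return DEMO_MODE_RGB;     /* strong desaturation, flat exponent */
    if (str > 0.1f)
        return DEMO_MODE_HYBRID;  /* moderate desaturation */
    return DEMO_MODE_LUMA;        /* (almost) no desaturation */
}
```

With this sketch, demo_legacy_mode(DEMO_MODE_AUTO, 0.5f, -1.0f) selects the hybrid mode, while passing any non-auto mode returns it unchanged regardless of the legacy options.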
Re: [FFmpeg-devel] [PATCH 8/8] avcodec/codec_internal: Include codec_tags only when they are needed
Andreas Rheinhardt: > They are only needed for the fuzzer, so check for CONFIG_OSSFUZZ. > This decreases sizeof(FFCodec), which is important given that > FFCodecs reside in .data.rel.ro in case of ELF with > position-independent code which is always loaded and can't be shared > between processes. > > Signed-off-by: Andreas Rheinhardt > --- > libavcodec/bitpacked_dec.c | 5 + > libavcodec/codec_internal.h | 10 ++ > libavcodec/hapdec.c | 13 + > tools/target_dec_fuzzer.c | 2 ++ > 4 files changed, 18 insertions(+), 12 deletions(-) > > diff --git a/libavcodec/bitpacked_dec.c b/libavcodec/bitpacked_dec.c > index 419550dfe0..b62d88fa8f 100644 > --- a/libavcodec/bitpacked_dec.c > +++ b/libavcodec/bitpacked_dec.c > @@ -151,9 +151,6 @@ const FFCodec ff_bitpacked_decoder = { > .init = bitpacked_init_decoder, > .decode = bitpacked_decode, > .p.capabilities = AV_CODEC_CAP_FRAME_THREADS, > -.codec_tags = (const uint32_t []){ > -MKTAG('U', 'Y', 'V', 'Y'), > -FF_CODEC_TAGS_END, > -}, > .caps_internal = FF_CODEC_CAP_INIT_THREADSAFE, > +FF_CODEC_TAGS(MKTAG('U', 'Y', 'V', 'Y')) > }; > diff --git a/libavcodec/codec_internal.h b/libavcodec/codec_internal.h > index 596cdbebd2..b6b5b05b44 100644 > --- a/libavcodec/codec_internal.h > +++ b/libavcodec/codec_internal.h > @@ -21,6 +21,7 @@ > > #include > > +#include "config.h" > #include "libavutil/attributes.h" > #include "codec.h" > > @@ -74,10 +75,16 @@ > */ > #define FF_CODEC_CAP_SETS_FRAME_PROPS (1 << 8) > > +#if CONFIG_OSSFUZZ > /** > * FFCodec.codec_tags termination value > */ > #define FF_CODEC_TAGS_END -1 > +#define FF_CODEC_TAGS(...) \ > +.codec_tags = (const uint32_t[]){ __VA_ARGS__, FF_CODEC_TAGS_END }, > +#else > +#define FF_CODEC_TAGS(...) > +#endif > > typedef struct FFCodecDefault { > const char *key; > @@ -196,10 +203,13 @@ typedef struct FFCodec { > */ > const struct AVCodecHWConfigInternal *const *hw_configs; > > +#if CONFIG_OSSFUZZ > /** > * List of supported codec_tags, terminated by FF_CODEC_TAGS_END. 
> + * Should be defined with the FF_CODEC_TAGS() macro. > */ > const uint32_t *codec_tags; > +#endif > } FFCodec; > > static av_always_inline const FFCodec *ffcodec(const AVCodec *codec) > diff --git a/libavcodec/hapdec.c b/libavcodec/hapdec.c > index 4a7ac15a8e..72f922bc5b 100644 > --- a/libavcodec/hapdec.c > +++ b/libavcodec/hapdec.c > @@ -486,12 +486,9 @@ const FFCodec ff_hap_decoder = { >AV_CODEC_CAP_DR1, > .caps_internal = FF_CODEC_CAP_INIT_THREADSAFE | >FF_CODEC_CAP_INIT_CLEANUP, > -.codec_tags = (const uint32_t []){ > -MKTAG('H','a','p','1'), > -MKTAG('H','a','p','5'), > -MKTAG('H','a','p','Y'), > -MKTAG('H','a','p','A'), > -MKTAG('H','a','p','M'), > -FF_CODEC_TAGS_END, > -}, > +FF_CODEC_TAGS(MKTAG('H','a','p','1'), > + MKTAG('H','a','p','5'), > + MKTAG('H','a','p','Y'), > + MKTAG('H','a','p','A'), > + MKTAG('H','a','p','M')) > }; > diff --git a/tools/target_dec_fuzzer.c b/tools/target_dec_fuzzer.c > index 288aa63313..77f4bb8dd8 100644 > --- a/tools/target_dec_fuzzer.c > +++ b/tools/target_dec_fuzzer.c > @@ -279,12 +279,14 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t > size) { > ctx->sample_rate= bytestream2_get_le32() > & 0x7FFF; > ctx->ch_layout.nb_channels = > (unsigned)bytestream2_get_le32() % FF_SANE_NB_CHANNELS; > ctx->block_align= bytestream2_get_le32() > & 0x7FFF; > +#if CONFIG_OSSFUZZ > ctx->codec_tag = bytestream2_get_le32(); > if (c->codec_tags) { > int n; > for (n = 0; c->codec_tags[n] != FF_CODEC_TAGS_END; n++); > ctx->codec_tag = c->codec_tags[ctx->codec_tag % n]; > } > +#endif > keyframes = bytestream2_get_le64(); > request_channel_layout = bytestream2_get_le64(); > Will apply tomorrow unless there are objections. - Andreas ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
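The pattern in this patch, a struct field and its initializer macro that both disappear when a build option is off, can be tried in isolation. The sketch below is not FFmpeg code: DEMO_WITH_TAGS is a hypothetical switch standing in for CONFIG_OSSFUZZ, and DemoCodec, DEMO_CODEC_TAGS and DEMO_MKTAG are illustrative names only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build switch standing in for CONFIG_OSSFUZZ. */
#ifndef DEMO_WITH_TAGS
#define DEMO_WITH_TAGS 1
#endif

#define DEMO_TAGS_END 0xFFFFFFFFu

/* Little-endian fourcc packing, similar in spirit to MKTAG. */
#define DEMO_MKTAG(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

typedef struct DemoCodec {
    const char *name;
#if DEMO_WITH_TAGS
    const uint32_t *codec_tags; /* only present when the switch is on */
#endif
} DemoCodec;

#if DEMO_WITH_TAGS
/* Expands to a designated initializer, including its trailing comma. */
#define DEMO_CODEC_TAGS(...) \
    .codec_tags = (const uint32_t[]){ __VA_ARGS__, DEMO_TAGS_END },
#else
/* Expands to nothing, so the initializer list stays valid. */
#define DEMO_CODEC_TAGS(...)
#endif

static const DemoCodec demo_decoder = {
    .name = "demo",
    DEMO_CODEC_TAGS(DEMO_MKTAG('U', 'Y', 'V', 'Y'))
};

/* Count the tags before the terminator; 0 when tags are compiled out. */
static size_t demo_count_tags(const DemoCodec *c)
{
    size_t n = 0;
#if DEMO_WITH_TAGS
    if (c->codec_tags)
        while (c->codec_tags[n] != DEMO_TAGS_END)
            n++;
#else
    (void)c;
#endif
    return n;
}
```

Because the macro carries its own trailing comma, it has to be used without a comma after it, which is why the patch places FF_CODEC_TAGS(...) as a bare entry in the initializer.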
[FFmpeg-devel] [PATCH] libavutil/hwcontext_qsv: Align width and height when downloading qsv frame
The width and height of a qsv frame being downloaded need to be aligned to 16.
Add the alignment operation. Now the following command works:

ffmpeg -hwaccel qsv -f rawvideo -s 1920x1080 -pix_fmt yuv420p -i \
input.yuv -vf "hwupload=extra_hw_frames=16,format=qsv,hwdownload, \
format=nv12" -f null -

Signed-off-by: Wenbin Chen
---
 libavutil/hwcontext_qsv.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontext_qsv.c
index 95f8071abe..1e7c065902 100644
--- a/libavutil/hwcontext_qsv.c
+++ b/libavutil/hwcontext_qsv.c
@@ -1063,6 +1063,40 @@ static int qsv_transfer_data_from(AVHWFramesContext *ctx, AVFrame *dst,
     if (ret < 0)
         return ret;

+    /* According to MSDK spec for mfxframeinfo, "Width must be a multiple of 16.
+     * Height must be a multiple of 16 for progressive frame sequence and a
+     * multiple of 32 otherwise.", so align all frames to 16 before downloading. */
+    if (ctx->height & 15 || dst->linesize[0] & 15) {
+        AVFrame *tmp_frame;
+        tmp_frame = av_frame_alloc();
+        if (!tmp_frame)
+            return AVERROR(ENOMEM);
+        ret = av_frame_ref(tmp_frame, dst);
+        if (ret < 0) {
+            av_frame_free(&tmp_frame);
+            return ret;
+        }
+        av_frame_unref(dst);
+
+        dst->width  = FFALIGN(tmp_frame->width, 16);
+        dst->height = FFALIGN(ctx->height, 16);
+        dst->format = tmp_frame->format;
+        ret = av_frame_get_buffer(dst, 0);
+        if (ret < 0) {
+            av_frame_free(&tmp_frame);
+            return ret;
+        }
+
+        dst->width  = tmp_frame->width;
+        dst->height = tmp_frame->height;
+        ret = av_frame_copy_props(dst, tmp_frame);
+        if (ret < 0) {
+            av_frame_free(&tmp_frame);
+            return ret;
+        }
+        av_frame_free(&tmp_frame);
+    }
+
     if (!s->session_download) {
         if (s->child_frames_ref)
             return qsv_transfer_data_child(ctx, dst, src);
--
2.32.0
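As a quick illustration of the arithmetic this patch relies on, the sketch below rounds a dimension up to a multiple of 16 and repeats the patch's check for whether padding is needed. align_up() uses the same bit trick as FFmpeg's FFALIGN macro but is written here as a plain function; both helper names are hypothetical and not part of libavutil.

```c
/* Round x up to the next multiple of a; a must be a power of two.
 * Same arithmetic as FFmpeg's FFALIGN, written as a function for clarity. */
static int align_up(int x, int a)
{
    return (x + a - 1) & ~(a - 1);
}

/* Nonzero when either value is not a multiple of 16, i.e. the case in
 * which the patch allocates a padded destination buffer. */
static int needs_padding(int height, int linesize)
{
    return (height & 15) || (linesize & 15);
}
```

For the 1920x1080 input in the commit message, the width is already a multiple of 16, while align_up(1080, 16) gives 1088, so only the height gains padding rows.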
Re: [FFmpeg-devel] [PATCH 1/8] fate/filter-refcmp-*: make refcmp_metadata fail on empty input
On 30/03/2022 22:31, Marton Balint wrote: On empty input the awk script was always successful which caused the filter-refcmp tests to always succeed. Also fix the command lines for refcmp_metadata compare function because it needs auto conversion filters, and update reference of test filter-refcmp-psnr-rgb because it was missed in a7fc78c1a638a32c3695c06f727774c740d675c2 but was never noticed due to the original issue... Signed-off-by: Marton Balint --- tests/fate-run.sh | 2 +- tests/ref/fate/filter-refcmp-psnr-rgb | 80 +-- tests/refcmp-metadata.awk | 3 + 3 files changed, 44 insertions(+), 41 deletions(-) diff --git a/tests/fate-run.sh b/tests/fate-run.sh index fbfc0a925d..5e8d607d88 100755 --- a/tests/fate-run.sh +++ b/tests/fate-run.sh @@ -377,7 +377,7 @@ refcmp_metadata(){ refcmp=$1 pixfmt=$2 fuzz=${3:-0.001} -ffmpeg $FLAGS $ENC_OPTS \ +ffmpeg -auto_conversion_filters $FLAGS $ENC_OPTS \ -lavfi "testsrc2=size=300x200:rate=1:duration=5,format=${pixfmt},split[ref][tmp];[tmp]avgblur=4[enc];[enc][ref]${refcmp},metadata=print:file=-" \ -f null /dev/null | awk -v ref=${ref} -v fuzz=${fuzz} -f ${base}/refcmp-metadata.awk - } diff --git a/tests/ref/fate/filter-refcmp-psnr-rgb b/tests/ref/fate/filter-refcmp-psnr-rgb index f06db575ac..20abd3dc5a 100644 --- a/tests/ref/fate/filter-refcmp-psnr-rgb +++ b/tests/ref/fate/filter-refcmp-psnr-rgb @@ -1,45 +1,45 @@ frame:0pts:0 pts_time:0 -lavfi.psnr.mse.r=1381.80 -lavfi.psnr.psnr.r=16.73 -lavfi.psnr.mse.g=896.00 -lavfi.psnr.psnr.g=18.61 -lavfi.psnr.mse.b=277.38 -lavfi.psnr.psnr.b=23.70 -lavfi.psnr.mse_avg=851.73 -lavfi.psnr.psnr_avg=18.83 +lavfi.psnr.mse.r=1367.642090 +lavfi.psnr.psnr.r=16.771078 +lavfi.psnr.mse.g=885.804382 +lavfi.psnr.psnr.g=18.657425 +lavfi.psnr.mse.b=274.825073 +lavfi.psnr.psnr.b=23.740240 +lavfi.psnr.mse_avg=842.757202 +lavfi.psnr.psnr_avg=18.873779 frame:1pts:1 pts_time:1 -lavfi.psnr.mse.r=1380.37 -lavfi.psnr.psnr.r=16.73 -lavfi.psnr.mse.g=975.91 -lavfi.psnr.psnr.g=18.24 -lavfi.psnr.mse.b=435.72 
-lavfi.psnr.psnr.b=21.74 -lavfi.psnr.mse_avg=930.67 -lavfi.psnr.psnr_avg=18.44 +lavfi.psnr.mse.r=1356.681152 +lavfi.psnr.psnr.r=16.806026 +lavfi.psnr.mse.g=958.161560 +lavfi.psnr.psnr.g=18.316416 +lavfi.psnr.mse.b=428.238312 +lavfi.psnr.psnr.b=21.813948 +lavfi.psnr.mse_avg=914.360352 +lavfi.psnr.psnr_avg=18.519630 frame:2pts:2 pts_time:2 -lavfi.psnr.mse.r=1403.20 -lavfi.psnr.psnr.r=16.66 -lavfi.psnr.mse.g=954.05 -lavfi.psnr.psnr.g=18.34 -lavfi.psnr.mse.b=494.22 -lavfi.psnr.psnr.b=21.19 -lavfi.psnr.mse_avg=950.49 -lavfi.psnr.psnr_avg=18.35 +lavfi.psnr.mse.r=1387.254883 +lavfi.psnr.psnr.r=16.709242 +lavfi.psnr.mse.g=939.230957 +lavfi.psnr.psnr.g=18.403080 +lavfi.psnr.mse.b=493.913757 +lavfi.psnr.psnr.b=21.194292 +lavfi.psnr.mse_avg=940.133179 +lavfi.psnr.psnr_avg=18.398911 frame:3pts:3 pts_time:3 -lavfi.psnr.mse.r=1452.80 -lavfi.psnr.psnr.r=16.51 -lavfi.psnr.mse.g=1001.02 -lavfi.psnr.psnr.g=18.13 -lavfi.psnr.mse.b=557.39 -lavfi.psnr.psnr.b=20.67 -lavfi.psnr.mse_avg=1003.74 -lavfi.psnr.psnr_avg=18.11 +lavfi.psnr.mse.r=1433.291260 +lavfi.psnr.psnr.r=16.567459 +lavfi.psnr.mse.g=990.005859 +lavfi.psnr.psnr.g=18.174425 +lavfi.psnr.mse.b=550.512329 +lavfi.psnr.psnr.b=20.723133 +lavfi.psnr.mse_avg=991.269836 +lavfi.psnr.psnr_avg=18.168884 frame:4pts:4 pts_time:4 -lavfi.psnr.mse.r=1401.25 -lavfi.psnr.psnr.r=16.67 -lavfi.psnr.mse.g=1009.80 -lavfi.psnr.psnr.g=18.09 -lavfi.psnr.mse.b=602.42 -lavfi.psnr.psnr.b=20.33 -lavfi.psnr.mse_avg=1004.49 -lavfi.psnr.psnr_avg=18.11 +lavfi.psnr.mse.r=1385.949341 +lavfi.psnr.psnr.r=16.713329 +lavfi.psnr.mse.g=997.065796 +lavfi.psnr.psnr.g=18.143566 +lavfi.psnr.mse.b=601.962952 +lavfi.psnr.psnr.b=20.335106 +lavfi.psnr.mse_avg=994.992676 +lavfi.psnr.psnr_avg=18.152605 diff --git a/tests/refcmp-metadata.awk b/tests/refcmp-metadata.awk index fa21aad0e0..e7ed5ae809 100644 --- a/tests/refcmp-metadata.awk +++ b/tests/refcmp-metadata.awk @@ -50,12 +50,15 @@ BEGIN { } END { +result = result && (NR != 0); Checking for "NR == ref_nr" would additionally 
catch truncated input.

     if (result) {
         for (i = 1; i <= ref_nr; i++)
             print ref_lines[i];
     } else {
         for (i = 1; i <= NR; i++)
             print cmp_lines[i];
+        if (NR == 0)
+            print "[refcmp] no input";

Output should go to stderr here.

         if (NR != ref_nr)
             print "[refcmp] lines: " NR " != " ref_nr > "/dev/stderr";

Maybe add an "else" before the "if" to avoid that both lines are printed for empty input.

         if (delta_max >= fuzz)

Otherwise looks good to me. Thanks for catching the issue!

Regards,
Tobias
Re: [FFmpeg-devel] [PATCH 1/2] avutil/hwcontext_videotoolbox: create real buffer pool
Ping for patch 1/2. rcombs has reviewed the patch on IRC. I decided to drop patch 2/2. > 11:05 rcombs: quink_: seems reasonable to me > 11:06 quink_: rcombs: thanks : ) > 11:06 rcombs: not entirely sure what the deal with the second commit is but > ¯\_(ツ)_/¯ it's harmless so w/e > On Mar 10, 2022, at 12:37 PM, Zhao Zhili wrote: > > vt_get_buffer shouldn't do buffer pool's job. > --- > libavutil/hwcontext_videotoolbox.c | 71 ++ > 1 file changed, 34 insertions(+), 37 deletions(-) > > diff --git a/libavutil/hwcontext_videotoolbox.c > b/libavutil/hwcontext_videotoolbox.c > index 026127d412..e442a95007 100644 > --- a/libavutil/hwcontext_videotoolbox.c > +++ b/libavutil/hwcontext_videotoolbox.c > @@ -210,9 +210,36 @@ static int vt_pool_alloc(AVHWFramesContext *ctx) > return AVERROR_EXTERNAL; > } > > -static AVBufferRef *vt_dummy_pool_alloc(void *opaque, size_t size) > +static void videotoolbox_buffer_release(void *opaque, uint8_t *data) > +{ > +CVPixelBufferRelease((CVPixelBufferRef)data); > +} > + > +static AVBufferRef *vt_pool_alloc_buffer(void *opaque, size_t size) > { > -return NULL; > +CVPixelBufferRef pixbuf; > +AVBufferRef *buf; > +CVReturn err; > +AVHWFramesContext *ctx = opaque; > +VTFramesContext *fctx = ctx->internal->priv; > + > +err = CVPixelBufferPoolCreatePixelBuffer( > +NULL, > +fctx->pool, > +&pixbuf > +); > +if (err != kCVReturnSuccess) { > +av_log(ctx, AV_LOG_ERROR, "Failed to create pixel buffer from pool: > %d\n", err); > +return NULL; > +} > + > +buf = av_buffer_create((uint8_t *)pixbuf, size, > + videotoolbox_buffer_release, NULL, 0); > +if (!buf) { > +CVPixelBufferRelease(pixbuf); > +return NULL; > +} > +return buf; > } > > static void vt_frames_uninit(AVHWFramesContext *ctx) > @@ -238,9 +265,9 @@ static int vt_frames_init(AVHWFramesContext *ctx) > return AVERROR(ENOSYS); > } > > -// create a dummy pool so av_hwframe_get_buffer doesn't EINVAL > if (!ctx->pool) { > -ctx->internal->pool_internal = av_buffer_pool_init2(0, ctx, > vt_dummy_pool_alloc, NULL); 
> +ctx->internal->pool_internal = av_buffer_pool_init2( > +sizeof(CVPixelBufferRef), ctx, vt_pool_alloc_buffer, NULL); > if (!ctx->internal->pool_internal) > return AVERROR(ENOMEM); > } > @@ -252,41 +279,11 @@ static int vt_frames_init(AVHWFramesContext *ctx) > return 0; > } > > -static void videotoolbox_buffer_release(void *opaque, uint8_t *data) > -{ > -CVPixelBufferRelease((CVPixelBufferRef)data); > -} > - > static int vt_get_buffer(AVHWFramesContext *ctx, AVFrame *frame) > { > -VTFramesContext *fctx = ctx->internal->priv; > - > -if (ctx->pool && ctx->pool->size != 0) { > -frame->buf[0] = av_buffer_pool_get(ctx->pool); > -if (!frame->buf[0]) > -return AVERROR(ENOMEM); > -} else { > -CVPixelBufferRef pixbuf; > -AVBufferRef *buf = NULL; > -CVReturn err; > - > -err = CVPixelBufferPoolCreatePixelBuffer( > -NULL, > -fctx->pool, > -&pixbuf > -); > -if (err != kCVReturnSuccess) { > -av_log(ctx, AV_LOG_ERROR, "Failed to create pixel buffer from > pool: %d\n", err); > -return AVERROR_EXTERNAL; > -} > - > -buf = av_buffer_create((uint8_t *)pixbuf, 1, > videotoolbox_buffer_release, NULL, 0); > -if (!buf) { > -CVPixelBufferRelease(pixbuf); > -return AVERROR(ENOMEM); > -} > -frame->buf[0] = buf; > -} > +frame->buf[0] = av_buffer_pool_get(ctx->pool); > +if (!frame->buf[0]) > +return AVERROR(ENOMEM); > > frame->data[3] = frame->buf[0]->data; > frame->format = AV_PIX_FMT_VIDEOTOOLBOX; > -- > 2.31.1