Re: [FFmpeg-devel] [PATCH] avfilter/palettegen: export color quantization ratio
Le septidi 7 ventôse, an CCXXIII, Clement Boesch a écrit : +float ratio = (float)nb_out / nb_in; +snprintf(buf, sizeof(buf), %f, ratio); I wonder if both values could be useful instead individually. If so, either set the string to %d/%d or set two metadata keys. If not, use double instead of float, float is very rarely useful. And in this particular case it will be converted to double by the vararg call. +av_log(ctx, AV_LOG_INFO, %d%s colors generated out of %d colors; ratio=%f\n, + s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs, + set_colorquant_ratio_meta(out, s-nb_boxes, s-nb_refs)); This is your code, but I must say that I find returning the double value as a secondary effect of a function called set_something() not very readable. Regards, -- Nicolas George signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] avfilter/palettegen: export color quantization ratio
From: Clément Bœsch clem...@stupeflix.com --- doc/filters.texi| 5 + libavfilter/version.h | 2 +- libavfilter/vf_palettegen.c | 14 -- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/doc/filters.texi b/doc/filters.texi index baef346..0c72145 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -6980,6 +6980,11 @@ the background is static. Default value is @var{full}. @end table +The filter also exports the frame metadata @code{lavfi.color_quant_ratio} +(@code{nb_color_in / nb_color_out}) which you can use to evaluate the degree of +color quantization of the palette. This information is also visible at +@var{info} logging level. + @subsection Examples @itemize diff --git a/libavfilter/version.h b/libavfilter/version.h index 83fd2a5..986d7a7 100644 --- a/libavfilter/version.h +++ b/libavfilter/version.h @@ -31,7 +31,7 @@ #define LIBAVFILTER_VERSION_MAJOR 5 #define LIBAVFILTER_VERSION_MINOR 11 -#define LIBAVFILTER_VERSION_MICRO 101 +#define LIBAVFILTER_VERSION_MICRO 102 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \ LIBAVFILTER_VERSION_MINOR, \ diff --git a/libavfilter/vf_palettegen.c b/libavfilter/vf_palettegen.c index 0e303a9..9a4a701 100644 --- a/libavfilter/vf_palettegen.c +++ b/libavfilter/vf_palettegen.c @@ -275,6 +275,15 @@ static struct color_ref **load_color_refs(const struct hist_node *hist, int nb_r return refs; } +static float set_colorquant_ratio_meta(AVFrame *out, int nb_out, int nb_in) +{ +char buf[32]; +float ratio = (float)nb_out / nb_in; +snprintf(buf, sizeof(buf), %f, ratio); +av_dict_set(out-metadata, lavfi.color_quant_ratio, buf, 0); +return ratio; +} + /** * Main function implementing the Median Cut Algorithm defined by Paul Heckbert * in Color Image Quantization for Frame Buffer Display (1982) @@ -363,8 +372,9 @@ static AVFrame *get_palette_frame(AVFilterContext *ctx) box = box_id = 0 ? s-boxes[box_id] : NULL; } -av_log(ctx, AV_LOG_DEBUG, %d%s boxes generated out of %d colors\n, - s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs); +av_log(ctx, AV_LOG_INFO, %d%s colors generated out of %d colors; ratio=%f\n, + s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs, + set_colorquant_ratio_meta(out, s-nb_boxes, s-nb_refs)); qsort(s-boxes, s-nb_boxes, sizeof(*s-boxes), cmp_color); -- 2.3.0 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] lavfi: add erosion dilation filter
Signed-off-by: Paul B Mahol one...@gmail.com --- doc/filters.texi | 34 ++ libavfilter/Makefile | 2 + libavfilter/allfilters.c | 2 + libavfilter/vf_neighbor.c | 289 ++ 4 files changed, 327 insertions(+) create mode 100644 libavfilter/vf_neighbor.c diff --git a/doc/filters.texi b/doc/filters.texi index baef346..13ba797 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -3728,6 +3728,23 @@ FFmpeg was configured with @code{--enable-opencl}. Default value is 0. @end table +@section dilation + +Apply dilation effect to the video. + +This filter replaces the pixel by the local(3x3) maximum. + +It accepts the following parameters: + +@table @option +@item threshold +Allows to limit the maximum change, default is 65535. + +@item coordinates +Flag which specifies the pixel to refer to. Default is 255 ie. all eight +pixels are used. +@end table + @section drawbox Draw a colored box on the input image. @@ -4437,6 +4454,23 @@ value. @end table +@section erosion + +Apply erosion effect to the video. + +This filter replaces the pixel by the local(3x3) minimum. + +It accepts the following parameters: + +@table @option +@item threshold +Allows to limit the maximum change, default is 65535. + +@item coordinates +Flag which specifies the pixel to refer to. Default is 255 ie. all eight +pixels are used. +@end table + @section extractplanes Extract color channel components from input video stream into diff --git a/libavfilter/Makefile b/libavfilter/Makefile index 289c63b..97a38a8 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -112,12 +112,14 @@ OBJS-$(CONFIG_DECIMATE_FILTER) += vf_decimate.o OBJS-$(CONFIG_DEJUDDER_FILTER) += vf_dejudder.o OBJS-$(CONFIG_DELOGO_FILTER) += vf_delogo.o OBJS-$(CONFIG_DESHAKE_FILTER)+= vf_deshake.o +OBJS-$(CONFIG_DILATION_FILTER) += vf_neighbor.o OBJS-$(CONFIG_DRAWBOX_FILTER)+= vf_drawbox.o OBJS-$(CONFIG_DRAWGRID_FILTER) += vf_drawbox.o OBJS-$(CONFIG_DRAWTEXT_FILTER) += vf_drawtext.o OBJS-$(CONFIG_ELBG_FILTER) += vf_elbg.o OBJS-$(CONFIG_EDGEDETECT_FILTER) += vf_edgedetect.o OBJS-$(CONFIG_EQ_FILTER) += vf_eq.o +OBJS-$(CONFIG_EROSION_FILTER)+= vf_neighbor.o OBJS-$(CONFIG_EXTRACTPLANES_FILTER) += vf_extractplanes.o OBJS-$(CONFIG_FADE_FILTER) += vf_fade.o OBJS-$(CONFIG_FIELD_FILTER) += vf_field.o diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c index 55de154..060744a 100644 --- a/libavfilter/allfilters.c +++ b/libavfilter/allfilters.c @@ -128,12 +128,14 @@ void avfilter_register_all(void) REGISTER_FILTER(DEJUDDER, dejudder, vf); REGISTER_FILTER(DELOGO, delogo, vf); REGISTER_FILTER(DESHAKE,deshake,vf); +REGISTER_FILTER(DILATION, dilation, vf); REGISTER_FILTER(DRAWBOX,drawbox,vf); REGISTER_FILTER(DRAWGRID, drawgrid, vf); REGISTER_FILTER(DRAWTEXT, drawtext, vf); REGISTER_FILTER(EDGEDETECT, edgedetect, vf); REGISTER_FILTER(ELBG, elbg, vf); REGISTER_FILTER(EQ, eq, vf); +REGISTER_FILTER(EROSION,erosion,vf); REGISTER_FILTER(EXTRACTPLANES, extractplanes, vf); REGISTER_FILTER(FADE, fade, vf); REGISTER_FILTER(FIELD, field, vf); diff --git a/libavfilter/vf_neighbor.c b/libavfilter/vf_neighbor.c new file mode 100644 index 000..1a923dd --- /dev/null +++ b/libavfilter/vf_neighbor.c @@ -0,0 +1,289 @@ +/* + * Copyright (c) 2012-2013 Oka Motofumi (chikuzen.mo at gmail dot com) + * Copyright (c) 2015 Paul B Mahol + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include libavutil/imgutils.h +#include libavutil/pixdesc.h +#include libavutil/opt.h +#include avfilter.h +#include formats.h +#include internal.h +#include video.h + +typedef struct EDContext { +const AVClass *class; +int planeheight[4]; +int linesize[4]; +int nb_planes; +int
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote: Signed-off-by: James Almer jamr...@gmail.com --- I decided to go the configure route since other features (cmov, clz) also do it , but if prefered this could instead be done with a new intmath.h header in the x86/ folder containing something like #if defined(__GNUC__) defined(__POPCNT__) #define av_popcount __builtin_popcount #if ARCH_X86_64 #define av_popcount64 __builtin_popcountll #endif #endif For a cleaner compile time check. configure | 12 ++-- libavutil/intmath.h | 13 + 2 files changed, 23 insertions(+), 2 deletions(-) For the record, the builtin implementation looks like this here: av_popcount_c: 0: 89 f8 mov%edi,%eax 2: d1 e8 shr%eax 4: 25 55 55 55 55 and$0x,%eax 9: 29 c7 sub%eax,%edi b: 89 fa mov%edi,%edx d: c1 ef 02shr$0x2,%edi 10: 81 e2 33 33 33 33 and$0x,%edx 16: 81 e7 33 33 33 33 and$0x,%edi 1c: 8d 04 17lea(%rdi,%rdx,1),%eax 1f: 89 c2 mov%eax,%edx 21: c1 ea 04shr$0x4,%edx 24: 01 d0 add%edx,%eax 26: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 2b: 89 c2 mov%eax,%edx 2d: c1 ea 08shr$0x8,%edx 30: 01 d0 add%edx,%eax 32: 89 c2 mov%eax,%edx 34: c1 ea 10shr$0x10,%edx 37: 01 d0 add%edx,%eax 39: 83 e0 3fand$0x3f,%eax 3c: c3 retq 3d: 0f 1f 00nopl (%rax) 0040 popcount_gcc: 40: 48 83 ec 08 sub$0x8,%rsp 44: 89 ff mov%edi,%edi 46: e8 00 00 00 00 callq 4b popcount_gcc+0xb 4b: 48 83 c4 08 add$0x8,%rsp 4f: c3 retq 0040 popcount_clang: 40: 89 f8 mov%edi,%eax 42: d1 e8 shr%eax 44: 25 55 55 55 55 and$0x,%eax 49: 29 c7 sub%eax,%edi 4b: 89 f8 mov%edi,%eax 4d: 25 33 33 33 33 and$0x,%eax 52: c1 ef 02shr$0x2,%edi 55: 81 e7 33 33 33 33 and$0x,%edi 5b: 01 c7 add%eax,%edi 5d: 89 f8 mov%edi,%eax 5f: c1 e8 04shr$0x4,%eax 62: 01 f8 add%edi,%eax 64: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 69: 69 c0 01 01 01 01 imul $0x1010101,%eax,%eax 6f: c1 e8 18shr$0x18,%eax 72: c3 retq We might see relevant optimizations for our reference code. [...] -- Clément B. pgpzj_uiJQRKV.pgp Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] avfilter/palettegen: use AV_QSORT()
From: Clément Bœsch clem...@stupeflix.com This makes the sorting of the colors along an axis (r, g or b) predictible, and thus testable under FATE. The performance is not really an issue here since the function is called only once at the end and will need to sort very small number of entries, so an alternative would be to make the sorting functions (see DECLARE_CMP_FUNC()) fallback on another axis in case of equality. This approach was actually simpler. I don't know if there is any advantage in using a multidimensional sort, but it will affect the final palette one way or another. --- libavfilter/vf_palettegen.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libavfilter/vf_palettegen.c b/libavfilter/vf_palettegen.c index 9a4a701..67c99d9 100644 --- a/libavfilter/vf_palettegen.c +++ b/libavfilter/vf_palettegen.c @@ -25,6 +25,7 @@ #include libavutil/avassert.h #include libavutil/opt.h +#include libavutil/qsort.h #include avfilter.h #include internal.h @@ -351,7 +352,8 @@ static AVFrame *get_palette_frame(AVFilterContext *ctx) /* sort the range by its longest axis if it's not already sorted */ if (box-sorted_by != longest) { -qsort(s-refs[box-start], box-len, sizeof(*s-refs), cmp_funcs[longest]); +cmp_func cmpf = cmp_funcs[longest]; +AV_QSORT(s-refs[box-start], box-len, const struct color_ref *, cmpf); box-sorted_by = longest; } -- 2.3.0 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavfi: add erosion dilation filter
On Wed, Feb 25, 2015 at 02:55:58PM +, Paul B Mahol wrote: Signed-off-by: Paul B Mahol one...@gmail.com --- doc/filters.texi | 34 ++ libavfilter/Makefile | 2 + libavfilter/allfilters.c | 2 + libavfilter/vf_neighbor.c | 289 ++ 4 files changed, 327 insertions(+) create mode 100644 libavfilter/vf_neighbor.c [...] If you put 1 filters in the same file, you need to protect with CONFIG_*_FILTER. -- Clément B. pgpLwMeMhKlvv.pgp Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avfilter/palettegen: export color quantization ratio
On Wed, Feb 25, 2015 at 03:43:23PM +0100, Nicolas George wrote: Le septidi 7 ventôse, an CCXXIII, Clement Boesch a écrit : +float ratio = (float)nb_out / nb_in; +snprintf(buf, sizeof(buf), %f, ratio); I wonder if both values could be useful instead individually. If so, either set the string to %d/%d or set two metadata keys. The numerator is the user input, so only one value is necessary. We could export the number of colors found in the input (nb_in), but this code is all about the quantization process so I think the ratio is actually more relevant. If not, use double instead of float, float is very rarely useful. And in this particular case it will be converted to double by the vararg call. Sure, OK +av_log(ctx, AV_LOG_INFO, %d%s colors generated out of %d colors; ratio=%f\n, + s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs, + set_colorquant_ratio_meta(out, s-nb_boxes, s-nb_refs)); This is your code, but I must say that I find returning the double value as a secondary effect of a function called set_something() not very readable. The alternative was to have the ratio declared on top of the function (far away), or pass the context to set_colorquant_ratio_meta() to make it log as well. This is really a cosmetic issue, maybe I'll just declare ratio on top so it's more readable. -- Clément B. pgpJTidiVe0p6.pgp Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] lavfi: add erosion dilation filter
On 2/25/15, Clement Boesch u...@pkh.me wrote: On Wed, Feb 25, 2015 at 02:55:58PM +, Paul B Mahol wrote: Signed-off-by: Paul B Mahol one...@gmail.com --- doc/filters.texi | 34 ++ libavfilter/Makefile | 2 + libavfilter/allfilters.c | 2 + libavfilter/vf_neighbor.c | 289 ++ 4 files changed, 327 insertions(+) create mode 100644 libavfilter/vf_neighbor.c [...] If you put 1 filters in the same file, you need to protect with CONFIG_*_FILTER. -- Clement B. locally fixed ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] Adding Webvtt in hls muxer
On 2/21/15 8:05 AM, Deron wrote: On 2/15/15 5:44 AM, Anshul wrote: attached another cleaned patch. -Anshul Not sure if I should be posting here, privately, or do the user list since it is an unaccepted patch... This patch applies cleanly to ffmpeg git of that day, and with minor adjustment to current git, but either crashes the same for me right away. Here is the back trace from gdb: Program received signal SIGSEGV, Segmentation fault. __GI___libc_realloc (oldmem=0x70, bytes=8) at malloc.c:2977 2977malloc.c: No such file or directory. (gdb) bt #0 __GI___libc_realloc (oldmem=0x70, bytes=8) at malloc.c:2977 #1 0x77609b99 in avformat_new_stream (s=s@entry=0xe2dbc0, c=c@entry=0x0) at libavformat/utils.c:3655 #2 0x775451c4 in hls_mux_init (s=0x6787c0) at libavformat/hlsenc.c:194 #3 hls_write_header (s=0x6787c0) at libavformat/hlsenc.c:490 #4 0x775999ec in avformat_write_header (s=s@entry=0x6787c0, options=0x6a9948) at libavformat/mux.c:406 #5 0x00424c00 in transcode_init () at ffmpeg.c:3096 #6 0x00407581 in transcode () at ffmpeg.c:3815 #7 main (argc=13, argv=0x7fffe5a8) at ffmpeg.c:4022 The command (I've tried all sorts of combinations. If I don't provide a subtitle stream, it does not crash. Otherwise it does. Unpatched can generate a webvtt from the subtitle stream without crashing.) ffmpeg -loglevel debug -f lavfi -i movie=out.ts\[out0+subcc\] -f hls -hls_segment_filename /var/www/html/stream/kota/v.low.%d.ts -hls_subtitle_path /var/www/html/stream/kota/ -y /var/www/html/stream/kota/v.low.m3u8 I did find the problem. The patch does not properly initialize hls-vtt_oformat (etc) if you provide the -hls_segment_filename parameter as I have done. Deron ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH][RFC] Interpretation of duration field in AVI super index chunk
On 24.02.2015 16:33, Tobias Rapp wrote: Hi, I am currently trying to interpret the index data of HuffYuv/PCM AVI files written by FFmpeg. If the file is larger than 2GiB an AVIX RIFF chunk is added for each 2GiB-block, each of these blocks gets an OpenDML chunk index ix00 and an OpenDML super index chunk indx is written for each stream that contains the file offset of the according chunk indexes together with its size and total duration. For this super index duration entry the documentation I have found states that it shall be the time span in stream ticks [1]. FFmpeg seems to write the chunk count instead which is correct for video streams where one data chunk 00dc corresponds to one frame. For audio streams one data chunk 01wb usually contains multiple samples. I guess for audio streams the duration should be multiplied by the samples-per-chunk factor in avienc.c to allow third-party applications to jump to the correct AVIX RIFF segment based on a given timestamp. Any comments? To illustrate my issue I have attached a patch that makes the OpenDML super index chunk consistent with my interpretation of the specs. A test with VirtualDub 1.10.4 shows that it also writes the audio duration using sample count instead of packet count. Using sample count allows to seek the audio stream to the right RIFF segment without reading through all OpenDML standard ix01 chunks. Please comment, Tobias From 3a8e15598fa3e044f3ef65b5063c633cb4b3afed Mon Sep 17 00:00:00 2001 From: Tobias Rapp t.r...@noa-audio.com Date: Wed, 25 Feb 2015 17:10:13 +0100 Subject: [PATCH] libavformat/avienc: Fix duration for audio stream OpenDML super index Fixes the duration field of the OpenDML super index indx chunk to contain the number of samples instead of the number of packets for (linear/PCM) audio streams. This matches the OpenDML V1.02 standard text which states that the duration field shall contain time span in stream ticks. --- libavformat/avienc.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/libavformat/avienc.c b/libavformat/avienc.c index 53c2fe7..b77b295 100644 --- a/libavformat/avienc.c +++ b/libavformat/avienc.c @@ -49,6 +49,7 @@ typedef struct AVIIentry { typedef struct AVIIndex { int64_t indx_start; +int64_t audio_strm_offset; int entry; int ents_allocated; AVIIentry** cluster; @@ -91,6 +92,7 @@ static int64_t avi_start_new_riff(AVFormatContext *s, AVIOContext *pb, avi-riff_id++; for (i = 0; i s-nb_streams; i++) { AVIStream *avist = s-streams[i]-priv_data; +avist-indexes.audio_strm_offset = avist-audio_strm_length; avist-indexes.entry = 0; } @@ -476,6 +478,7 @@ static int avi_write_ix(AVFormatContext *s) for (i = 0; i s-nb_streams; i++) { AVIStream *avist = s-streams[i]-priv_data; int64_t ix, pos; +int au_byterate, au_ssize, au_scale; avi_stream2fourcc(tag, i, s-streams[i]-codec-codec_type); ix_tag[3] = '0' + i; @@ -511,7 +514,11 @@ static int avi_write_ix(AVFormatContext *s) avio_skip(pb, 16 * avi-riff_id); avio_wl64(pb, ix);/* qwOffset */ avio_wl32(pb, pos - ix); /* dwSize */ -avio_wl32(pb, avist-indexes.entry); /* dwDuration */ +ff_parse_specific_params(s-streams[i], au_byterate, au_ssize, au_scale); +if (au_ssize == 0) +avio_wl32(pb, avist-indexes.entry); /* dwDuration (packet count) */ +else +avio_wl32(pb, (avist-audio_strm_length - avist-indexes.audio_strm_offset) / au_ssize); /* dwDuration (sample count) */ avio_seek(pb, pos, SEEK_SET); } -- 1.9.1 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
Hi James, On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. That's interesting, can you show the code you tried? Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. And I agree it looks odd and maybe way too specific, which is why i said i can add this to a new header in the x86/ folder instead. Patch attached. I don't have clang so i can't test it, nor i know how to check for a version that supports the builtin. From 056a0ec11618bc02ea52eff269dc7bc016dc9897 Mon Sep 17 00:00:00 2001 From: James Almer jamr...@gmail.com Date: Wed, 25 Feb 2015 14:21:02 -0300 Subject: [PATCH] libavutil: add x86 optimized av_popcount Signed-off-by: James Almer jamr...@gmail.com --- libavutil/intmath.h | 3 +++ libavutil/x86/intmath.h | 38 ++ 2 files changed, 41 insertions(+) create mode 100644 libavutil/x86/intmath.h diff --git a/libavutil/intmath.h b/libavutil/intmath.h index fa549c8..f5ecc77 100644 --- a/libavutil/intmath.h +++ b/libavutil/intmath.h @@ -29,6 +29,9 @@ #if ARCH_ARM # include arm/intmath.h #endif +#if ARCH_X86 +# include x86/intmath.h +#endif /** * @addtogroup lavu_internal diff --git a/libavutil/x86/intmath.h b/libavutil/x86/intmath.h new file mode 100644 index 000..2b71cec --- /dev/null +++ b/libavutil/x86/intmath.h @@ -0,0 +1,38 @@ +/* + * Copyright (c) 2015 James Almer + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with FFmpeg; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVUTIL_X86_INTMATH_H +#define AVUTIL_X86_INTMATH_H + +#include stdint.h +#include config.h + +/* Our generic version of av_popcount is faster than GCC's built-in on + * CPUs that don't support the popcnt instruction. + */ +#if defined(__GNUC__) defined(__POPCNT__) +#define av_popcount __builtin_popcount +#if ARCH_X86_64 +#define av_popcount64 __builtin_popcountll +#endif + +#endif /* defined(__GNUC__) defined(__POPCNT__) */ + +#endif /* AVUTIL_X86_INTMATH_H */ -- 2.3.0 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25/02/15 2:36 PM, Ronald S. Bultje wrote: Hi James, On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. That's interesting, can you show the code you tried? Ronald If instead of builtins i use +#if HAVE_INLINE_ASM + +#ifdef __POPCNT__ + +#define av_popcount av_popcount_abm +static av_always_inline av_const int av_popcount_abm(uint32_t a) +{ +int x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} + +#if ARCH_X86_64 +#define av_popcount64 av_popcount64_abm +static av_always_inline av_const int av_popcount64_abm(uint64_t a) +{ +uint64_t x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} +#endif + +#endif /* __POPCNT__ */ + +#endif /* HAVE_INLINE_ASM */ in the x86/ header from the second version i sent, on x86_32 av_cpu_count from libavutil/cpu.o on Windows (Will not work with other OSes because of HAVE_GETPROCESSAFFINITYMASK) generates 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] 1b5: 31 c0 xoreax,eax 1b7: f3 0f b8 c0 popcnt eax,eax 1bb: 01 c3 addebx,eax Whereas with the builtins i instead get 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] This is probably because of av_popcount64_c (Used unconditionally on x86_32) calling av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in x86_32. The builtin seems to know proc_aff can't be right shifted 32 bits so it generates a single popcnt. This is probably a rare case since av_popcount64 is not normally called with a 32bits argument, but the builtin handled it better. If the argument used with av_popcount64 is effectively 64bits then both the builtin and the above asm seem to generate the same code. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25/02/15 12:43 PM, Clément Bœsch wrote: On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote: Signed-off-by: James Almer jamr...@gmail.com --- I decided to go the configure route since other features (cmov, clz) also do it , but if prefered this could instead be done with a new intmath.h header in the x86/ folder containing something like #if defined(__GNUC__) defined(__POPCNT__) #define av_popcount __builtin_popcount #if ARCH_X86_64 #define av_popcount64 __builtin_popcountll #endif #endif For a cleaner compile time check. configure | 12 ++-- libavutil/intmath.h | 13 + 2 files changed, 23 insertions(+), 2 deletions(-) For the record, the builtin implementation looks like this here: av_popcount_c: 0: 89 f8 mov%edi,%eax 2: d1 e8 shr%eax 4: 25 55 55 55 55 and$0x,%eax 9: 29 c7 sub%eax,%edi b: 89 fa mov%edi,%edx d: c1 ef 02shr$0x2,%edi 10: 81 e2 33 33 33 33 and$0x,%edx 16: 81 e7 33 33 33 33 and$0x,%edi 1c: 8d 04 17lea(%rdi,%rdx,1),%eax 1f: 89 c2 mov%eax,%edx 21: c1 ea 04shr$0x4,%edx 24: 01 d0 add%edx,%eax 26: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 2b: 89 c2 mov%eax,%edx 2d: c1 ea 08shr$0x8,%edx 30: 01 d0 add%edx,%eax 32: 89 c2 mov%eax,%edx 34: c1 ea 10shr$0x10,%edx 37: 01 d0 add%edx,%eax 39: 83 e0 3fand$0x3f,%eax 3c: c3 retq 3d: 0f 1f 00nopl (%rax) 0040 popcount_gcc: 40: 48 83 ec 08 sub$0x8,%rsp 44: 89 ff mov%edi,%edi 46: e8 00 00 00 00 callq 4b popcount_gcc+0xb 4b: 48 83 c4 08 add$0x8,%rsp 4f: c3 retq This is the reason i made it only use the builtin if the CPU supports popcnt, because the fallback implementation as seen here is slower than our version above. 0040 popcount_clang: 40: 89 f8 mov%edi,%eax 42: d1 e8 shr%eax 44: 25 55 55 55 55 and$0x,%eax 49: 29 c7 sub%eax,%edi 4b: 89 f8 mov%edi,%eax 4d: 25 33 33 33 33 and$0x,%eax 52: c1 ef 02shr$0x2,%edi 55: 81 e7 33 33 33 33 and$0x,%edi 5b: 01 c7 add%eax,%edi 5d: 89 f8 mov%edi,%eax 5f: c1 e8 04shr$0x4,%eax 62: 01 f8 add%edi,%eax 64: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 69: 69 c0 01 01 01 01 imul $0x1010101,%eax,%eax 6f: c1 e8 18shr$0x18,%eax 72: c3 retq We might see relevant optimizations for our reference code. [...] ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25/02/15 12:43 PM, Clément Bœsch wrote: On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote: Signed-off-by: James Almer jamr...@gmail.com --- I decided to go the configure route since other features (cmov, clz) also do it , but if prefered this could instead be done with a new intmath.h header in the x86/ folder containing something like #if defined(__GNUC__) defined(__POPCNT__) #define av_popcount __builtin_popcount #if ARCH_X86_64 #define av_popcount64 __builtin_popcountll #endif #endif For a cleaner compile time check. configure | 12 ++-- libavutil/intmath.h | 13 + 2 files changed, 23 insertions(+), 2 deletions(-) For the record, the builtin implementation looks like this here: av_popcount_c: 0: 89 f8 mov%edi,%eax 2: d1 e8 shr%eax 4: 25 55 55 55 55 and$0x,%eax 9: 29 c7 sub%eax,%edi b: 89 fa mov%edi,%edx d: c1 ef 02shr$0x2,%edi 10: 81 e2 33 33 33 33 and$0x,%edx 16: 81 e7 33 33 33 33 and$0x,%edi 1c: 8d 04 17lea(%rdi,%rdx,1),%eax 1f: 89 c2 mov%eax,%edx 21: c1 ea 04shr$0x4,%edx 24: 01 d0 add%edx,%eax 26: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 2b: 89 c2 mov%eax,%edx 2d: c1 ea 08shr$0x8,%edx 30: 01 d0 add%edx,%eax 32: 89 c2 mov%eax,%edx 34: c1 ea 10shr$0x10,%edx 37: 01 d0 add%edx,%eax 39: 83 e0 3fand$0x3f,%eax 3c: c3 retq 3d: 0f 1f 00nopl (%rax) 0040 popcount_gcc: 40: 48 83 ec 08 sub$0x8,%rsp 44: 89 ff mov%edi,%edi 46: e8 00 00 00 00 callq 4b popcount_gcc+0xb 4b: 48 83 c4 08 add$0x8,%rsp 4f: c3 retq 0040 popcount_clang: 40: 89 f8 mov%edi,%eax 42: d1 e8 shr%eax 44: 25 55 55 55 55 and$0x,%eax 49: 29 c7 sub%eax,%edi 4b: 89 f8 mov%edi,%eax 4d: 25 33 33 33 33 and$0x,%eax 52: c1 ef 02shr$0x2,%edi 55: 81 e7 33 33 33 33 and$0x,%edi 5b: 01 c7 add%eax,%edi 5d: 89 f8 mov%edi,%eax 5f: c1 e8 04shr$0x4,%eax 62: 01 f8 add%edi,%eax 64: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 69: 69 c0 01 01 01 01 imul $0x1010101,%eax,%eax 6f: c1 e8 18shr$0x18,%eax 72: c3 retq We might see relevant optimizations for our reference code. What's clang code for av_popcount64_c, or their builtin? We're currently calling av_popcount_c twice from within av_popcount64_c, when on x86_64 cpus we could probably take advantage of the 64bits gprs. [...] ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
Hi, On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 2:36 PM, Ronald S. Bultje wrote: Hi James, On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. That's interesting, can you show the code you tried? Ronald If instead of builtins i use +#if HAVE_INLINE_ASM + +#ifdef __POPCNT__ + +#define av_popcount av_popcount_abm +static av_always_inline av_const int av_popcount_abm(uint32_t a) +{ +int x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} + +#if ARCH_X86_64 +#define av_popcount64 av_popcount64_abm +static av_always_inline av_const int av_popcount64_abm(uint64_t a) +{ +uint64_t x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} +#endif + +#endif /* __POPCNT__ */ + +#endif /* HAVE_INLINE_ASM */ in the x86/ header from the second version i sent, on x86_32 av_cpu_count from libavutil/cpu.o on Windows (Will not work with other OSes because of HAVE_GETPROCESSAFFINITYMASK) generates 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] 1b5: 31 c0 xoreax,eax 1b7: f3 0f b8 c0 popcnt eax,eax 1bb: 01 c3 addebx,eax Whereas with the builtins i instead get 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] This is probably because of av_popcount64_c (Used unconditionally on x86_32) calling av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in x86_32. The builtin seems to know proc_aff can't be right shifted 32 bits so it generates a single popcnt. It seems to me rather than the second version is defined as pure (or whatever the keyword was), i.e. we don't depend on external state - if you call this with a constant, we can calculate this at compile-time. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avfilter/palettegen: use AV_QSORT()
On Wed, Feb 25, 2015 at 04:10:44PM +0100, Clément Bœsch wrote: From: Clément Bœsch clem...@stupeflix.com This makes the sorting of the colors along an axis (r, g or b) predictible, and thus testable under FATE. The performance is not really an issue here since the function is called only once at the end and will need to sort very small number of entries, so an alternative would be to make the sorting functions (see DECLARE_CMP_FUNC()) fallback on another axis in case of equality. This approach was actually simpler. I don't know if there is any advantage in using a multidimensional sort, but it will affect the final palette one way or another. --- libavfilter/vf_palettegen.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) LGTM [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Into a blind darkness they enter who follow after the Ignorance, they as if into a greater darkness enter who devote themselves to the Knowledge alone. -- Isha Upanishad signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
Hi, On Wed, Feb 25, 2015 at 1:21 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 2:58 PM, Ronald S. Bultje wrote: Hi, On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 2:36 PM, Ronald S. Bultje wrote: Hi James, On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. That's interesting, can you show the code you tried? Ronald If instead of builtins i use +#if HAVE_INLINE_ASM + +#ifdef __POPCNT__ + +#define av_popcount av_popcount_abm +static av_always_inline av_const int av_popcount_abm(uint32_t a) +{ +int x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} + +#if ARCH_X86_64 +#define av_popcount64 av_popcount64_abm +static av_always_inline av_const int av_popcount64_abm(uint64_t a) +{ +uint64_t x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} +#endif + +#endif /* __POPCNT__ */ + +#endif /* HAVE_INLINE_ASM */ in the x86/ header from the second version i sent, on x86_32 av_cpu_count from libavutil/cpu.o on Windows (Will not work with other OSes because of HAVE_GETPROCESSAFFINITYMASK) generates 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] 1b5: 31 c0 xoreax,eax 1b7: f3 0f b8 c0 popcnt eax,eax 1bb: 01 c3 addebx,eax Whereas with the builtins i instead get 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] This is probably because of av_popcount64_c (Used unconditionally on x86_32) calling av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in x86_32. The builtin seems to know proc_aff can't be right shifted 32 bits so it generates a single popcnt. It seems to me rather than the second version is defined as pure (or whatever the keyword was), i.e. we don't depend on external state - if you call this with a constant, we can calculate this at compile-time. Ronald Do you know how to get the asm version working right for this one case? No, const should do it but looks like inline asm breaks that assumption. Otherwise i think the second version with builtins i sent is good to apply. Much cleaner than the configure approach to be honest. OK. Ronald ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25/02/15 2:58 PM, Ronald S. Bultje wrote: Hi, On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 2:36 PM, Ronald S. Bultje wrote: Hi James, On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. That's interesting, can you show the code you tried? Ronald If instead of builtins i use +#if HAVE_INLINE_ASM + +#ifdef __POPCNT__ + +#define av_popcount av_popcount_abm +static av_always_inline av_const int av_popcount_abm(uint32_t a) +{ +int x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} + +#if ARCH_X86_64 +#define av_popcount64 av_popcount64_abm +static av_always_inline av_const int av_popcount64_abm(uint64_t a) +{ +uint64_t x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} +#endif + +#endif /* __POPCNT__ */ + +#endif /* HAVE_INLINE_ASM */ in the x86/ header from the second version i sent, on x86_32 av_cpu_count from libavutil/cpu.o on Windows (Will not work with other OSes because of HAVE_GETPROCESSAFFINITYMASK) generates 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] 1b5: 31 c0 xoreax,eax 1b7: f3 0f b8 c0 popcnt eax,eax 1bb: 01 c3 addebx,eax Whereas with the builtins i instead get 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] This is probably because of av_popcount64_c (Used unconditionally on x86_32) calling av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in x86_32. The builtin seems to know proc_aff can't be right shifted 32 bits so it generates a single popcnt. It seems to me rather than the second version is defined as pure (or whatever the keyword was), i.e. we don't depend on external state - if you call this with a constant, we can calculate this at compile-time. Ronald Do you know how to get the asm version working right for this one case? Otherwise i think the second version with builtins i sent is good to apply. Much cleaner than the configure approach to be honest. Someone else can change it to add the relevant clang check/s because i can't test that. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.
On Wed, Feb 25, 2015 at 12:01:26PM +0100, Nicolas George wrote: Signed-off-by: Nicolas George geo...@nsup.org --- doc/ffmpeg.texi | 6 ++ ffmpeg.c| 2 +- ffmpeg.h| 2 ++ ffmpeg_opt.c| 4 4 files changed, 13 insertions(+), 1 deletion(-) without pthreads, this seems to fail to build ffmpeg/ffmpeg_opt.c: In function ‘open_input_file’: ffmpeg/ffmpeg_opt.c:954:6: error: ‘InputFile’ has no member named ‘thread_queue_size’ make: *** [ffmpeg_opt.o] Error 1 [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB In a rich man's house there is no place to spit but his face. -- Diogenes of Sinope signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.
Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit : without pthreads, this seems to fail to build Forgot to test that, sorry. Fixed. By the way, what are the remaining supported cases where threads are not available? Regards, -- Nicolas George From 16fc0adda7fa35b69053205abf9dd61a3ebe4b03 Mon Sep 17 00:00:00 2001 From: Nicolas George geo...@nsup.org Date: Wed, 25 Feb 2015 11:50:44 +0100 Subject: [PATCH 2/3] ffmpeg: allow to set the thread message queue size. Signed-off-by: Nicolas George geo...@nsup.org --- doc/ffmpeg.texi | 6 ++ ffmpeg.c| 2 +- ffmpeg.h| 2 ++ ffmpeg_opt.c| 6 ++ 4 files changed, 15 insertions(+), 1 deletion(-) diff --git a/doc/ffmpeg.texi b/doc/ffmpeg.texi index 6e51c44..6772f2f 100644 --- a/doc/ffmpeg.texi +++ b/doc/ffmpeg.texi @@ -1158,6 +1158,12 @@ This option enables or disables accurate seeking in input files with the transcoding. Use @option{-noaccurate_seek} to disable it, which may be useful e.g. when copying some streams and transcoding the others. +@item -thread_message_queue @var{size} (@emph{input}) +This option sets the maximum number of queued packets when reading from the +file or device. With low latency / high rate live streams, packets may be +discarded if they are not read in a timely manner; raising this value can +avoid it. + @item -override_ffserver (@emph{global}) Overrides the input specifications from @command{ffserver}. Using this option you can map any input stream to @command{ffserver} and control diff --git a/ffmpeg.c b/ffmpeg.c index becd5df..54ba0cb 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -3420,7 +3420,7 @@ static int init_input_threads(void) strcmp(f-ctx-iformat-name, lavfi)) f-non_blocking = 1; ret = av_thread_message_queue_alloc(f-in_thread_queue, -8, sizeof(AVPacket)); +f-thread_queue_size, sizeof(AVPacket)); if (ret 0) return ret; diff --git a/ffmpeg.h b/ffmpeg.h index d2e0c5c..7c0c22c 100644 --- a/ffmpeg.h +++ b/ffmpeg.h @@ -111,6 +111,7 @@ typedef struct OptionsContext { int64_t input_ts_offset; int rate_emu; int accurate_seek; +int thread_queue_size; SpecifierOpt *ts_scale; intnb_ts_scale; @@ -350,6 +351,7 @@ typedef struct InputFile { pthread_t thread; /* thread reading from this file */ int non_blocking; /* reading packets from the thread should not block */ int joined; /* the thread has been joined */ +int thread_queue_size; /* maximum number of queued packets */ #endif } InputFile; diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c index 09e6e33..39c5f49 100644 --- a/ffmpeg_opt.c +++ b/ffmpeg_opt.c @@ -951,6 +951,9 @@ static int open_input_file(OptionsContext *o, const char *filename) f-nb_streams = ic-nb_streams; f-rate_emu = o-rate_emu; f-accurate_seek = o-accurate_seek; +#if HAVE_PTHREADS +f-thread_queue_size = o-thread_queue_size 0 ? o-thread_queue_size : 8; +#endif /* check if all codec options have been used */ unused_opts = strip_specifiers(o-g-codec_opts); @@ -2957,6 +2960,9 @@ const OptionDef options[] = { { disposition,OPT_STRING | HAS_ARG | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(disposition) }, disposition, }, +{ thread_queue_size, HAS_ARG | OPT_INT | OPT_OFFSET | OPT_EXPERT | OPT_INPUT, + { .off = OFFSET(thread_queue_size) }, +set the maximum number of queued packets from the demuxer }, /* video options */ { vframes, OPT_VIDEO | HAS_ARG | OPT_PERFILE | OPT_OUTPUT, { .func_arg = opt_video_frames }, -- 2.1.4 signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25.02.2015, at 19:29, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Wed, Feb 25, 2015 at 1:21 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 2:58 PM, Ronald S. Bultje wrote: Hi, On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 2:36 PM, Ronald S. Bultje wrote: Hi James, On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote: On 25/02/15 9:41 AM, Ronald S. Bultje wrote: Hi, On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote: +#if HAVE_FAST_POPCNT +#if AV_GCC_VERSION_AT_LEAST(4,5) +#ifndef av_popcount +#define av_popcount __builtin_popcount +#endif /* av_popcount */ +#if HAVE_FAST_64BIT +#ifndef av_popcount64 +#define av_popcount64 __builtin_popcountll +#endif /* av_popcount64 */ +#endif /* HAVE_FAST_64BIT */ +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */ +#endif /* HAVE_FAST_POPCNT */ Is this just to get the sse4 popcnt instruction if we compile with -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet configure still does an arch/cpu check. I'd expect the built-in/compiler to do that for us based on -mcpu, and we could always unconditionally use this (as long as gcc = 4.5); alternatively, you could use inline asm and then have the configure check (HAVE_FAST_POPCNT). But doing both seems a little odd. I have no objection to it, patch is still fine, just odd. Ronald I purposely made the checks for gcc 4.5 and in configure for cpus that support popcnt because otherwise __builtin_popcount (at least gcc's) is slower than our generic av_popcount_c function from lavu/common.h. When the CPU supports popcnt the builtin becomes a single inlined instruction. I tried the __asm__ approach, but the code generated by the builtin seemed better. That's interesting, can you show the code you tried? Ronald If instead of builtins i use +#if HAVE_INLINE_ASM + +#ifdef __POPCNT__ + +#define av_popcount av_popcount_abm +static av_always_inline av_const int av_popcount_abm(uint32_t a) +{ +int x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} + +#if ARCH_X86_64 +#define av_popcount64 av_popcount64_abm +static av_always_inline av_const int av_popcount64_abm(uint64_t a) +{ +uint64_t x; + +__asm__ (popcnt %1, %0 : =r (x) : rm (a)); +return x; +} +#endif + +#endif /* __POPCNT__ */ + +#endif /* HAVE_INLINE_ASM */ in the x86/ header from the second version i sent, on x86_32 av_cpu_count from libavutil/cpu.o on Windows (Will not work with other OSes because of HAVE_GETPROCESSAFFINITYMASK) generates 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] 1b5: 31 c0 xoreax,eax 1b7: f3 0f b8 c0 popcnt eax,eax 1bb: 01 c3 addebx,eax Whereas with the builtins i instead get 1af: f3 0f b8 5c 24 18 popcnt ebx,DWORD PTR [esp+0x18] This is probably because of av_popcount64_c (Used unconditionally on x86_32) calling av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in x86_32. The builtin seems to know proc_aff can't be right shifted 32 bits so it generates a single popcnt. It seems to me rather than the second version is defined as pure (or whatever the keyword was), i.e. we don't depend on external state - if you call this with a constant, we can calculate this at compile-time. Ronald Do you know how to get the asm version working right for this one case? No, const should do it but looks like inline asm breaks that assumption. There are gcc builtins to check if something is a constant and use e.g. the c version then. Not sure if it works here or makes sense, just in principle. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.
On Wed, Feb 25, 2015 at 08:00:03PM +0100, Nicolas George wrote: Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit : without pthreads, this seems to fail to build Forgot to test that, sorry. Fixed. By the way, what are the remaining supported cases where threads are not available? i think the case i tested had HAVE_W32THREADS set also theres HAVE_OS2THREADS [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB No snowflake in an avalanche ever feels responsible. -- Voltaire signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.
On Wed, 25 Feb 2015 22:24:39 +0100 Michael Niedermayer michae...@gmx.at wrote: On Wed, Feb 25, 2015 at 08:00:03PM +0100, Nicolas George wrote: Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit : without pthreads, this seems to fail to build Forgot to test that, sorry. Fixed. By the way, what are the remaining supported cases where threads are not available? i think the case i tested had HAVE_W32THREADS set also theres HAVE_OS2THREADS [...] Surely OS/2 support is worth of forcing every single developer who touches thread stuff to consider a dead platform that's only used by 2 persons in Korea (?). ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On Wed, Feb 25, 2015 at 02:29:11PM -0300, James Almer wrote: On 25/02/15 12:43 PM, Clément Bœsch wrote: On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote: Signed-off-by: James Almer jamr...@gmail.com --- I decided to go the configure route since other features (cmov, clz) also do it , but if prefered this could instead be done with a new intmath.h header in the x86/ folder containing something like #if defined(__GNUC__) defined(__POPCNT__) #define av_popcount __builtin_popcount #if ARCH_X86_64 #define av_popcount64 __builtin_popcountll #endif #endif For a cleaner compile time check. configure | 12 ++-- libavutil/intmath.h | 13 + 2 files changed, 23 insertions(+), 2 deletions(-) For the record, the builtin implementation looks like this here: av_popcount_c: 0: 89 f8 mov%edi,%eax 2: d1 e8 shr%eax 4: 25 55 55 55 55 and$0x,%eax 9: 29 c7 sub%eax,%edi b: 89 fa mov%edi,%edx d: c1 ef 02shr$0x2,%edi 10: 81 e2 33 33 33 33 and$0x,%edx 16: 81 e7 33 33 33 33 and$0x,%edi 1c: 8d 04 17lea(%rdi,%rdx,1),%eax 1f: 89 c2 mov%eax,%edx 21: c1 ea 04shr$0x4,%edx 24: 01 d0 add%edx,%eax 26: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 2b: 89 c2 mov%eax,%edx 2d: c1 ea 08shr$0x8,%edx 30: 01 d0 add%edx,%eax 32: 89 c2 mov%eax,%edx 34: c1 ea 10shr$0x10,%edx 37: 01 d0 add%edx,%eax 39: 83 e0 3fand$0x3f,%eax 3c: c3 retq 3d: 0f 1f 00nopl (%rax) 0040 popcount_gcc: 40: 48 83 ec 08 sub$0x8,%rsp 44: 89 ff mov%edi,%edi 46: e8 00 00 00 00 callq 4b popcount_gcc+0xb 4b: 48 83 c4 08 add$0x8,%rsp 4f: c3 retq 0040 popcount_clang: 40: 89 f8 mov%edi,%eax 42: d1 e8 shr%eax 44: 25 55 55 55 55 and$0x,%eax 49: 29 c7 sub%eax,%edi 4b: 89 f8 mov%edi,%eax 4d: 25 33 33 33 33 and$0x,%eax 52: c1 ef 02shr$0x2,%edi 55: 81 e7 33 33 33 33 and$0x,%edi 5b: 01 c7 add%eax,%edi 5d: 89 f8 mov%edi,%eax 5f: c1 e8 04shr$0x4,%eax 62: 01 f8 add%edi,%eax 64: 25 0f 0f 0f 0f and$0xf0f0f0f,%eax 69: 69 c0 01 01 01 01 imul $0x1010101,%eax,%eax 6f: c1 e8 18shr$0x18,%eax 72: c3 retq We might see relevant optimizations for our reference code. What's clang code for av_popcount64_c, or their builtin? popcount64_clang: 0: 48 89 f8movrax,rdi 3: 48 d1 e8shrrax,1 6: 48 b9 55 55 55 55 55movabs rcx,0x d: 55 55 55 10: 48 21 c1andrcx,rax 13: 48 29 cfsubrdi,rcx 16: 48 b8 33 33 33 33 33movabs rax,0x 1d: 33 33 33 20: 48 89 f9movrcx,rdi 23: 48 21 c1andrcx,rax 26: 48 c1 ef 02 shrrdi,0x2 2a: 48 21 c7andrdi,rax 2d: 48 01 cfaddrdi,rcx 30: 48 89 f8movrax,rdi 33: 48 c1 e8 04 shrrax,0x4 37: 48 01 f8addrax,rdi 3a: 48 b9 0f 0f 0f 0f 0fmovabs rcx,0xf0f0f0f0f0f0f0f 41: 0f 0f 0f 44: 48 21 c1andrcx,rax 47: 48 b8 01 01 01 01 01movabs rax,0x101010101010101 4e: 01 01 01 51: 48 0f af c1 imul rax,rcx 55: 48 c1 e8 38 shrrax,0x38 59: c3 ret 5a: 66 0f 1f 44 00 00 nopWORD PTR [rax+rax*1+0x0] 0060 av_popcount64_c: 60: 89 f8 moveax,edi 62: d1 e8 shreax,1 64: 25 55 55 55 55 andeax,0x 69: 89 f9 movecx,edi 6b: 29 c1 subecx,eax 6d: 89 c8 moveax,ecx 6f: 25 33 33 33 33 andeax,0x 74: c1 e9 02shrecx,0x2 77: 81 e1 33 33 33 33 andecx,0x 7d: 01 c1
Re: [FFmpeg-devel] [PATCH] avcodec/utils: Allocate dummy codec_frame for video encoders which do not allocate one
On Mon, Feb 23, 2015 at 07:51:30PM +0100, wm4 wrote: On Mon, 23 Feb 2015 19:11:58 +0100 Michael Niedermayer michae...@gmx.at wrote: On Mon, Feb 23, 2015 at 06:49:36PM +0100, Michael Niedermayer wrote: On Mon, Feb 23, 2015 at 06:15:15PM +0100, wm4 wrote: On Mon, 23 Feb 2015 17:19:16 +0100 Michael Niedermayer michae...@gmx.at wrote: This should allow simplifying many encoders Signed-off-by: Michael Niedermayer michae...@gmx.at --- libavcodec/utils.c | 12 1 file changed, 12 insertions(+) diff --git a/libavcodec/utils.c b/libavcodec/utils.c index c5e6300..c6d588a 100644 --- a/libavcodec/utils.c +++ b/libavcodec/utils.c @@ -1695,6 +1695,16 @@ int attribute_align_arg avcodec_open2(AVCodecContext *avctx, const AVCodec *code av_assert0(*(const AVClass **)avctx-priv_data == codec-priv_class); } +if (av_codec_is_encoder(avctx-codec) +avctx-codec_type == AVMEDIA_TYPE_VIDEO +!avctx-coded_frame) { +avctx-coded_frame = av_frame_alloc(); +if (avctx-coded_frame) { +av_assert0(avctx-coded_frame-key_frame); +avctx-coded_frame-pict_type = AV_PICTURE_TYPE_I; +} +} + end: ff_unlock_avcodec(); if (options) { @@ -2852,6 +2862,8 @@ av_cold int avcodec_close(AVCodecContext *avctx) av_freep(avctx-internal-hwaccel_priv_data); av_freep(avctx-internal); +if (av_codec_is_encoder(avctx-codec)) +av_frame_free(avctx-coded_frame); This isn't the same condition. You might free the frame, even if you didn't allocate it. fixed locally by seting a internal variable so the condition is also not duplicated What's the value in allocating the frame anyway? Seems like it increases the memory management pain levels by 200%. if its good or not is entirely for the lib uses to decide, nothing in FFmpeg should need it i just would like to avoid having this allocation duplicated in half the encoders so IMHO encoders which do not export anything in their coded_frame should either be able to leave it NULL or have it handled by some generic code that does not need to be duplicated in each encoder Can you explain more? Why do some encoders need it, and some not? coded_frame is used to export information from encoders, encoders should generally not need it themselfs [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Everything should be made as simple as possible, but not simpler. -- Albert Einstein signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] Socket options for librtmp
On Mon, Feb 23, 2015 at 10:28:29PM -0600, Cary Tetrick wrote: please ping this thread when this happens about the patch itself IIUC a user using old rtmpdump with new ffmpeg would loose the rtmp_buffer_size option. Would it make sense to support the old setsockopt() code for this case ? [...] That's a really good point. So the only real question then is whether to keep it the way it is, or to submit another patch to support the other two options (which are not as important IMO) in the same manner as Brian's patch. thats simply a question of if someone wants to implement and maintain it Either way, it would eliminate the dependency on rtmpdump. What do you think? Thanks, Cary ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB No snowflake in an avalanche ever feels responsible. -- Voltaire signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] avformat/adxdec: check avctx-channels for invalid values
Hi, if avctx-channels is 0 in adx_read_packet, size gets set to 0, av_get_packet sets pkt-data to NULL and then AV_RB16(pkt-data) results in a null pointer dereference. Attached patch fixes this. Best regards, Andreas From 2578976a0a9eec03d168f393795119fd274ee81f Mon Sep 17 00:00:00 2001 From: Andreas Cadhalpun andreas.cadhal...@googlemail.com Date: Wed, 25 Feb 2015 22:55:44 +0100 Subject: [PATCH] avformat/adxdec: check avctx-channels for invalid values This avoids a null pointer dereference of pkt-data. Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com --- libavformat/adxdec.c | 5 + 1 file changed, 5 insertions(+) diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c index ddaa201..24a8a1f 100644 --- a/libavformat/adxdec.c +++ b/libavformat/adxdec.c @@ -40,6 +40,11 @@ static int adx_read_packet(AVFormatContext *s, AVPacket *pkt) AVCodecContext *avctx = s-streams[0]-codec; int ret, size; +if (avctx-channels = 0) { +av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, avctx-channels); +return AVERROR_INVALIDDATA; +} + size = BLOCK_SIZE * avctx-channels; pkt-pos = avio_tell(s-pb); -- 2.1.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avformat/adxdec: check avctx-channels for invalid values
On Wed, Feb 25, 2015 at 11:48:33PM +0100, Andreas Cadhalpun wrote: Hi, if avctx-channels is 0 in adx_read_packet, size gets set to 0, av_get_packet sets pkt-data to NULL and then AV_RB16(pkt-data) results in a null pointer dereference. Attached patch fixes this. Best regards, Andreas adxdec.c |5 + 1 file changed, 5 insertions(+) 7312e6a3be1771c83eac72784496c6fc4692d954 0001-avformat-adxdec-check-avctx-channels-for-invalid-val.patch From 2578976a0a9eec03d168f393795119fd274ee81f Mon Sep 17 00:00:00 2001 From: Andreas Cadhalpun andreas.cadhal...@googlemail.com Date: Wed, 25 Feb 2015 22:55:44 +0100 Subject: [PATCH] avformat/adxdec: check avctx-channels for invalid values This avoids a null pointer dereference of pkt-data. Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com --- libavformat/adxdec.c | 5 + 1 file changed, 5 insertions(+) diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c index ddaa201..24a8a1f 100644 --- a/libavformat/adxdec.c +++ b/libavformat/adxdec.c @@ -40,6 +40,11 @@ static int adx_read_packet(AVFormatContext *s, AVPacket *pkt) AVCodecContext *avctx = s-streams[0]-codec; int ret, size; +if (avctx-channels = 0) { +av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, avctx-channels); +return AVERROR_INVALIDDATA; +} the demuxer should extract the channel value in adx_read_header() and check it there. (if it needs the channels, which it does currently) its not good for demuxing to depend on a decoder/parser setting this value between reading the file header and before demuxing the first packet [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB I know you won't believe me, but the highest form of Human Excellence is to question oneself and others. -- Socrates signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH][RFC] Interpretation of duration field in AVI super index chunk
On Wed, Feb 25, 2015 at 05:35:33PM +0100, Tobias Rapp wrote: On 24.02.2015 16:33, Tobias Rapp wrote: Hi, I am currently trying to interpret the index data of HuffYuv/PCM AVI files written by FFmpeg. If the file is larger than 2GiB an AVIX RIFF chunk is added for each 2GiB-block, each of these blocks gets an OpenDML chunk index ix00 and an OpenDML super index chunk indx is written for each stream that contains the file offset of the according chunk indexes together with its size and total duration. For this super index duration entry the documentation I have found states that it shall be the time span in stream ticks [1]. FFmpeg seems to write the chunk count instead which is correct for video streams where one data chunk 00dc corresponds to one frame. For audio streams one data chunk 01wb usually contains multiple samples. I guess for audio streams the duration should be multiplied by the samples-per-chunk factor in avienc.c to allow third-party applications to jump to the correct AVIX RIFF segment based on a given timestamp. Any comments? To illustrate my issue I have attached a patch that makes the OpenDML super index chunk consistent with my interpretation of the specs. A test with VirtualDub 1.10.4 shows that it also writes the audio duration using sample count instead of packet count. Using sample count allows to seek the audio stream to the right RIFF segment without reading through all OpenDML standard ix01 chunks. Please comment, Tobias avienc.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) 09f8c8250ce5ec1bdad79a1bf280028c9d3af376 0001-libavformat-avienc-Fix-duration-for-audio-stream-Ope.patch From 3a8e15598fa3e044f3ef65b5063c633cb4b3afed Mon Sep 17 00:00:00 2001 From: Tobias Rapp t.r...@noa-audio.com Date: Wed, 25 Feb 2015 17:10:13 +0100 Subject: [PATCH] libavformat/avienc: Fix duration for audio stream OpenDML super index Fixes the duration field of the OpenDML super index indx chunk to contain the number of samples instead of the number of packets for (linear/PCM) audio streams. This matches the OpenDML V1.02 standard text which states that the duration field shall contain time span in stream ticks. --- libavformat/avienc.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/libavformat/avienc.c b/libavformat/avienc.c index 53c2fe7..b77b295 100644 --- a/libavformat/avienc.c +++ b/libavformat/avienc.c @@ -49,6 +49,7 @@ typedef struct AVIIentry { typedef struct AVIIndex { int64_t indx_start; +int64_t audio_strm_offset; int entry; int ents_allocated; AVIIentry** cluster; @@ -91,6 +92,7 @@ static int64_t avi_start_new_riff(AVFormatContext *s, AVIOContext *pb, avi-riff_id++; for (i = 0; i s-nb_streams; i++) { AVIStream *avist = s-streams[i]-priv_data; +avist-indexes.audio_strm_offset = avist-audio_strm_length; avist-indexes.entry = 0; } @@ -476,6 +478,7 @@ static int avi_write_ix(AVFormatContext *s) for (i = 0; i s-nb_streams; i++) { AVIStream *avist = s-streams[i]-priv_data; int64_t ix, pos; +int au_byterate, au_ssize, au_scale; avi_stream2fourcc(tag, i, s-streams[i]-codec-codec_type); ix_tag[3] = '0' + i; @@ -511,7 +514,11 @@ static int avi_write_ix(AVFormatContext *s) avio_skip(pb, 16 * avi-riff_id); avio_wl64(pb, ix);/* qwOffset */ avio_wl32(pb, pos - ix); /* dwSize */ -avio_wl32(pb, avist-indexes.entry); /* dwDuration */ +ff_parse_specific_params(s-streams[i], au_byterate, au_ssize, au_scale); +if (au_ssize == 0) +avio_wl32(pb, avist-indexes.entry); /* dwDuration (packet count) */ +else +avio_wl32(pb, (avist-audio_strm_length - avist-indexes.audio_strm_offset) / au_ssize); /* dwDuration (sample count) */ the rounding here is wrong the index to be useable needs to have segments duration summable as is this would lead to significangt errors if many of these durations would be summed up something like avist-audio_strm_length / au_ssize - avist-indexes.audio_strm_offset / au_ssize might avoid that but ive not tested it [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Many things microsoft did are stupid, but not doing something just because microsoft did it is even more stupid. If everything ms did were stupid they would be bankrupt already. signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/3] avformat/dss: correct sample rate
On Tue, Feb 24, 2015 at 03:22:37PM +0100, Michael Niedermayer wrote: Signed-off-by: Michael Niedermayer michae...@gmx.at --- libavformat/dss.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) applied [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB When the tyrant has disposed of foreign enemies by conquest or treaty, and there is nothing more to fear from them, then he is always stirring up some war or other, in order that the people may require a leader. -- Plato signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH] libavfilter/thumbnail filter, copy buffers instead of reference
This fixes the buffer handling in the thumbnail filter, it seems that the referencing buffers it does isn't safe (could be hundreds of them held in ref) and the main ffmpeg buffer handling can fail. It now just grabs the video buffers and copies them, frees the frame immediately, and works just the same now, and doesn't use 1.7 Gig of memory in my tests with 300 buffers, it uses around 100 meg max now. There was obviously something nasty in how it was doing this. diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c index 1883154..82b8ebe 100644 --- a/libavfilter/vf_thumbnail.c +++ b/libavfilter/vf_thumbnail.c @@ -137,9 +137,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) int *hist = thumb-frames[thumb-n].histogram; const uint8_t *p = frame-data[0]; -// keep a reference of each frame -thumb-frames[thumb-n].buf = frame; - // update current frame RGB histogram for (j = 0; j inlink-h; j++) { for (i = 0; i inlink-w; i++) { @@ -150,6 +147,17 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) p += frame-linesize[0]; } +// Copy orginal frame into buffer(s) +thumb-frames[thumb-n].buf = ff_get_video_buffer(inlink, inlink-w, inlink-h); +if (!thumb-frames[thumb-n].buf) { +av_frame_free(frame); +return AVERROR(ENOMEM); +} +av_frame_copy_props(thumb-frames[thumb-n].buf, frame); + +// Free original frame we have copied +av_frame_free(frame); + // no selection until the buffer of N frames is filled up thumb-n++; if (thumb-n thumb-n_frames) I previously sent a patch which was wrong, and just avoided the issue for certain media but hit it on others, this should fix it for good now (since it also doesn't alter the chosen thumbnail that results)... Thanks, Chris --- Chris Kennedy Video Engineer CrunchyRoll - http://www.crunchyroll.com -- --- Chris Kennedy diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c index 1883154..a1272a0 100644 --- a/libavfilter/vf_thumbnail.c +++ b/libavfilter/vf_thumbnail.c @@ -142,7 +142,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) // update current frame RGB histogram for (j = 0; j inlink-h; j++) { -for (i = 0; i inlink-w; i++) { +// last third of image, walk every 3 bytes/pixels reading RGB +for (i = 0; i inlink-w/3; i++) { hist[0*256 + p[i*3]]++; hist[1*256 + p[i*3 + 1]]++; hist[2*256 + p[i*3 + 2]]++; ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avformat/dss: set packet duration
On Tue, Feb 24, 2015 at 04:01:22PM +0100, Michael Niedermayer wrote: Value taken from the decoder implementation Signed-off-by: Michael Niedermayer michae...@gmx.at applied [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB In a rich man's house there is no place to spit but his face. -- Diogenes of Sinope signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount
On 25/02/15 3:29 PM, Ronald S. Bultje wrote: Otherwise i think the second version with builtins i sent is good to apply. Much cleaner than the configure approach to be honest. OK. Ronald Pushed, thanks. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 3/3] avformat/dss: implement seeking
On Tue, Feb 24, 2015 at 03:22:39PM +0100, Michael Niedermayer wrote: This assumes CBR (which is true for all samples i have) Signed-off-by: Michael Niedermayer michae...@gmx.at --- libavformat/dss.c | 39 +++ 1 file changed, 39 insertions(+) reviewed by Oleksij some bugs fixed applied [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB The worst form of inequality is to try to make unequal things equal. -- Aristotle signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 2/3] avformat/dss: set bitrate
On Tue, Feb 24, 2015 at 03:22:38PM +0100, Michael Niedermayer wrote: Signed-off-by: Michael Niedermayer michae...@gmx.at --- libavformat/dss.c |8 1 file changed, 8 insertions(+) applied [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Awnsering whenever a program halts or runs forever is On a turing machine, in general impossible (turings halting problem). On any real computer, always possible as a real computer has a finite number of states N, and will either halt in less than N cycles or never halt. signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.
On Wed, Feb 25, 2015 at 10:29:09PM +0100, wm4 wrote: On Wed, 25 Feb 2015 22:24:39 +0100 Michael Niedermayer michae...@gmx.at wrote: On Wed, Feb 25, 2015 at 08:00:03PM +0100, Nicolas George wrote: Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit : without pthreads, this seems to fail to build Forgot to test that, sorry. Fixed. By the way, what are the remaining supported cases where threads are not available? i think the case i tested had HAVE_W32THREADS set also theres HAVE_OS2THREADS [...] Surely OS/2 support is worth of forcing every single developer who touches thread stuff to consider a dead platform that's only used by 2 persons in Korea (?). did any mail above sound like it impled that someone has to consider OS/2 ? the existing OS/2 support is maintained by the people using OS/2, AFAIK they never asked others to maintain the support [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Opposition brings concord. Out of discord comes the fairest harmony. -- Heraclitus signature.asc Description: Digital signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] avformat/adxdec: check avctx-channels for invalid values
On 26.02.2015 00:24, Michael Niedermayer wrote: On Wed, Feb 25, 2015 at 11:48:33PM +0100, Andreas Cadhalpun wrote: Hi, if avctx-channels is 0 in adx_read_packet, size gets set to 0, av_get_packet sets pkt-data to NULL and then AV_RB16(pkt-data) results in a null pointer dereference. Attached patch fixes this. Best regards, Andreas adxdec.c |5 + 1 file changed, 5 insertions(+) 7312e6a3be1771c83eac72784496c6fc4692d954 0001-avformat-adxdec-check-avctx-channels-for-invalid-val.patch From 2578976a0a9eec03d168f393795119fd274ee81f Mon Sep 17 00:00:00 2001 From: Andreas Cadhalpun andreas.cadhal...@googlemail.com Date: Wed, 25 Feb 2015 22:55:44 +0100 Subject: [PATCH] avformat/adxdec: check avctx-channels for invalid values This avoids a null pointer dereference of pkt-data. Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com --- libavformat/adxdec.c | 5 + 1 file changed, 5 insertions(+) diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c index ddaa201..24a8a1f 100644 --- a/libavformat/adxdec.c +++ b/libavformat/adxdec.c @@ -40,6 +40,11 @@ static int adx_read_packet(AVFormatContext *s, AVPacket *pkt) AVCodecContext *avctx = s-streams[0]-codec; int ret, size; +if (avctx-channels = 0) { +av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, avctx-channels); +return AVERROR_INVALIDDATA; +} the demuxer should extract the channel value in adx_read_header() and check it there. (if it needs the channels, which it does currently) its not good for demuxing to depend on a decoder/parser setting this value between reading the file header and before demuxing the first packet You're right about that. Attached is a patch for this. However it might still be a good idea to apply above patch, because the decoder/parser could set avctx-channels to 0, even if the demuxer has set it to something positive. Best regards, Andreas From 2a0e342cb095a6bd62dcbd313db9f33346c79f98 Mon Sep 17 00:00:00 2001 From: Andreas Cadhalpun andreas.cadhal...@googlemail.com Date: Thu, 26 Feb 2015 01:06:57 +0100 Subject: [PATCH 2/2] avformat/adxdec: set avctx-channels in adx_read_header It is used in adx_read_packet, which currently depends on the decoder/parser setting this value between reading the file header and demuxing the first packet. Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com --- libavformat/adxdec.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c index 24a8a1f..e7107ac 100644 --- a/libavformat/adxdec.c +++ b/libavformat/adxdec.c @@ -88,8 +88,14 @@ static int adx_read_header(AVFormatContext *s) av_log(s, AV_LOG_ERROR, Invalid extradata size.\n); return AVERROR_INVALIDDATA; } +avctx-channels= AV_RB8(avctx-extradata + 7); avctx-sample_rate = AV_RB32(avctx-extradata + 8); +if (avctx-channels = 0) { +av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, avctx-channels); +return AVERROR_INVALIDDATA; +} + st-codec-codec_type = AVMEDIA_TYPE_AUDIO; st-codec-codec_id= s-iformat-raw_codec_id; -- 2.1.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] libavfilter/thumbnail filter, copy buffers instead of reference
I forgot the copy of the original buffer, oddly if I copy it then causes things to crash at the end when the codec is freed, only for certain .avi mpeg4 and in the mpeg4 codec where it frees some data there. I am digging in to that, but patch at least shows deeper into something odd going on. I can just copy the R and G but not B and it doesn't hit the issue for this test video. Seems really strange, before it seemed to be having backtraces in ffmpeg.c and buffer freeing conflicting or some problem, now when copying the buffers it is mpeg4 codec freeing and only if reading that blue value into the destination buffer, odd. diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c index 186ac8d..925eaaf 100644 --- a/libavfilter/vf_thumbnail.c +++ b/libavfilter/vf_thumbnail.c @@ -137,10 +137,19 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) int *hist = thumb-frames[thumb-n].histogram; const uint8_t *p = frame-data[0]; +// Copy orginal frame into buffer(s) +thumb-frames[thumb-n].buf = ff_get_video_buffer(outlink, outlink-w, outlink-h); +if (!thumb-frames[thumb-n].buf) { +av_frame_free(frame); +return AVERROR(ENOMEM); +} +av_frame_copy_props(thumb-frames[thumb-n].buf, frame); +if (av_frame_copy(thumb-frames[thumb-n].buf, frame)) +return AVERROR(ENOMEM); + // update current frame RGB histogram for (j = 0; j inlink-h; j++) { @@ -148,14 +157,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) p += frame-linesize[0]; } -// Copy orginal frame into buffer(s) -thumb-frames[thumb-n].buf = ff_get_video_buffer(inlink, inlink-w, inlink-h); -if (!thumb-frames[thumb-n].buf) { -av_frame_free(frame); -return AVERROR(ENOMEM); -} -av_frame_copy_props(thumb-frames[thumb-n].buf, frame); - // Free original frame we have copied av_frame_free(frame); On Wed, Feb 25, 2015 at 3:40 PM, Chris Kennedy bitbyte...@gmail.com wrote: This fixes the buffer handling in the thumbnail filter, it seems that the referencing buffers it does isn't safe (could be hundreds of them held in ref) and the main ffmpeg buffer handling can fail. It now just grabs the video buffers and copies them, frees the frame immediately, and works just the same now, and doesn't use 1.7 Gig of memory in my tests with 300 buffers, it uses around 100 meg max now. There was obviously something nasty in how it was doing this. diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c index 1883154..82b8ebe 100644 --- a/libavfilter/vf_thumbnail.c +++ b/libavfilter/vf_thumbnail.c @@ -137,9 +137,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) int *hist = thumb-frames[thumb-n].histogram; const uint8_t *p = frame-data[0]; -// keep a reference of each frame -thumb-frames[thumb-n].buf = frame; - // update current frame RGB histogram for (j = 0; j inlink-h; j++) { for (i = 0; i inlink-w; i++) { @@ -150,6 +147,17 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame) p += frame-linesize[0]; } +// Copy orginal frame into buffer(s) +thumb-frames[thumb-n].buf = ff_get_video_buffer(inlink, inlink-w, inlink-h); +if (!thumb-frames[thumb-n].buf) { +av_frame_free(frame); +return AVERROR(ENOMEM); +} +av_frame_copy_props(thumb-frames[thumb-n].buf, frame); + +// Free original frame we have copied +av_frame_free(frame); + // no selection until the buffer of N frames is filled up thumb-n++; if (thumb-n thumb-n_frames) I previously sent a patch which was wrong, and just avoided the issue for certain media but hit it on others, this should fix it for good now (since it also doesn't alter the chosen thumbnail that results)... Thanks, Chris --- Chris Kennedy Video Engineer CrunchyRoll - http://www.crunchyroll.com -- --- Chris Kennedy -- --- Chris Kennedy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH] Port FFT domain filter.
On Wed, Feb 25, 2015 at 11:55:51AM +0530, arwa arif wrote: On Tue, Feb 24, 2015 at 4:12 PM, Michael Niedermayer michae...@gmx.at wrote: On Tue, Feb 24, 2015 at 02:27:01PM +0530, arwa arif wrote: Hello, I have written a very primitive code for porting FFT domain filter. It accepts only gray8 format images. The output should be a grayscale image, but the ouput image is coming out to be a black and white image. Also, I am getting confused when to do the vertical pass. After taking irdft of the horizontal pass or before it? you can do both rdft first and then the 2 irdft passes, this should give more possibilities in filtering but it could be done the other way around too I have attached the patch. Makefile |1 allfilters.c |1 vf_fftfilt.c | 139 +++ 3 files changed, 141 insertions(+) d4b25d6a204534a66400f52c1f5312652e8208af 0001-Port-FFT-domain-filter.patch From 455a261d7e2b3afba767aac2e73448aeee02d159 Mon Sep 17 00:00:00 2001 From: Arwa Arif arwaarif1...@gmail.com Date: Tue, 24 Feb 2015 12:17:30 +0530 Subject: [PATCH] Port FFT domain filter. --- libavfilter/Makefile |1 + libavfilter/allfilters.c |1 + libavfilter/vf_fftfilt.c | 139 ++ 3 files changed, 141 insertions(+) create mode 100644 libavfilter/vf_fftfilt.c diff --git a/libavfilter/Makefile b/libavfilter/Makefile index 289c63b..b184f07 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -120,6 +120,7 @@ OBJS-$(CONFIG_EDGEDETECT_FILTER) += vf_edgedetect.o OBJS-$(CONFIG_EQ_FILTER) += vf_eq.o OBJS-$(CONFIG_EXTRACTPLANES_FILTER) += vf_extractplanes.o OBJS-$(CONFIG_FADE_FILTER) += vf_fade.o +OBJS-$(CONFIG_FFTFILT_FILTER)+= vf_fftfilt.o OBJS-$(CONFIG_FIELD_FILTER) += vf_field.o OBJS-$(CONFIG_FIELDMATCH_FILTER) += vf_fieldmatch.o OBJS-$(CONFIG_FIELDORDER_FILTER) += vf_fieldorder.o diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c index 55de154..043ac56 100644 --- a/libavfilter/allfilters.c +++ b/libavfilter/allfilters.c @@ -136,6 +136,7 @@ void avfilter_register_all(void) REGISTER_FILTER(EQ, eq, vf); REGISTER_FILTER(EXTRACTPLANES, extractplanes, vf); REGISTER_FILTER(FADE, fade, vf); +REGISTER_FILTER(FFTFILT,fftfilt,vf); REGISTER_FILTER(FIELD, field, vf); REGISTER_FILTER(FIELDMATCH, fieldmatch, vf); REGISTER_FILTER(FIELDORDER, fieldorder, vf); diff --git a/libavfilter/vf_fftfilt.c b/libavfilter/vf_fftfilt.c new file mode 100644 index 000..753bc8e --- /dev/null +++ b/libavfilter/vf_fftfilt.c @@ -0,0 +1,139 @@ +/* + * Copyright (c) 2015 Arwa Arif arwaarif1...@gmail.com + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with FFmpeg; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + */ + +/** + * @file + * FFT domain filtering. + */ + +#include libavfilter/internal.h +#include libavutil/common.h +#include libavutil/imgutils.h +#include libavutil/opt.h +#include libavutil/pixdesc.h +#include libavcodec/avfft.h + +typedef struct { +const AVClass *class; + +RDFTContext *rdft; +int rdft_bits; +FFTSample *rdft_data; + +} FFTFILTContext; + +static int filter_frame(AVFilterLink *inlink, AVFrame *in) +{ +AVFilterContext *ctx = inlink-dst; +AVFilterLink *outlink = inlink-dst-outputs[0]; +FFTFILTContext *fftfilt = ctx-priv; +AVFrame *out; +int i, j, k, rdft_bits; +size_t rdft_len, w, h; + +w = inlink-w; +h = inlink-h; + +/* RDFT window size (precision) according to the requested output frame height */ +for (rdft_bits = 1; 1 rdft_bits 2 * w; rdft_bits++); +rdft_len = 1 rdft_bits; +fftfilt-rdft_data = av_malloc_array(h, rdft_len * sizeof(FFTSample)); +memset(fftfilt-rdft_data, 0, rdft_len * h *
[FFmpeg-devel] [PATCH 2/2] ffmpeg: notify when the thread message queue blocks.
This can help finding the source of A-V desync with live input. Signed-off-by: Nicolas George geo...@nsup.org --- ffmpeg.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) No functional change since the first version, but reworded to point to the relevant option and give the current value. diff --git a/ffmpeg.c b/ffmpeg.c index 54ba0cb..c6f3dcd 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -3356,6 +3356,7 @@ static int check_keyboard_interaction(int64_t cur_time) static void *input_thread(void *arg) { InputFile *f = arg; +unsigned flags = f-non_blocking ? AV_THREAD_MESSAGE_NONBLOCK : 0; int ret = 0; while (1) { @@ -3371,7 +3372,15 @@ static void *input_thread(void *arg) break; } av_dup_packet(pkt); -ret = av_thread_message_queue_send(f-in_thread_queue, pkt, 0); +ret = av_thread_message_queue_send(f-in_thread_queue, pkt, flags); +if (flags ret == AVERROR(EAGAIN)) { +flags = 0; +ret = av_thread_message_queue_send(f-in_thread_queue, pkt, flags); +av_log(f-ctx, AV_LOG_WARNING, + Thread message queue blocking; consider raising the + thread_queue_size option (current value: %d)\n, + f-thread_queue_size); +} if (ret 0) { if (ret != AVERROR_EOF) av_log(f-ctx, AV_LOG_ERROR, -- 2.1.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
[FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.
Signed-off-by: Nicolas George geo...@nsup.org --- doc/ffmpeg.texi | 6 ++ ffmpeg.c| 2 +- ffmpeg.h| 2 ++ ffmpeg_opt.c| 4 4 files changed, 13 insertions(+), 1 deletion(-) diff --git a/doc/ffmpeg.texi b/doc/ffmpeg.texi index 6e51c44..6772f2f 100644 --- a/doc/ffmpeg.texi +++ b/doc/ffmpeg.texi @@ -1158,6 +1158,12 @@ This option enables or disables accurate seeking in input files with the transcoding. Use @option{-noaccurate_seek} to disable it, which may be useful e.g. when copying some streams and transcoding the others. +@item -thread_message_queue @var{size} (@emph{input}) +This option sets the maximum number of queued packets when reading from the +file or device. With low latency / high rate live streams, packets may be +discarded if they are not read in a timely manner; raising this value can +avoid it. + @item -override_ffserver (@emph{global}) Overrides the input specifications from @command{ffserver}. Using this option you can map any input stream to @command{ffserver} and control diff --git a/ffmpeg.c b/ffmpeg.c index becd5df..54ba0cb 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -3420,7 +3420,7 @@ static int init_input_threads(void) strcmp(f-ctx-iformat-name, lavfi)) f-non_blocking = 1; ret = av_thread_message_queue_alloc(f-in_thread_queue, -8, sizeof(AVPacket)); +f-thread_queue_size, sizeof(AVPacket)); if (ret 0) return ret; diff --git a/ffmpeg.h b/ffmpeg.h index d2e0c5c..7c0c22c 100644 --- a/ffmpeg.h +++ b/ffmpeg.h @@ -111,6 +111,7 @@ typedef struct OptionsContext { int64_t input_ts_offset; int rate_emu; int accurate_seek; +int thread_queue_size; SpecifierOpt *ts_scale; intnb_ts_scale; @@ -350,6 +351,7 @@ typedef struct InputFile { pthread_t thread; /* thread reading from this file */ int non_blocking; /* reading packets from the thread should not block */ int joined; /* the thread has been joined */ +int thread_queue_size; /* maximum number of queued packets */ #endif } InputFile; diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c index 09e6e33..05b37d4 100644 --- a/ffmpeg_opt.c +++ b/ffmpeg_opt.c @@ -951,6 +951,7 @@ static int open_input_file(OptionsContext *o, const char *filename) f-nb_streams = ic-nb_streams; f-rate_emu = o-rate_emu; f-accurate_seek = o-accurate_seek; +f-thread_queue_size = o-thread_queue_size 0 ? o-thread_queue_size : 8; /* check if all codec options have been used */ unused_opts = strip_specifiers(o-g-codec_opts); @@ -2957,6 +2958,9 @@ const OptionDef options[] = { { disposition,OPT_STRING | HAS_ARG | OPT_SPEC | OPT_OUTPUT, { .off = OFFSET(disposition) }, disposition, }, +{ thread_queue_size, HAS_ARG | OPT_INT | OPT_OFFSET | OPT_EXPERT | OPT_INPUT, + { .off = OFFSET(thread_queue_size) }, +set the maximum number of queued packets from the demuxer }, /* video options */ { vframes, OPT_VIDEO | HAS_ARG | OPT_PERFILE | OPT_OUTPUT, { .func_arg = opt_video_frames }, -- 2.1.4 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] DSP function ARM NEON patches for hevc
17/02/15 12:44, Michael Niedermayer mich...@niedermayer.cc: On Tue, Feb 17, 2015 at 07:33:04AM +, Tomperi Seppo wrote: On 16 Feb 2015, at 19:54, Michael Niedermayer mich...@niedermayer.cc wrote: On Mon, Feb 16, 2015 at 12:47:36PM +, Tomperi Seppo wrote: More NEON optimizations for testing. fate-hevc passes on Tegra K1, but these haven't been tested for NEON clobbering. -Seppo From: Tomperi Seppo Sent: Monday, February 16, 2015 1:30 PM To: Michael Niedermayer Cc: Michael Niedermayer; FFmpeg development discussions and patches; Mickaël Raulet Subject: RE: [FFmpeg-devel] DSP function ARM NEON patches for hevc Hi Michael, Here is a totally shot in a dark fix attempt for NEON register clobbering for deblocking. Could you test it with qemu and check if it works. -Seppo From: Michael Niedermayer [mich...@niedermayer.cc] Sent: Monday, February 16, 2015 3:28 AM To: Tomperi Seppo Cc: Michael Niedermayer; FFmpeg development discussions and patches; Mickaël Raulet Subject: Re: [FFmpeg-devel] DSP function ARM NEON patches for hevc Hi On Sun, Feb 15, 2015 at 08:31:32PM +, Tomperi Seppo wrote: Hi! The reason is chroma deblocking which is using q4 without pushing it to stack. :/ Unfortunately I am in Geneve this week and don't have ARM linux board with me so it is not easy to test. Mickael Raulet: maybe guys at INSA could run tests this week if I make a fix? Could you ask? If they cant, then i probably can test it too if its a patch which applies cleanly to ffmpeg and testing fate-hevc with --enable-neon-clobber-test under qemu is what is needed i could test on a arm board too if needed I also have SAO, qpel and epel NEON patches for latest FFmpeg. They pass fate-hevc on Jetson TK1, but should be iOS and clobber checked. -Seppo From: Michael Niedermayer [michae...@gmx.at] Sent: Friday, February 13, 2015 5:38 PM To: FFmpeg development discussions and patches Cc: Tomperi Seppo; Mickaël Raulet Subject: Re: [FFmpeg-devel] DSP function ARM NEON patches for hevc On Thu, Feb 05, 2015 at 02:22:28PM +0100, Mickaël Raulet wrote: Michael, Please find some commits that can be cherry picked from https://github.com/OpenHEVC/FFmpeg/commits/ffmpeg_patch Optimized deblocking filter (8bits only) 1b9ee47d2f43b0a029a9468233626102eb1473b8 this breaks the neon clobber test see: fate.ffmpeg.org/report.cgi?time=20150211030204slot=armv7l-panda-gcc4.6-c ortexa8-clobber [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB The worst form of inequality is to try to make unequal things equal. -- Aristotle -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Opposition brings concord. Out of discord comes the fairest harmony. -- Heraclitus Makefile|3 hevcdsp_init_neon.c | 159 hevcdsp_qpel_neon.S | 999 3 files changed, 1160 insertions(+), 1 deletion(-) 9fb0b3c33edf085845b7a0fba3ca77d1ba55dd6c 0001-hevcdsp-ARM-NEON-optimized-qpel-functions.patch From ce06cb2bea4b051995608b11651b185e7a825a4c Mon Sep 17 00:00:00 2001 From: Seppo Tomperi seppo.tomp...@vtt.fi Date: Wed, 11 Feb 2015 10:20:26 + Subject: [PATCH] hevcdsp: ARM NEON optimized qpel functions --- libavcodec/arm/Makefile| 3 +- libavcodec/arm/hevcdsp_init_neon.c | 159 ++ libavcodec/arm/hevcdsp_qpel_neon.S | 999 + 3 files changed, 1160 insertions(+), 1 deletion(-) create mode 100644 libavcodec/arm/hevcdsp_qpel_neon.S seems to fail building: libavformat/utils.o CC libavcodec/arm/hevcdsp_init_neon.o AS libavcodec/arm/hevcdsp_qpel_neon.o ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S: Assembler messages: ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: expected } -- `vld1.32 {d0[0]d0[1]d1[0]d1[1]},[r2],r3' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or quad precision register expected -- `vld1.32 {},[r2],r3' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or quad precision register expected -- `vld1.32 {},[r2],r3' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or quad precision register expected -- `vld1.32 {},[r2],r3' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: expected } -- `vst1.32 {d0[0]d0[1]d1[0]d1[1]},[r0],r1' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or quad precision register expected -- `vst1.32 {},[r0],r1' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or quad precision register expected -- `vst1.32 {},[r0],r1' ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or quad