date:20150225

Re: [FFmpeg-devel] [PATCH] avfilter/palettegen: export color quantization ratio

2015-02-25 Thread Nicolas George

Le septidi 7 ventôse, an CCXXIII, Clement Boesch a écrit :
 +float ratio = (float)nb_out / nb_in;
 +snprintf(buf, sizeof(buf), %f, ratio);

I wonder if both values could be useful instead individually. If so, either
set the string to %d/%d or set two metadata keys.

If not, use double instead of float, float is very rarely useful. And in
this particular case it will be converted to double by the vararg call.

 +av_log(ctx, AV_LOG_INFO, %d%s colors generated out of %d colors; 
 ratio=%f\n,
 +   s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs,
 +   set_colorquant_ratio_meta(out, s-nb_boxes, s-nb_refs));

This is your code, but I must say that I find returning the double value as
a secondary effect of a function called set_something() not very readable.

Regards,

-- 
  Nicolas George


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] avfilter/palettegen: export color quantization ratio

2015-02-25 Thread Clément Bœsch

From: Clément Bœsch clem...@stupeflix.com

---
 doc/filters.texi|  5 +
 libavfilter/version.h   |  2 +-
 libavfilter/vf_palettegen.c | 14 --
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/doc/filters.texi b/doc/filters.texi
index baef346..0c72145 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -6980,6 +6980,11 @@ the background is static.
 Default value is @var{full}.
 @end table
 
+The filter also exports the frame metadata @code{lavfi.color_quant_ratio}
+(@code{nb_color_in / nb_color_out}) which you can use to evaluate the degree of
+color quantization of the palette. This information is also visible at
+@var{info} logging level.
+
 @subsection Examples
 
 @itemize
diff --git a/libavfilter/version.h b/libavfilter/version.h
index 83fd2a5..986d7a7 100644
--- a/libavfilter/version.h
+++ b/libavfilter/version.h
@@ -31,7 +31,7 @@
 
 #define LIBAVFILTER_VERSION_MAJOR  5
 #define LIBAVFILTER_VERSION_MINOR  11
-#define LIBAVFILTER_VERSION_MICRO 101
+#define LIBAVFILTER_VERSION_MICRO 102
 
 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
LIBAVFILTER_VERSION_MINOR, \
diff --git a/libavfilter/vf_palettegen.c b/libavfilter/vf_palettegen.c
index 0e303a9..9a4a701 100644
--- a/libavfilter/vf_palettegen.c
+++ b/libavfilter/vf_palettegen.c
@@ -275,6 +275,15 @@ static struct color_ref **load_color_refs(const struct 
hist_node *hist, int nb_r
 return refs;
 }
 
+static float set_colorquant_ratio_meta(AVFrame *out, int nb_out, int nb_in)
+{
+char buf[32];
+float ratio = (float)nb_out / nb_in;
+snprintf(buf, sizeof(buf), %f, ratio);
+av_dict_set(out-metadata, lavfi.color_quant_ratio, buf, 0);
+return ratio;
+}
+
 /**
  * Main function implementing the Median Cut Algorithm defined by Paul Heckbert
  * in Color Image Quantization for Frame Buffer Display (1982)
@@ -363,8 +372,9 @@ static AVFrame *get_palette_frame(AVFilterContext *ctx)
 box = box_id = 0 ? s-boxes[box_id] : NULL;
 }
 
-av_log(ctx, AV_LOG_DEBUG, %d%s boxes generated out of %d colors\n,
-   s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs);
+av_log(ctx, AV_LOG_INFO, %d%s colors generated out of %d colors; 
ratio=%f\n,
+   s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs,
+   set_colorquant_ratio_meta(out, s-nb_boxes, s-nb_refs));
 
 qsort(s-boxes, s-nb_boxes, sizeof(*s-boxes), cmp_color);
 
-- 
2.3.0

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Ronald S. Bultje

Hi,

On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote:

 +#if HAVE_FAST_POPCNT
 +#if AV_GCC_VERSION_AT_LEAST(4,5)
 +#ifndef av_popcount
 +#define av_popcount   __builtin_popcount
 +#endif /* av_popcount */
 +#if HAVE_FAST_64BIT
 +#ifndef av_popcount64
 +#define av_popcount64 __builtin_popcountll
 +#endif /* av_popcount64 */
 +#endif /* HAVE_FAST_64BIT */
 +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
 +#endif /* HAVE_FAST_POPCNT */


Is this just to get the sse4 popcnt instruction if we compile with
-mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
configure still does an arch/cpu check. I'd expect the built-in/compiler to
do that for us based on -mcpu, and we could always unconditionally use this
(as long as gcc = 4.5); alternatively, you could use inline asm and then
have the configure check (HAVE_FAST_POPCNT). But doing both seems a little
odd. I have no objection to it, patch is still fine, just odd.

Ronald
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] lavfi: add erosion dilation filter

2015-02-25 Thread Paul B Mahol

Signed-off-by: Paul B Mahol one...@gmail.com
---
 doc/filters.texi  |  34 ++
 libavfilter/Makefile  |   2 +
 libavfilter/allfilters.c  |   2 +
 libavfilter/vf_neighbor.c | 289 ++
 4 files changed, 327 insertions(+)
 create mode 100644 libavfilter/vf_neighbor.c

diff --git a/doc/filters.texi b/doc/filters.texi
index baef346..13ba797 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -3728,6 +3728,23 @@ FFmpeg was configured with @code{--enable-opencl}. 
Default value is 0.
 
 @end table
 
+@section dilation
+
+Apply dilation effect to the video.
+
+This filter replaces the pixel by the local(3x3) maximum.
+
+It accepts the following parameters:
+
+@table @option
+@item threshold
+Allows to limit the maximum change, default is 65535.
+
+@item coordinates
+Flag which specifies the pixel to refer to. Default is 255 ie. all eight
+pixels are used.
+@end table
+
 @section drawbox
 
 Draw a colored box on the input image.
@@ -4437,6 +4454,23 @@ value.
 
 @end table
 
+@section erosion
+
+Apply erosion effect to the video.
+
+This filter replaces the pixel by the local(3x3) minimum.
+
+It accepts the following parameters:
+
+@table @option
+@item threshold
+Allows to limit the maximum change, default is 65535.
+
+@item coordinates
+Flag which specifies the pixel to refer to. Default is 255 ie. all eight
+pixels are used.
+@end table
+
 @section extractplanes
 
 Extract color channel components from input video stream into
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 289c63b..97a38a8 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -112,12 +112,14 @@ OBJS-$(CONFIG_DECIMATE_FILTER)   += 
vf_decimate.o
 OBJS-$(CONFIG_DEJUDDER_FILTER)   += vf_dejudder.o
 OBJS-$(CONFIG_DELOGO_FILTER) += vf_delogo.o
 OBJS-$(CONFIG_DESHAKE_FILTER)+= vf_deshake.o
+OBJS-$(CONFIG_DILATION_FILTER)   += vf_neighbor.o
 OBJS-$(CONFIG_DRAWBOX_FILTER)+= vf_drawbox.o
 OBJS-$(CONFIG_DRAWGRID_FILTER)   += vf_drawbox.o
 OBJS-$(CONFIG_DRAWTEXT_FILTER)   += vf_drawtext.o
 OBJS-$(CONFIG_ELBG_FILTER)   += vf_elbg.o
 OBJS-$(CONFIG_EDGEDETECT_FILTER) += vf_edgedetect.o
 OBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
+OBJS-$(CONFIG_EROSION_FILTER)+= vf_neighbor.o
 OBJS-$(CONFIG_EXTRACTPLANES_FILTER)  += vf_extractplanes.o
 OBJS-$(CONFIG_FADE_FILTER)   += vf_fade.o
 OBJS-$(CONFIG_FIELD_FILTER)  += vf_field.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 55de154..060744a 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -128,12 +128,14 @@ void avfilter_register_all(void)
 REGISTER_FILTER(DEJUDDER,   dejudder,   vf);
 REGISTER_FILTER(DELOGO, delogo, vf);
 REGISTER_FILTER(DESHAKE,deshake,vf);
+REGISTER_FILTER(DILATION,   dilation,   vf);
 REGISTER_FILTER(DRAWBOX,drawbox,vf);
 REGISTER_FILTER(DRAWGRID,   drawgrid,   vf);
 REGISTER_FILTER(DRAWTEXT,   drawtext,   vf);
 REGISTER_FILTER(EDGEDETECT, edgedetect, vf);
 REGISTER_FILTER(ELBG,   elbg,   vf);
 REGISTER_FILTER(EQ, eq, vf);
+REGISTER_FILTER(EROSION,erosion,vf);
 REGISTER_FILTER(EXTRACTPLANES,  extractplanes,  vf);
 REGISTER_FILTER(FADE,   fade,   vf);
 REGISTER_FILTER(FIELD,  field,  vf);
diff --git a/libavfilter/vf_neighbor.c b/libavfilter/vf_neighbor.c
new file mode 100644
index 000..1a923dd
--- /dev/null
+++ b/libavfilter/vf_neighbor.c
@@ -0,0 +1,289 @@
+/*
+ * Copyright (c) 2012-2013 Oka Motofumi (chikuzen.mo at gmail dot com)
+ * Copyright (c) 2015 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include libavutil/imgutils.h
+#include libavutil/pixdesc.h
+#include libavutil/opt.h
+#include avfilter.h
+#include formats.h
+#include internal.h
+#include video.h
+
+typedef struct EDContext {
+const AVClass *class;
+int planeheight[4];
+int linesize[4];
+int nb_planes;
+int

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Clément Bœsch

On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote:
 Signed-off-by: James Almer jamr...@gmail.com
 ---
 I decided to go the configure route since other features (cmov, clz) also do
 it , but if prefered this could instead be done with a new intmath.h header 
 in the x86/ folder containing something like
 
 #if defined(__GNUC__)  defined(__POPCNT__)
 #define av_popcount   __builtin_popcount
 #if ARCH_X86_64
 #define av_popcount64 __builtin_popcountll
 #endif
 #endif
 
 For a cleaner compile time check.
 
  configure   | 12 ++--
  libavutil/intmath.h | 13 +
  2 files changed, 23 insertions(+), 2 deletions(-)
 

For the record, the builtin implementation looks like this here:

 av_popcount_c:
   0:   89 f8   mov%edi,%eax
   2:   d1 e8   shr%eax
   4:   25 55 55 55 55  and$0x,%eax
   9:   29 c7   sub%eax,%edi
   b:   89 fa   mov%edi,%edx
   d:   c1 ef 02shr$0x2,%edi
  10:   81 e2 33 33 33 33   and$0x,%edx
  16:   81 e7 33 33 33 33   and$0x,%edi
  1c:   8d 04 17lea(%rdi,%rdx,1),%eax
  1f:   89 c2   mov%eax,%edx
  21:   c1 ea 04shr$0x4,%edx
  24:   01 d0   add%edx,%eax
  26:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
  2b:   89 c2   mov%eax,%edx
  2d:   c1 ea 08shr$0x8,%edx
  30:   01 d0   add%edx,%eax
  32:   89 c2   mov%eax,%edx
  34:   c1 ea 10shr$0x10,%edx
  37:   01 d0   add%edx,%eax
  39:   83 e0 3fand$0x3f,%eax
  3c:   c3  retq   
  3d:   0f 1f 00nopl   (%rax)

0040 popcount_gcc:
  40:   48 83 ec 08 sub$0x8,%rsp
  44:   89 ff   mov%edi,%edi
  46:   e8 00 00 00 00  callq  4b popcount_gcc+0xb
  4b:   48 83 c4 08 add$0x8,%rsp
  4f:   c3  retq   

0040 popcount_clang:
  40:   89 f8   mov%edi,%eax
  42:   d1 e8   shr%eax
  44:   25 55 55 55 55  and$0x,%eax
  49:   29 c7   sub%eax,%edi
  4b:   89 f8   mov%edi,%eax
  4d:   25 33 33 33 33  and$0x,%eax
  52:   c1 ef 02shr$0x2,%edi
  55:   81 e7 33 33 33 33   and$0x,%edi
  5b:   01 c7   add%eax,%edi
  5d:   89 f8   mov%edi,%eax
  5f:   c1 e8 04shr$0x4,%eax
  62:   01 f8   add%edi,%eax
  64:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
  69:   69 c0 01 01 01 01   imul   $0x1010101,%eax,%eax
  6f:   c1 e8 18shr$0x18,%eax
  72:   c3  retq   

We might see relevant optimizations for our reference code.

[...]

-- 
Clément B.


pgpzj_uiJQRKV.pgp
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] avfilter/palettegen: use AV_QSORT()

2015-02-25 Thread Clément Bœsch

From: Clément Bœsch clem...@stupeflix.com

This makes the sorting of the colors along an axis (r, g or b)
predictible, and thus testable under FATE. The performance is not really
an issue here since the function is called only once at the end and will
need to sort very small number of entries, so an alternative would be to
make the sorting functions (see DECLARE_CMP_FUNC()) fallback on another
axis in case of equality. This approach was actually simpler.

I don't know if there is any advantage in using a multidimensional sort,
but it will affect the final palette one way or another.
---
 libavfilter/vf_palettegen.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_palettegen.c b/libavfilter/vf_palettegen.c
index 9a4a701..67c99d9 100644
--- a/libavfilter/vf_palettegen.c
+++ b/libavfilter/vf_palettegen.c
@@ -25,6 +25,7 @@
 
 #include libavutil/avassert.h
 #include libavutil/opt.h
+#include libavutil/qsort.h
 #include avfilter.h
 #include internal.h
 
@@ -351,7 +352,8 @@ static AVFrame *get_palette_frame(AVFilterContext *ctx)
 
 /* sort the range by its longest axis if it's not already sorted */
 if (box-sorted_by != longest) {
-qsort(s-refs[box-start], box-len, sizeof(*s-refs), 
cmp_funcs[longest]);
+cmp_func cmpf = cmp_funcs[longest];
+AV_QSORT(s-refs[box-start], box-len, const struct color_ref *, 
cmpf);
 box-sorted_by = longest;
 }
 
-- 
2.3.0

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] lavfi: add erosion dilation filter

2015-02-25 Thread Clément Bœsch

On Wed, Feb 25, 2015 at 02:55:58PM +, Paul B Mahol wrote:
 Signed-off-by: Paul B Mahol one...@gmail.com
 ---
  doc/filters.texi  |  34 ++
  libavfilter/Makefile  |   2 +
  libavfilter/allfilters.c  |   2 +
  libavfilter/vf_neighbor.c | 289 
 ++
  4 files changed, 327 insertions(+)
  create mode 100644 libavfilter/vf_neighbor.c
 
[...]

If you put 1 filters in the same file, you need to protect with
CONFIG_*_FILTER.

-- 
Clément B.


pgpLwMeMhKlvv.pgp
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] avfilter/palettegen: export color quantization ratio

2015-02-25 Thread Clément Bœsch

On Wed, Feb 25, 2015 at 03:43:23PM +0100, Nicolas George wrote:
 Le septidi 7 ventôse, an CCXXIII, Clement Boesch a écrit :
  +float ratio = (float)nb_out / nb_in;
  +snprintf(buf, sizeof(buf), %f, ratio);
 
 I wonder if both values could be useful instead individually. If so, either
 set the string to %d/%d or set two metadata keys.
 

The numerator is the user input, so only one value is necessary. We could
export the number of colors found in the input (nb_in), but this code is
all about the quantization process so I think the ratio is actually more
relevant.

 If not, use double instead of float, float is very rarely useful. And in
 this particular case it will be converted to double by the vararg call.

Sure, OK

 
  +av_log(ctx, AV_LOG_INFO, %d%s colors generated out of %d colors; 
  ratio=%f\n,
  +   s-nb_boxes, s-reserve_transparent ? (+1) : , s-nb_refs,
  +   set_colorquant_ratio_meta(out, s-nb_boxes, s-nb_refs));
 
 This is your code, but I must say that I find returning the double value as
 a secondary effect of a function called set_something() not very readable.

The alternative was to have the ratio declared on top of the function (far
away), or pass the context to set_colorquant_ratio_meta() to make it log
as well. This is really a cosmetic issue, maybe I'll just declare ratio on
top so it's more readable.

-- 
Clément B.


pgpJTidiVe0p6.pgp
Description: PGP signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] lavfi: add erosion dilation filter

2015-02-25 Thread Paul B Mahol

On 2/25/15, Clement Boesch u...@pkh.me wrote:
 On Wed, Feb 25, 2015 at 02:55:58PM +, Paul B Mahol wrote:
 Signed-off-by: Paul B Mahol one...@gmail.com
 ---
  doc/filters.texi  |  34 ++
  libavfilter/Makefile  |   2 +
  libavfilter/allfilters.c  |   2 +
  libavfilter/vf_neighbor.c | 289
 ++
  4 files changed, 327 insertions(+)
  create mode 100644 libavfilter/vf_neighbor.c

 [...]

 If you put 1 filters in the same file, you need to protect with
 CONFIG_*_FILTER.

 --
 Clement B.


locally fixed
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] Adding Webvtt in hls muxer

2015-02-25 Thread Deron


On 2/21/15 8:05 AM, Deron wrote:

On 2/15/15 5:44 AM, Anshul wrote:


attached another cleaned patch.

-Anshul


Not sure if I should be posting here, privately, or do the user list 
since it is an unaccepted patch... This patch applies cleanly to 
ffmpeg git of that day, and with minor adjustment to current git, but 
either crashes the same for me right away. Here is the back trace from 
gdb:



Program received signal SIGSEGV, Segmentation fault.
__GI___libc_realloc (oldmem=0x70, bytes=8) at malloc.c:2977
2977malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_realloc (oldmem=0x70, bytes=8) at malloc.c:2977
#1  0x77609b99 in avformat_new_stream (s=s@entry=0xe2dbc0, 
c=c@entry=0x0) at libavformat/utils.c:3655
#2  0x775451c4 in hls_mux_init (s=0x6787c0) at 
libavformat/hlsenc.c:194

#3  hls_write_header (s=0x6787c0) at libavformat/hlsenc.c:490
#4  0x775999ec in avformat_write_header (s=s@entry=0x6787c0, 
options=0x6a9948) at libavformat/mux.c:406

#5  0x00424c00 in transcode_init () at ffmpeg.c:3096
#6  0x00407581 in transcode () at ffmpeg.c:3815
#7  main (argc=13, argv=0x7fffe5a8) at ffmpeg.c:4022


The command (I've tried all sorts of combinations. If I don't provide 
a subtitle stream, it does not crash. Otherwise it does. Unpatched can 
generate a webvtt from the subtitle stream without crashing.)


ffmpeg -loglevel debug -f lavfi -i movie=out.ts\[out0+subcc\] -f hls 
-hls_segment_filename /var/www/html/stream/kota/v.low.%d.ts 
-hls_subtitle_path /var/www/html/stream/kota/ -y 
/var/www/html/stream/kota/v.low.m3u8




I did find the problem. The patch does not properly initialize 
hls-vtt_oformat (etc) if you provide the -hls_segment_filename 
parameter as I have done.


Deron

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH][RFC] Interpretation of duration field in AVI super index chunk

2015-02-25 Thread Tobias Rapp


On 24.02.2015 16:33, Tobias Rapp wrote:

Hi,

I am currently trying to interpret the index data of HuffYuv/PCM AVI
files written by FFmpeg. If the file is larger than 2GiB an AVIX RIFF
chunk is added for each 2GiB-block, each of these blocks gets an OpenDML
chunk index ix00 and an OpenDML super index chunk indx is written
for each stream that contains the file offset of the according chunk
indexes together with its size and total duration.

For this super index duration entry the documentation I have found
states that it shall be the time span in stream ticks [1]. FFmpeg
seems to write the chunk count instead which is correct for video
streams where one data chunk 00dc corresponds to one frame. For audio
streams one data chunk 01wb usually contains multiple samples.

I guess for audio streams the duration should be multiplied by the
samples-per-chunk factor in avienc.c to allow third-party applications
to jump to the correct AVIX RIFF segment based on a given timestamp.

Any comments?


To illustrate my issue I have attached a patch that makes the OpenDML 
super index chunk consistent with my interpretation of the specs. A test 
with VirtualDub 1.10.4 shows that it also writes the audio duration 
using sample count instead of packet count.


Using sample count allows to seek the audio stream to the right RIFF 
segment without reading through all OpenDML standard ix01 chunks.


Please comment,
Tobias
From 3a8e15598fa3e044f3ef65b5063c633cb4b3afed Mon Sep 17 00:00:00 2001
From: Tobias Rapp t.r...@noa-audio.com
Date: Wed, 25 Feb 2015 17:10:13 +0100
Subject: [PATCH] libavformat/avienc: Fix duration for audio stream OpenDML
 super index

Fixes the duration field of the OpenDML super index indx chunk to
contain the number of samples instead of the number of packets for
(linear/PCM) audio streams.

This matches the OpenDML V1.02 standard text which states that the
duration field shall contain time span in stream ticks.
---
 libavformat/avienc.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/libavformat/avienc.c b/libavformat/avienc.c
index 53c2fe7..b77b295 100644
--- a/libavformat/avienc.c
+++ b/libavformat/avienc.c
@@ -49,6 +49,7 @@ typedef struct AVIIentry {
 
 typedef struct AVIIndex {
 int64_t indx_start;
+int64_t audio_strm_offset;
 int entry;
 int ents_allocated;
 AVIIentry** cluster;
@@ -91,6 +92,7 @@ static int64_t avi_start_new_riff(AVFormatContext *s, 
AVIOContext *pb,
 avi-riff_id++;
 for (i = 0; i  s-nb_streams; i++) {
 AVIStream *avist = s-streams[i]-priv_data;
+avist-indexes.audio_strm_offset = avist-audio_strm_length;
 avist-indexes.entry = 0;
 }
 
@@ -476,6 +478,7 @@ static int avi_write_ix(AVFormatContext *s)
 for (i = 0; i  s-nb_streams; i++) {
 AVIStream *avist = s-streams[i]-priv_data;
 int64_t ix, pos;
+int au_byterate, au_ssize, au_scale;
 
 avi_stream2fourcc(tag, i, s-streams[i]-codec-codec_type);
 ix_tag[3] = '0' + i;
@@ -511,7 +514,11 @@ static int avi_write_ix(AVFormatContext *s)
 avio_skip(pb, 16 * avi-riff_id);
 avio_wl64(pb, ix);/* qwOffset */
 avio_wl32(pb, pos - ix);  /* dwSize */
-avio_wl32(pb, avist-indexes.entry);  /* dwDuration */
+ff_parse_specific_params(s-streams[i], au_byterate, au_ssize, 
au_scale);
+if (au_ssize == 0)
+avio_wl32(pb, avist-indexes.entry);  /* dwDuration (packet count) 
*/
+else
+avio_wl32(pb, (avist-audio_strm_length - 
avist-indexes.audio_strm_offset) / au_ssize);  /* dwDuration (sample count) */
 
 avio_seek(pb, pos, SEEK_SET);
 }
-- 
1.9.1

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Ronald S. Bultje

Hi James,

On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote:

 On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
  Hi,
 
  On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote:
 
  +#if HAVE_FAST_POPCNT
  +#if AV_GCC_VERSION_AT_LEAST(4,5)
  +#ifndef av_popcount
  +#define av_popcount   __builtin_popcount
  +#endif /* av_popcount */
  +#if HAVE_FAST_64BIT
  +#ifndef av_popcount64
  +#define av_popcount64 __builtin_popcountll
  +#endif /* av_popcount64 */
  +#endif /* HAVE_FAST_64BIT */
  +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
  +#endif /* HAVE_FAST_POPCNT */
 
 
  Is this just to get the sse4 popcnt instruction if we compile with
  -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
  configure still does an arch/cpu check. I'd expect the built-in/compiler
 to
  do that for us based on -mcpu, and we could always unconditionally use
 this
  (as long as gcc = 4.5); alternatively, you could use inline asm and then
  have the configure check (HAVE_FAST_POPCNT). But doing both seems a
 little
  odd. I have no objection to it, patch is still fine, just odd.
 
  Ronald

 I purposely made the checks for gcc 4.5 and in configure for cpus that
 support popcnt
 because otherwise __builtin_popcount (at least gcc's) is slower than our
 generic
 av_popcount_c function from lavu/common.h.
 When the CPU supports popcnt the builtin becomes a single inlined
 instruction.

 I tried the __asm__ approach, but the code generated by the builtin seemed
 better.


That's interesting, can you show the code you tried?

Ronald
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread James Almer

On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
 Hi,
 
 On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote:

 +#if HAVE_FAST_POPCNT
 +#if AV_GCC_VERSION_AT_LEAST(4,5)
 +#ifndef av_popcount
 +#define av_popcount   __builtin_popcount
 +#endif /* av_popcount */
 +#if HAVE_FAST_64BIT
 +#ifndef av_popcount64
 +#define av_popcount64 __builtin_popcountll
 +#endif /* av_popcount64 */
 +#endif /* HAVE_FAST_64BIT */
 +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
 +#endif /* HAVE_FAST_POPCNT */

 
 Is this just to get the sse4 popcnt instruction if we compile with
 -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
 configure still does an arch/cpu check. I'd expect the built-in/compiler to
 do that for us based on -mcpu, and we could always unconditionally use this
 (as long as gcc = 4.5); alternatively, you could use inline asm and then
 have the configure check (HAVE_FAST_POPCNT). But doing both seems a little
 odd. I have no objection to it, patch is still fine, just odd.
 
 Ronald

I purposely made the checks for gcc 4.5 and in configure for cpus that support 
popcnt 
because otherwise __builtin_popcount (at least gcc's) is slower than our 
generic 
av_popcount_c function from lavu/common.h.
When the CPU supports popcnt the builtin becomes a single inlined instruction.

I tried the __asm__ approach, but the code generated by the builtin seemed 
better.
And I agree it looks odd and maybe way too specific, which is why i said i can 
add 
this to a new header in the x86/ folder instead.

Patch attached. I don't have clang so i can't test it, nor i know how to check 
for a 
version that supports the builtin.
From 056a0ec11618bc02ea52eff269dc7bc016dc9897 Mon Sep 17 00:00:00 2001
From: James Almer jamr...@gmail.com
Date: Wed, 25 Feb 2015 14:21:02 -0300
Subject: [PATCH] libavutil: add x86 optimized av_popcount

Signed-off-by: James Almer jamr...@gmail.com
---
 libavutil/intmath.h |  3 +++
 libavutil/x86/intmath.h | 38 ++
 2 files changed, 41 insertions(+)
 create mode 100644 libavutil/x86/intmath.h

diff --git a/libavutil/intmath.h b/libavutil/intmath.h
index fa549c8..f5ecc77 100644
--- a/libavutil/intmath.h
+++ b/libavutil/intmath.h
@@ -29,6 +29,9 @@
 #if ARCH_ARM
 #   include arm/intmath.h
 #endif
+#if ARCH_X86
+#   include x86/intmath.h
+#endif
 
 /**
  * @addtogroup lavu_internal
diff --git a/libavutil/x86/intmath.h b/libavutil/x86/intmath.h
new file mode 100644
index 000..2b71cec
--- /dev/null
+++ b/libavutil/x86/intmath.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2015 James Almer
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVUTIL_X86_INTMATH_H
+#define AVUTIL_X86_INTMATH_H
+
+#include stdint.h
+#include config.h
+
+/* Our generic version of av_popcount is faster than GCC's built-in on 
+ * CPUs that don't support the popcnt instruction.
+ */
+#if defined(__GNUC__)  defined(__POPCNT__)
+#define av_popcount   __builtin_popcount
+#if ARCH_X86_64
+#define av_popcount64 __builtin_popcountll
+#endif
+
+#endif /* defined(__GNUC__)  defined(__POPCNT__) */
+
+#endif /* AVUTIL_X86_INTMATH_H */
-- 
2.3.0

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread James Almer

On 25/02/15 2:36 PM, Ronald S. Bultje wrote:
 Hi James,
 
 On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote:
 
 On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
 Hi,

 On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com wrote:

 +#if HAVE_FAST_POPCNT
 +#if AV_GCC_VERSION_AT_LEAST(4,5)
 +#ifndef av_popcount
 +#define av_popcount   __builtin_popcount
 +#endif /* av_popcount */
 +#if HAVE_FAST_64BIT
 +#ifndef av_popcount64
 +#define av_popcount64 __builtin_popcountll
 +#endif /* av_popcount64 */
 +#endif /* HAVE_FAST_64BIT */
 +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
 +#endif /* HAVE_FAST_POPCNT */


 Is this just to get the sse4 popcnt instruction if we compile with
 -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
 configure still does an arch/cpu check. I'd expect the built-in/compiler
 to
 do that for us based on -mcpu, and we could always unconditionally use
 this
 (as long as gcc = 4.5); alternatively, you could use inline asm and then
 have the configure check (HAVE_FAST_POPCNT). But doing both seems a
 little
 odd. I have no objection to it, patch is still fine, just odd.

 Ronald

 I purposely made the checks for gcc 4.5 and in configure for cpus that
 support popcnt
 because otherwise __builtin_popcount (at least gcc's) is slower than our
 generic
 av_popcount_c function from lavu/common.h.
 When the CPU supports popcnt the builtin becomes a single inlined
 instruction.

 I tried the __asm__ approach, but the code generated by the builtin seemed
 better.
 
 
 That's interesting, can you show the code you tried?
 
 Ronald

If instead of builtins i use

+#if HAVE_INLINE_ASM
+
+#ifdef __POPCNT__
+
+#define av_popcount av_popcount_abm
+static av_always_inline av_const int av_popcount_abm(uint32_t a)
+{
+int x;
+
+__asm__ (popcnt %1, %0 : =r (x) : rm (a));
+return x;
+}
+
+#if ARCH_X86_64
+#define av_popcount64 av_popcount64_abm
+static av_always_inline av_const int av_popcount64_abm(uint64_t a)
+{
+uint64_t x;
+
+__asm__ (popcnt %1, %0 : =r (x) : rm (a));
+return x;
+}
+#endif
+
+#endif /* __POPCNT__ */
+
+#endif /* HAVE_INLINE_ASM */

in the x86/ header from the second version i sent, on x86_32 av_cpu_count from 
libavutil/cpu.o on Windows (Will not work with other OSes because of 
HAVE_GETPROCESSAFFINITYMASK) generates

 1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
 1b5:   31 c0   xoreax,eax
 1b7:   f3 0f b8 c0 popcnt eax,eax
 1bb:   01 c3   addebx,eax

Whereas with the builtins i instead get

 1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]

This is probably because of av_popcount64_c (Used unconditionally on x86_32) 
calling 
av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in 
x86_32.
The builtin seems to know proc_aff can't be right shifted 32 bits so it 
generates 
a single popcnt.

This is probably a rare case since av_popcount64 is not normally called with a 
32bits 
argument, but the builtin handled it better.
If the argument used with av_popcount64 is effectively 64bits then both the 
builtin 
and the above asm seem to generate the same code.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread James Almer

On 25/02/15 12:43 PM, Clément Bœsch wrote:
 On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote:
 Signed-off-by: James Almer jamr...@gmail.com
 ---
 I decided to go the configure route since other features (cmov, clz) also do
 it , but if prefered this could instead be done with a new intmath.h header 
 in the x86/ folder containing something like

 #if defined(__GNUC__)  defined(__POPCNT__)
 #define av_popcount   __builtin_popcount
 #if ARCH_X86_64
 #define av_popcount64 __builtin_popcountll
 #endif
 #endif

 For a cleaner compile time check.

  configure   | 12 ++--
  libavutil/intmath.h | 13 +
  2 files changed, 23 insertions(+), 2 deletions(-)

 
 For the record, the builtin implementation looks like this here:
 
  av_popcount_c:
0:   89 f8   mov%edi,%eax
2:   d1 e8   shr%eax
4:   25 55 55 55 55  and$0x,%eax
9:   29 c7   sub%eax,%edi
b:   89 fa   mov%edi,%edx
d:   c1 ef 02shr$0x2,%edi
   10:   81 e2 33 33 33 33   and$0x,%edx
   16:   81 e7 33 33 33 33   and$0x,%edi
   1c:   8d 04 17lea(%rdi,%rdx,1),%eax
   1f:   89 c2   mov%eax,%edx
   21:   c1 ea 04shr$0x4,%edx
   24:   01 d0   add%edx,%eax
   26:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
   2b:   89 c2   mov%eax,%edx
   2d:   c1 ea 08shr$0x8,%edx
   30:   01 d0   add%edx,%eax
   32:   89 c2   mov%eax,%edx
   34:   c1 ea 10shr$0x10,%edx
   37:   01 d0   add%edx,%eax
   39:   83 e0 3fand$0x3f,%eax
   3c:   c3  retq   
   3d:   0f 1f 00nopl   (%rax)
 
 0040 popcount_gcc:
   40:   48 83 ec 08 sub$0x8,%rsp
   44:   89 ff   mov%edi,%edi
   46:   e8 00 00 00 00  callq  4b popcount_gcc+0xb
   4b:   48 83 c4 08 add$0x8,%rsp
   4f:   c3  retq   

This is the reason i made it only use the builtin if the CPU supports 
popcnt, because the fallback implementation as seen here is slower 
than our version above.

 
 0040 popcount_clang:
   40:   89 f8   mov%edi,%eax
   42:   d1 e8   shr%eax
   44:   25 55 55 55 55  and$0x,%eax
   49:   29 c7   sub%eax,%edi
   4b:   89 f8   mov%edi,%eax
   4d:   25 33 33 33 33  and$0x,%eax
   52:   c1 ef 02shr$0x2,%edi
   55:   81 e7 33 33 33 33   and$0x,%edi
   5b:   01 c7   add%eax,%edi
   5d:   89 f8   mov%edi,%eax
   5f:   c1 e8 04shr$0x4,%eax
   62:   01 f8   add%edi,%eax
   64:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
   69:   69 c0 01 01 01 01   imul   $0x1010101,%eax,%eax
   6f:   c1 e8 18shr$0x18,%eax
   72:   c3  retq   
 
 We might see relevant optimizations for our reference code.
 
 [...]
 
 
 
 ___
 ffmpeg-devel mailing list
 ffmpeg-devel@ffmpeg.org
 http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread James Almer

On 25/02/15 12:43 PM, Clément Bœsch wrote:
 On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote:
 Signed-off-by: James Almer jamr...@gmail.com
 ---
 I decided to go the configure route since other features (cmov, clz) also do
 it , but if prefered this could instead be done with a new intmath.h header 
 in the x86/ folder containing something like

 #if defined(__GNUC__)  defined(__POPCNT__)
 #define av_popcount   __builtin_popcount
 #if ARCH_X86_64
 #define av_popcount64 __builtin_popcountll
 #endif
 #endif

 For a cleaner compile time check.

  configure   | 12 ++--
  libavutil/intmath.h | 13 +
  2 files changed, 23 insertions(+), 2 deletions(-)

 
 For the record, the builtin implementation looks like this here:
 
  av_popcount_c:
0:   89 f8   mov%edi,%eax
2:   d1 e8   shr%eax
4:   25 55 55 55 55  and$0x,%eax
9:   29 c7   sub%eax,%edi
b:   89 fa   mov%edi,%edx
d:   c1 ef 02shr$0x2,%edi
   10:   81 e2 33 33 33 33   and$0x,%edx
   16:   81 e7 33 33 33 33   and$0x,%edi
   1c:   8d 04 17lea(%rdi,%rdx,1),%eax
   1f:   89 c2   mov%eax,%edx
   21:   c1 ea 04shr$0x4,%edx
   24:   01 d0   add%edx,%eax
   26:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
   2b:   89 c2   mov%eax,%edx
   2d:   c1 ea 08shr$0x8,%edx
   30:   01 d0   add%edx,%eax
   32:   89 c2   mov%eax,%edx
   34:   c1 ea 10shr$0x10,%edx
   37:   01 d0   add%edx,%eax
   39:   83 e0 3fand$0x3f,%eax
   3c:   c3  retq   
   3d:   0f 1f 00nopl   (%rax)
 
 0040 popcount_gcc:
   40:   48 83 ec 08 sub$0x8,%rsp
   44:   89 ff   mov%edi,%edi
   46:   e8 00 00 00 00  callq  4b popcount_gcc+0xb
   4b:   48 83 c4 08 add$0x8,%rsp
   4f:   c3  retq   
 
 0040 popcount_clang:
   40:   89 f8   mov%edi,%eax
   42:   d1 e8   shr%eax
   44:   25 55 55 55 55  and$0x,%eax
   49:   29 c7   sub%eax,%edi
   4b:   89 f8   mov%edi,%eax
   4d:   25 33 33 33 33  and$0x,%eax
   52:   c1 ef 02shr$0x2,%edi
   55:   81 e7 33 33 33 33   and$0x,%edi
   5b:   01 c7   add%eax,%edi
   5d:   89 f8   mov%edi,%eax
   5f:   c1 e8 04shr$0x4,%eax
   62:   01 f8   add%edi,%eax
   64:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
   69:   69 c0 01 01 01 01   imul   $0x1010101,%eax,%eax
   6f:   c1 e8 18shr$0x18,%eax
   72:   c3  retq   
 
 We might see relevant optimizations for our reference code.

What's clang code for av_popcount64_c, or their builtin?
We're currently calling av_popcount_c twice from within av_popcount64_c, 
when on x86_64 cpus we could probably take advantage of the 64bits gprs.

 
 [...]
 
 
 
 ___
 ffmpeg-devel mailing list
 ffmpeg-devel@ffmpeg.org
 http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Ronald S. Bultje

Hi,

On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote:

 On 25/02/15 2:36 PM, Ronald S. Bultje wrote:
  Hi James,
 
  On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote:
 
  On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
  Hi,
 
  On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com
 wrote:
 
  +#if HAVE_FAST_POPCNT
  +#if AV_GCC_VERSION_AT_LEAST(4,5)
  +#ifndef av_popcount
  +#define av_popcount   __builtin_popcount
  +#endif /* av_popcount */
  +#if HAVE_FAST_64BIT
  +#ifndef av_popcount64
  +#define av_popcount64 __builtin_popcountll
  +#endif /* av_popcount64 */
  +#endif /* HAVE_FAST_64BIT */
  +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
  +#endif /* HAVE_FAST_POPCNT */
 
 
  Is this just to get the sse4 popcnt instruction if we compile with
  -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
  configure still does an arch/cpu check. I'd expect the
 built-in/compiler
  to
  do that for us based on -mcpu, and we could always unconditionally use
  this
  (as long as gcc = 4.5); alternatively, you could use inline asm and
 then
  have the configure check (HAVE_FAST_POPCNT). But doing both seems a
  little
  odd. I have no objection to it, patch is still fine, just odd.
 
  Ronald
 
  I purposely made the checks for gcc 4.5 and in configure for cpus that
  support popcnt
  because otherwise __builtin_popcount (at least gcc's) is slower than our
  generic
  av_popcount_c function from lavu/common.h.
  When the CPU supports popcnt the builtin becomes a single inlined
  instruction.
 
  I tried the __asm__ approach, but the code generated by the builtin
 seemed
  better.
 
 
  That's interesting, can you show the code you tried?
 
  Ronald

 If instead of builtins i use

 +#if HAVE_INLINE_ASM
 +
 +#ifdef __POPCNT__
 +
 +#define av_popcount av_popcount_abm
 +static av_always_inline av_const int av_popcount_abm(uint32_t a)
 +{
 +int x;
 +
 +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
 +return x;
 +}
 +
 +#if ARCH_X86_64
 +#define av_popcount64 av_popcount64_abm
 +static av_always_inline av_const int av_popcount64_abm(uint64_t a)
 +{
 +uint64_t x;
 +
 +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
 +return x;
 +}
 +#endif
 +
 +#endif /* __POPCNT__ */
 +
 +#endif /* HAVE_INLINE_ASM */

 in the x86/ header from the second version i sent, on x86_32 av_cpu_count
 from
 libavutil/cpu.o on Windows (Will not work with other OSes because of
 HAVE_GETPROCESSAFFINITYMASK) generates

  1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
  1b5:   31 c0   xoreax,eax
  1b7:   f3 0f b8 c0 popcnt eax,eax
  1bb:   01 c3   addebx,eax

 Whereas with the builtins i instead get

  1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]

 This is probably because of av_popcount64_c (Used unconditionally on
 x86_32) calling
 av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in
 x86_32.
 The builtin seems to know proc_aff can't be right shifted 32 bits so it
 generates
 a single popcnt.


It seems to me rather than the second version is defined as pure (or
whatever the keyword was), i.e. we don't depend on external state - if
you call this with a constant, we can calculate this at compile-time.

Ronald
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] avfilter/palettegen: use AV_QSORT()

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 04:10:44PM +0100, Clément Bœsch wrote:
 From: Clément Bœsch clem...@stupeflix.com
 
 This makes the sorting of the colors along an axis (r, g or b)
 predictible, and thus testable under FATE. The performance is not really
 an issue here since the function is called only once at the end and will
 need to sort very small number of entries, so an alternative would be to
 make the sorting functions (see DECLARE_CMP_FUNC()) fallback on another
 axis in case of equality. This approach was actually simpler.
 
 I don't know if there is any advantage in using a multidimensional sort,
 but it will affect the final palette one way or another.
 ---
  libavfilter/vf_palettegen.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

LGTM 

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Into a blind darkness they enter who follow after the Ignorance,
they as if into a greater darkness enter who devote themselves
to the Knowledge alone. -- Isha Upanishad


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Ronald S. Bultje

Hi,

On Wed, Feb 25, 2015 at 1:21 PM, James Almer jamr...@gmail.com wrote:

 On 25/02/15 2:58 PM, Ronald S. Bultje wrote:
  Hi,
 
  On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote:
 
  On 25/02/15 2:36 PM, Ronald S. Bultje wrote:
  Hi James,
 
  On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com
 wrote:
 
  On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
  Hi,
 
  On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com
  wrote:
 
  +#if HAVE_FAST_POPCNT
  +#if AV_GCC_VERSION_AT_LEAST(4,5)
  +#ifndef av_popcount
  +#define av_popcount   __builtin_popcount
  +#endif /* av_popcount */
  +#if HAVE_FAST_64BIT
  +#ifndef av_popcount64
  +#define av_popcount64 __builtin_popcountll
  +#endif /* av_popcount64 */
  +#endif /* HAVE_FAST_64BIT */
  +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
  +#endif /* HAVE_FAST_POPCNT */
 
 
  Is this just to get the sse4 popcnt instruction if we compile with
  -mcpu=sse4? The slightly odd thing is that we're using a built-in,
 yet
  configure still does an arch/cpu check. I'd expect the
  built-in/compiler
  to
  do that for us based on -mcpu, and we could always unconditionally
 use
  this
  (as long as gcc = 4.5); alternatively, you could use inline asm and
  then
  have the configure check (HAVE_FAST_POPCNT). But doing both seems a
  little
  odd. I have no objection to it, patch is still fine, just odd.
 
  Ronald
 
  I purposely made the checks for gcc 4.5 and in configure for cpus that
  support popcnt
  because otherwise __builtin_popcount (at least gcc's) is slower than
 our
  generic
  av_popcount_c function from lavu/common.h.
  When the CPU supports popcnt the builtin becomes a single inlined
  instruction.
 
  I tried the __asm__ approach, but the code generated by the builtin
  seemed
  better.
 
 
  That's interesting, can you show the code you tried?
 
  Ronald
 
  If instead of builtins i use
 
  +#if HAVE_INLINE_ASM
  +
  +#ifdef __POPCNT__
  +
  +#define av_popcount av_popcount_abm
  +static av_always_inline av_const int av_popcount_abm(uint32_t a)
  +{
  +int x;
  +
  +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
  +return x;
  +}
  +
  +#if ARCH_X86_64
  +#define av_popcount64 av_popcount64_abm
  +static av_always_inline av_const int av_popcount64_abm(uint64_t a)
  +{
  +uint64_t x;
  +
  +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
  +return x;
  +}
  +#endif
  +
  +#endif /* __POPCNT__ */
  +
  +#endif /* HAVE_INLINE_ASM */
 
  in the x86/ header from the second version i sent, on x86_32
 av_cpu_count
  from
  libavutil/cpu.o on Windows (Will not work with other OSes because of
  HAVE_GETPROCESSAFFINITYMASK) generates
 
   1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
   1b5:   31 c0   xoreax,eax
   1b7:   f3 0f b8 c0 popcnt eax,eax
   1bb:   01 c3   addebx,eax
 
  Whereas with the builtins i instead get
 
   1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
 
  This is probably because of av_popcount64_c (Used unconditionally on
  x86_32) calling
  av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes
 in
  x86_32.
  The builtin seems to know proc_aff can't be right shifted 32 bits so it
  generates
  a single popcnt.
 
 
  It seems to me rather than the second version is defined as pure (or
  whatever the keyword was), i.e. we don't depend on external state -
 if
  you call this with a constant, we can calculate this at compile-time.
 
  Ronald

 Do you know how to get the asm version working right for this one case?


No, const should do it but looks like inline asm breaks that assumption.


 Otherwise i think the second version with builtins i sent is good to
 apply. Much
 cleaner than the configure approach to be honest.


OK.

Ronald
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread James Almer

On 25/02/15 2:58 PM, Ronald S. Bultje wrote:
 Hi,
 
 On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote:
 
 On 25/02/15 2:36 PM, Ronald S. Bultje wrote:
 Hi James,

 On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com wrote:

 On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
 Hi,

 On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com
 wrote:

 +#if HAVE_FAST_POPCNT
 +#if AV_GCC_VERSION_AT_LEAST(4,5)
 +#ifndef av_popcount
 +#define av_popcount   __builtin_popcount
 +#endif /* av_popcount */
 +#if HAVE_FAST_64BIT
 +#ifndef av_popcount64
 +#define av_popcount64 __builtin_popcountll
 +#endif /* av_popcount64 */
 +#endif /* HAVE_FAST_64BIT */
 +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
 +#endif /* HAVE_FAST_POPCNT */


 Is this just to get the sse4 popcnt instruction if we compile with
 -mcpu=sse4? The slightly odd thing is that we're using a built-in, yet
 configure still does an arch/cpu check. I'd expect the
 built-in/compiler
 to
 do that for us based on -mcpu, and we could always unconditionally use
 this
 (as long as gcc = 4.5); alternatively, you could use inline asm and
 then
 have the configure check (HAVE_FAST_POPCNT). But doing both seems a
 little
 odd. I have no objection to it, patch is still fine, just odd.

 Ronald

 I purposely made the checks for gcc 4.5 and in configure for cpus that
 support popcnt
 because otherwise __builtin_popcount (at least gcc's) is slower than our
 generic
 av_popcount_c function from lavu/common.h.
 When the CPU supports popcnt the builtin becomes a single inlined
 instruction.

 I tried the __asm__ approach, but the code generated by the builtin
 seemed
 better.


 That's interesting, can you show the code you tried?

 Ronald

 If instead of builtins i use

 +#if HAVE_INLINE_ASM
 +
 +#ifdef __POPCNT__
 +
 +#define av_popcount av_popcount_abm
 +static av_always_inline av_const int av_popcount_abm(uint32_t a)
 +{
 +int x;
 +
 +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
 +return x;
 +}
 +
 +#if ARCH_X86_64
 +#define av_popcount64 av_popcount64_abm
 +static av_always_inline av_const int av_popcount64_abm(uint64_t a)
 +{
 +uint64_t x;
 +
 +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
 +return x;
 +}
 +#endif
 +
 +#endif /* __POPCNT__ */
 +
 +#endif /* HAVE_INLINE_ASM */

 in the x86/ header from the second version i sent, on x86_32 av_cpu_count
 from
 libavutil/cpu.o on Windows (Will not work with other OSes because of
 HAVE_GETPROCESSAFFINITYMASK) generates

  1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
  1b5:   31 c0   xoreax,eax
  1b7:   f3 0f b8 c0 popcnt eax,eax
  1bb:   01 c3   addebx,eax

 Whereas with the builtins i instead get

  1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]

 This is probably because of av_popcount64_c (Used unconditionally on
 x86_32) calling
 av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes in
 x86_32.
 The builtin seems to know proc_aff can't be right shifted 32 bits so it
 generates
 a single popcnt.
 
 
 It seems to me rather than the second version is defined as pure (or
 whatever the keyword was), i.e. we don't depend on external state - if
 you call this with a constant, we can calculate this at compile-time.
 
 Ronald

Do you know how to get the asm version working right for this one case? 
Otherwise i think the second version with builtins i sent is good to apply. Much
cleaner than the configure approach to be honest.
Someone else can change it to add the relevant clang check/s because i can't 
test 
that.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 12:01:26PM +0100, Nicolas George wrote:
 Signed-off-by: Nicolas George geo...@nsup.org
 ---
  doc/ffmpeg.texi | 6 ++
  ffmpeg.c| 2 +-
  ffmpeg.h| 2 ++
  ffmpeg_opt.c| 4 
  4 files changed, 13 insertions(+), 1 deletion(-)

without pthreads, this seems to fail to build

ffmpeg/ffmpeg_opt.c: In function ‘open_input_file’:
ffmpeg/ffmpeg_opt.c:954:6: error: ‘InputFile’ has no member named 
‘thread_queue_size’
make: *** [ffmpeg_opt.o] Error 1

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.

2015-02-25 Thread Nicolas George

Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit :
 without pthreads, this seems to fail to build

Forgot to test that, sorry. Fixed.

By the way, what are the remaining supported cases where threads are not
available?

Regards,

-- 
  Nicolas George
From 16fc0adda7fa35b69053205abf9dd61a3ebe4b03 Mon Sep 17 00:00:00 2001
From: Nicolas George geo...@nsup.org
Date: Wed, 25 Feb 2015 11:50:44 +0100
Subject: [PATCH 2/3] ffmpeg: allow to set the thread message queue size.

Signed-off-by: Nicolas George geo...@nsup.org
---
 doc/ffmpeg.texi | 6 ++
 ffmpeg.c| 2 +-
 ffmpeg.h| 2 ++
 ffmpeg_opt.c| 6 ++
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/doc/ffmpeg.texi b/doc/ffmpeg.texi
index 6e51c44..6772f2f 100644
--- a/doc/ffmpeg.texi
+++ b/doc/ffmpeg.texi
@@ -1158,6 +1158,12 @@ This option enables or disables accurate seeking in input files with the
 transcoding. Use @option{-noaccurate_seek} to disable it, which may be useful
 e.g. when copying some streams and transcoding the others.
 
+@item -thread_message_queue @var{size} (@emph{input})
+This option sets the maximum number of queued packets when reading from the
+file or device. With low latency / high rate live streams, packets may be
+discarded if they are not read in a timely manner; raising this value can
+avoid it.
+
 @item -override_ffserver (@emph{global})
 Overrides the input specifications from @command{ffserver}. Using this
 option you can map any input stream to @command{ffserver} and control
diff --git a/ffmpeg.c b/ffmpeg.c
index becd5df..54ba0cb 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -3420,7 +3420,7 @@ static int init_input_threads(void)
 strcmp(f-ctx-iformat-name, lavfi))
 f-non_blocking = 1;
 ret = av_thread_message_queue_alloc(f-in_thread_queue,
-8, sizeof(AVPacket));
+f-thread_queue_size, sizeof(AVPacket));
 if (ret  0)
 return ret;
 
diff --git a/ffmpeg.h b/ffmpeg.h
index d2e0c5c..7c0c22c 100644
--- a/ffmpeg.h
+++ b/ffmpeg.h
@@ -111,6 +111,7 @@ typedef struct OptionsContext {
 int64_t input_ts_offset;
 int rate_emu;
 int accurate_seek;
+int thread_queue_size;
 
 SpecifierOpt *ts_scale;
 intnb_ts_scale;
@@ -350,6 +351,7 @@ typedef struct InputFile {
 pthread_t thread;   /* thread reading from this file */
 int non_blocking;   /* reading packets from the thread should not block */
 int joined; /* the thread has been joined */
+int thread_queue_size;  /* maximum number of queued packets */
 #endif
 } InputFile;
 
diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c
index 09e6e33..39c5f49 100644
--- a/ffmpeg_opt.c
+++ b/ffmpeg_opt.c
@@ -951,6 +951,9 @@ static int open_input_file(OptionsContext *o, const char *filename)
 f-nb_streams = ic-nb_streams;
 f-rate_emu   = o-rate_emu;
 f-accurate_seek = o-accurate_seek;
+#if HAVE_PTHREADS
+f-thread_queue_size = o-thread_queue_size  0 ? o-thread_queue_size : 8;
+#endif
 
 /* check if all codec options have been used */
 unused_opts = strip_specifiers(o-g-codec_opts);
@@ -2957,6 +2960,9 @@ const OptionDef options[] = {
 { disposition,OPT_STRING | HAS_ARG | OPT_SPEC |
 OPT_OUTPUT,  { .off = OFFSET(disposition) },
 disposition,  },
+{ thread_queue_size, HAS_ARG | OPT_INT | OPT_OFFSET | OPT_EXPERT | OPT_INPUT,
+ { .off = OFFSET(thread_queue_size) },
+set the maximum number of queued packets from the demuxer },
 
 /* video options */
 { vframes,  OPT_VIDEO | HAS_ARG  | OPT_PERFILE | OPT_OUTPUT,   { .func_arg = opt_video_frames },
-- 
2.1.4



signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Reimar Döffinger

On 25.02.2015, at 19:29, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,
 
 On Wed, Feb 25, 2015 at 1:21 PM, James Almer jamr...@gmail.com wrote:
 
 On 25/02/15 2:58 PM, Ronald S. Bultje wrote:
 Hi,
 
 On Wed, Feb 25, 2015 at 12:52 PM, James Almer jamr...@gmail.com wrote:
 
 On 25/02/15 2:36 PM, Ronald S. Bultje wrote:
 Hi James,
 
 On Wed, Feb 25, 2015 at 12:25 PM, James Almer jamr...@gmail.com
 wrote:
 
 On 25/02/15 9:41 AM, Ronald S. Bultje wrote:
 Hi,
 
 On Tue, Feb 24, 2015 at 8:05 PM, James Almer jamr...@gmail.com
 wrote:
 
 +#if HAVE_FAST_POPCNT
 +#if AV_GCC_VERSION_AT_LEAST(4,5)
 +#ifndef av_popcount
 +#define av_popcount   __builtin_popcount
 +#endif /* av_popcount */
 +#if HAVE_FAST_64BIT
 +#ifndef av_popcount64
 +#define av_popcount64 __builtin_popcountll
 +#endif /* av_popcount64 */
 +#endif /* HAVE_FAST_64BIT */
 +#endif /* AV_GCC_VERSION_AT_LEAST(4,5) */
 +#endif /* HAVE_FAST_POPCNT */
 
 
 Is this just to get the sse4 popcnt instruction if we compile with
 -mcpu=sse4? The slightly odd thing is that we're using a built-in,
 yet
 configure still does an arch/cpu check. I'd expect the
 built-in/compiler
 to
 do that for us based on -mcpu, and we could always unconditionally
 use
 this
 (as long as gcc = 4.5); alternatively, you could use inline asm and
 then
 have the configure check (HAVE_FAST_POPCNT). But doing both seems a
 little
 odd. I have no objection to it, patch is still fine, just odd.
 
 Ronald
 
 I purposely made the checks for gcc 4.5 and in configure for cpus that
 support popcnt
 because otherwise __builtin_popcount (at least gcc's) is slower than
 our
 generic
 av_popcount_c function from lavu/common.h.
 When the CPU supports popcnt the builtin becomes a single inlined
 instruction.
 
 I tried the __asm__ approach, but the code generated by the builtin
 seemed
 better.
 
 
 That's interesting, can you show the code you tried?
 
 Ronald
 
 If instead of builtins i use
 
 +#if HAVE_INLINE_ASM
 +
 +#ifdef __POPCNT__
 +
 +#define av_popcount av_popcount_abm
 +static av_always_inline av_const int av_popcount_abm(uint32_t a)
 +{
 +int x;
 +
 +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
 +return x;
 +}
 +
 +#if ARCH_X86_64
 +#define av_popcount64 av_popcount64_abm
 +static av_always_inline av_const int av_popcount64_abm(uint64_t a)
 +{
 +uint64_t x;
 +
 +__asm__ (popcnt %1, %0 : =r (x) : rm (a));
 +return x;
 +}
 +#endif
 +
 +#endif /* __POPCNT__ */
 +
 +#endif /* HAVE_INLINE_ASM */
 
 in the x86/ header from the second version i sent, on x86_32
 av_cpu_count
 from
 libavutil/cpu.o on Windows (Will not work with other OSes because of
 HAVE_GETPROCESSAFFINITYMASK) generates
 
 1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
 1b5:   31 c0   xoreax,eax
 1b7:   f3 0f b8 c0 popcnt eax,eax
 1bb:   01 c3   addebx,eax
 
 Whereas with the builtins i instead get
 
 1af:   f3 0f b8 5c 24 18   popcnt ebx,DWORD PTR [esp+0x18]
 
 This is probably because of av_popcount64_c (Used unconditionally on
 x86_32) calling
 av_popcount twice, and proc_aff being DWORD_PTR type means it's 4 bytes
 in
 x86_32.
 The builtin seems to know proc_aff can't be right shifted 32 bits so it
 generates
 a single popcnt.
 
 
 It seems to me rather than the second version is defined as pure (or
 whatever the keyword was), i.e. we don't depend on external state -
 if
 you call this with a constant, we can calculate this at compile-time.
 
 Ronald
 
 Do you know how to get the asm version working right for this one case?
 
 
 No, const should do it but looks like inline asm breaks that assumption.

There are gcc builtins to check if something is a constant and use e.g. the c 
version then.
Not sure if it works here or makes sense, just in principle.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 08:00:03PM +0100, Nicolas George wrote:
 Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit :
  without pthreads, this seems to fail to build
 
 Forgot to test that, sorry. Fixed.
 
 By the way, what are the remaining supported cases where threads are not
 available?

i think the case i tested had HAVE_W32THREADS set
also theres HAVE_OS2THREADS

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.

2015-02-25 Thread wm4

On Wed, 25 Feb 2015 22:24:39 +0100
Michael Niedermayer michae...@gmx.at wrote:

 On Wed, Feb 25, 2015 at 08:00:03PM +0100, Nicolas George wrote:
  Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit :
   without pthreads, this seems to fail to build
  
  Forgot to test that, sorry. Fixed.
  
  By the way, what are the remaining supported cases where threads are not
  available?
 
 i think the case i tested had HAVE_W32THREADS set
 also theres HAVE_OS2THREADS
 
 [...]
 

Surely OS/2 support is worth of forcing every single developer who
touches thread stuff to consider a dead platform that's only used by 2
persons in Korea (?).
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread Clément Bœsch

On Wed, Feb 25, 2015 at 02:29:11PM -0300, James Almer wrote:
 On 25/02/15 12:43 PM, Clément Bœsch wrote:
  On Tue, Feb 24, 2015 at 10:05:24PM -0300, James Almer wrote:
  Signed-off-by: James Almer jamr...@gmail.com
  ---
  I decided to go the configure route since other features (cmov, clz) also 
  do
  it , but if prefered this could instead be done with a new intmath.h 
  header 
  in the x86/ folder containing something like
 
  #if defined(__GNUC__)  defined(__POPCNT__)
  #define av_popcount   __builtin_popcount
  #if ARCH_X86_64
  #define av_popcount64 __builtin_popcountll
  #endif
  #endif
 
  For a cleaner compile time check.
 
   configure   | 12 ++--
   libavutil/intmath.h | 13 +
   2 files changed, 23 insertions(+), 2 deletions(-)
 
  
  For the record, the builtin implementation looks like this here:
  
   av_popcount_c:
 0:   89 f8   mov%edi,%eax
 2:   d1 e8   shr%eax
 4:   25 55 55 55 55  and$0x,%eax
 9:   29 c7   sub%eax,%edi
 b:   89 fa   mov%edi,%edx
 d:   c1 ef 02shr$0x2,%edi
10:   81 e2 33 33 33 33   and$0x,%edx
16:   81 e7 33 33 33 33   and$0x,%edi
1c:   8d 04 17lea(%rdi,%rdx,1),%eax
1f:   89 c2   mov%eax,%edx
21:   c1 ea 04shr$0x4,%edx
24:   01 d0   add%edx,%eax
26:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
2b:   89 c2   mov%eax,%edx
2d:   c1 ea 08shr$0x8,%edx
30:   01 d0   add%edx,%eax
32:   89 c2   mov%eax,%edx
34:   c1 ea 10shr$0x10,%edx
37:   01 d0   add%edx,%eax
39:   83 e0 3fand$0x3f,%eax
3c:   c3  retq   
3d:   0f 1f 00nopl   (%rax)
  
  0040 popcount_gcc:
40:   48 83 ec 08 sub$0x8,%rsp
44:   89 ff   mov%edi,%edi
46:   e8 00 00 00 00  callq  4b popcount_gcc+0xb
4b:   48 83 c4 08 add$0x8,%rsp
4f:   c3  retq   
  
  0040 popcount_clang:
40:   89 f8   mov%edi,%eax
42:   d1 e8   shr%eax
44:   25 55 55 55 55  and$0x,%eax
49:   29 c7   sub%eax,%edi
4b:   89 f8   mov%edi,%eax
4d:   25 33 33 33 33  and$0x,%eax
52:   c1 ef 02shr$0x2,%edi
55:   81 e7 33 33 33 33   and$0x,%edi
5b:   01 c7   add%eax,%edi
5d:   89 f8   mov%edi,%eax
5f:   c1 e8 04shr$0x4,%eax
62:   01 f8   add%edi,%eax
64:   25 0f 0f 0f 0f  and$0xf0f0f0f,%eax
69:   69 c0 01 01 01 01   imul   $0x1010101,%eax,%eax
6f:   c1 e8 18shr$0x18,%eax
72:   c3  retq   
  
  We might see relevant optimizations for our reference code.
 
 What's clang code for av_popcount64_c, or their builtin?

 popcount64_clang:
   0:   48 89 f8movrax,rdi
   3:   48 d1 e8shrrax,1
   6:   48 b9 55 55 55 55 55movabs rcx,0x
   d:   55 55 55 
  10:   48 21 c1andrcx,rax
  13:   48 29 cfsubrdi,rcx
  16:   48 b8 33 33 33 33 33movabs rax,0x
  1d:   33 33 33 
  20:   48 89 f9movrcx,rdi
  23:   48 21 c1andrcx,rax
  26:   48 c1 ef 02 shrrdi,0x2
  2a:   48 21 c7andrdi,rax
  2d:   48 01 cfaddrdi,rcx
  30:   48 89 f8movrax,rdi
  33:   48 c1 e8 04 shrrax,0x4
  37:   48 01 f8addrax,rdi
  3a:   48 b9 0f 0f 0f 0f 0fmovabs rcx,0xf0f0f0f0f0f0f0f
  41:   0f 0f 0f 
  44:   48 21 c1andrcx,rax
  47:   48 b8 01 01 01 01 01movabs rax,0x101010101010101
  4e:   01 01 01 
  51:   48 0f af c1 imul   rax,rcx
  55:   48 c1 e8 38 shrrax,0x38
  59:   c3  ret
  5a:   66 0f 1f 44 00 00   nopWORD PTR [rax+rax*1+0x0]

0060 av_popcount64_c:
  60:   89 f8   moveax,edi
  62:   d1 e8   shreax,1
  64:   25 55 55 55 55  andeax,0x
  69:   89 f9   movecx,edi
  6b:   29 c1   subecx,eax
  6d:   89 c8   moveax,ecx
  6f:   25 33 33 33 33  andeax,0x
  74:   c1 e9 02shrecx,0x2
  77:   81 e1 33 33 33 33   andecx,0x
  7d:   01 c1

Re: [FFmpeg-devel] [PATCH] avcodec/utils: Allocate dummy codec_frame for video encoders which do not allocate one

2015-02-25 Thread Michael Niedermayer

On Mon, Feb 23, 2015 at 07:51:30PM +0100, wm4 wrote:
 On Mon, 23 Feb 2015 19:11:58 +0100
 Michael Niedermayer michae...@gmx.at wrote:
 
  On Mon, Feb 23, 2015 at 06:49:36PM +0100, Michael Niedermayer wrote:
   On Mon, Feb 23, 2015 at 06:15:15PM +0100, wm4 wrote:
On Mon, 23 Feb 2015 17:19:16 +0100
Michael Niedermayer michae...@gmx.at wrote:

 This should allow simplifying many encoders
 
 Signed-off-by: Michael Niedermayer michae...@gmx.at
 ---
  libavcodec/utils.c |   12 
  1 file changed, 12 insertions(+)
 
 diff --git a/libavcodec/utils.c b/libavcodec/utils.c
 index c5e6300..c6d588a 100644
 --- a/libavcodec/utils.c
 +++ b/libavcodec/utils.c
 @@ -1695,6 +1695,16 @@ int attribute_align_arg 
 avcodec_open2(AVCodecContext *avctx, const AVCodec *code
  av_assert0(*(const AVClass **)avctx-priv_data == 
 codec-priv_class);
  }
  
 +if (av_codec_is_encoder(avctx-codec) 
 +avctx-codec_type == AVMEDIA_TYPE_VIDEO 
 +!avctx-coded_frame) {
 +avctx-coded_frame = av_frame_alloc();
 +if (avctx-coded_frame) {
 +av_assert0(avctx-coded_frame-key_frame);
 +avctx-coded_frame-pict_type = AV_PICTURE_TYPE_I;
 +}
 +}
 +
  end:
  ff_unlock_avcodec();
  if (options) {
 @@ -2852,6 +2862,8 @@ av_cold int avcodec_close(AVCodecContext *avctx)
  av_freep(avctx-internal-hwaccel_priv_data);
  
  av_freep(avctx-internal);
 +if (av_codec_is_encoder(avctx-codec))
 +av_frame_free(avctx-coded_frame);

This isn't the same condition. You might free the frame, even if you
didn't allocate it.
   
   fixed locally by seting a internal variable so the condition is
   also not duplicated
   
   

What's the value in allocating the frame anyway? Seems like it
increases the memory management pain levels by 200%.
   
   if its good or not is entirely for the lib uses to decide, nothing
   in FFmpeg should need it
   i just would like to avoid having this allocation duplicated in half
   the encoders
  
  so IMHO encoders which do not export anything in their coded_frame
  should either be able to leave it NULL or have it handled by some
  generic code that does not need to be duplicated in each encoder
 
 Can you explain more? Why do some encoders need it, and some not?

coded_frame is used to export information from encoders, encoders
should generally not need it themselfs

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Everything should be made as simple as possible, but not simpler.
-- Albert Einstein


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Socket options for librtmp

2015-02-25 Thread Michael Niedermayer

On Mon, Feb 23, 2015 at 10:28:29PM -0600, Cary Tetrick wrote:
 
  please ping this thread when this happens
 
  about the patch itself
 
  IIUC a user using old rtmpdump with new ffmpeg would loose the
  rtmp_buffer_size option.
  Would it make sense to support the old setsockopt() code for this
  case ?
 
  [...]
 
 
 
 That's a really good point. So the only real question then is whether to
 keep it the way it is, or to submit another patch to support the other two
 options (which are not as important IMO) in the same manner as Brian's
 patch.

thats simply a question of if someone wants to implement and maintain
it


 Either way, it would eliminate the dependency on rtmpdump.
 What do you think?
 
 Thanks,
 Cary
 ___
 ffmpeg-devel mailing list
 ffmpeg-devel@ffmpeg.org
 http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
 

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] avformat/adxdec: check avctx-channels for invalid values

2015-02-25 Thread Andreas Cadhalpun


Hi,

if avctx-channels is 0 in adx_read_packet, size gets set to 0, av_get_packet 
sets pkt-data to NULL and then AV_RB16(pkt-data) results in a null pointer 
dereference.


Attached patch fixes this.

Best regards,
Andreas
From 2578976a0a9eec03d168f393795119fd274ee81f Mon Sep 17 00:00:00 2001
From: Andreas Cadhalpun andreas.cadhal...@googlemail.com
Date: Wed, 25 Feb 2015 22:55:44 +0100
Subject: [PATCH] avformat/adxdec: check avctx-channels for invalid values

This avoids a null pointer dereference of pkt-data.

Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com
---
 libavformat/adxdec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c
index ddaa201..24a8a1f 100644
--- a/libavformat/adxdec.c
+++ b/libavformat/adxdec.c
@@ -40,6 +40,11 @@ static int adx_read_packet(AVFormatContext *s, AVPacket *pkt)
 AVCodecContext *avctx = s-streams[0]-codec;
 int ret, size;
 
+if (avctx-channels = 0) {
+av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, avctx-channels);
+return AVERROR_INVALIDDATA;
+}
+
 size = BLOCK_SIZE * avctx-channels;
 
 pkt-pos = avio_tell(s-pb);
-- 
2.1.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] avformat/adxdec: check avctx-channels for invalid values

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 11:48:33PM +0100, Andreas Cadhalpun wrote:
 Hi,
 
 if avctx-channels is 0 in adx_read_packet, size gets set to 0,
 av_get_packet sets pkt-data to NULL and then AV_RB16(pkt-data)
 results in a null pointer dereference.
 
 Attached patch fixes this.
 
 Best regards,
 Andreas

  adxdec.c |5 +
  1 file changed, 5 insertions(+)
 7312e6a3be1771c83eac72784496c6fc4692d954  
 0001-avformat-adxdec-check-avctx-channels-for-invalid-val.patch
 From 2578976a0a9eec03d168f393795119fd274ee81f Mon Sep 17 00:00:00 2001
 From: Andreas Cadhalpun andreas.cadhal...@googlemail.com
 Date: Wed, 25 Feb 2015 22:55:44 +0100
 Subject: [PATCH] avformat/adxdec: check avctx-channels for invalid values
 
 This avoids a null pointer dereference of pkt-data.
 
 Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com
 ---
  libavformat/adxdec.c | 5 +
  1 file changed, 5 insertions(+)
 
 diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c
 index ddaa201..24a8a1f 100644
 --- a/libavformat/adxdec.c
 +++ b/libavformat/adxdec.c
 @@ -40,6 +40,11 @@ static int adx_read_packet(AVFormatContext *s, AVPacket 
 *pkt)
  AVCodecContext *avctx = s-streams[0]-codec;
  int ret, size;
  
 +if (avctx-channels = 0) {
 +av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, 
 avctx-channels);
 +return AVERROR_INVALIDDATA;
 +}

the demuxer should extract the channel value in adx_read_header()
and check it there. (if it needs the channels, which it does currently)

its not good for demuxing to depend on a decoder/parser setting this
value between reading the file header and before demuxing the first
packet

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I know you won't believe me, but the highest form of Human Excellence is
to question oneself and others. -- Socrates


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH][RFC] Interpretation of duration field in AVI super index chunk

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 05:35:33PM +0100, Tobias Rapp wrote:
 On 24.02.2015 16:33, Tobias Rapp wrote:
 Hi,
 
 I am currently trying to interpret the index data of HuffYuv/PCM AVI
 files written by FFmpeg. If the file is larger than 2GiB an AVIX RIFF
 chunk is added for each 2GiB-block, each of these blocks gets an OpenDML
 chunk index ix00 and an OpenDML super index chunk indx is written
 for each stream that contains the file offset of the according chunk
 indexes together with its size and total duration.
 
 For this super index duration entry the documentation I have found
 states that it shall be the time span in stream ticks [1]. FFmpeg
 seems to write the chunk count instead which is correct for video
 streams where one data chunk 00dc corresponds to one frame. For audio
 streams one data chunk 01wb usually contains multiple samples.
 
 I guess for audio streams the duration should be multiplied by the
 samples-per-chunk factor in avienc.c to allow third-party applications
 to jump to the correct AVIX RIFF segment based on a given timestamp.
 
 Any comments?
 
 To illustrate my issue I have attached a patch that makes the
 OpenDML super index chunk consistent with my interpretation of the
 specs. A test with VirtualDub 1.10.4 shows that it also writes the
 audio duration using sample count instead of packet count.
 
 Using sample count allows to seek the audio stream to the right RIFF
 segment without reading through all OpenDML standard ix01 chunks.
 
 Please comment,
 Tobias

  avienc.c |9 -
  1 file changed, 8 insertions(+), 1 deletion(-)
 09f8c8250ce5ec1bdad79a1bf280028c9d3af376  
 0001-libavformat-avienc-Fix-duration-for-audio-stream-Ope.patch
 From 3a8e15598fa3e044f3ef65b5063c633cb4b3afed Mon Sep 17 00:00:00 2001
 From: Tobias Rapp t.r...@noa-audio.com
 Date: Wed, 25 Feb 2015 17:10:13 +0100
 Subject: [PATCH] libavformat/avienc: Fix duration for audio stream OpenDML
  super index
 
 Fixes the duration field of the OpenDML super index indx chunk to
 contain the number of samples instead of the number of packets for
 (linear/PCM) audio streams.
 
 This matches the OpenDML V1.02 standard text which states that the
 duration field shall contain time span in stream ticks.
 ---
  libavformat/avienc.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)
 
 diff --git a/libavformat/avienc.c b/libavformat/avienc.c
 index 53c2fe7..b77b295 100644
 --- a/libavformat/avienc.c
 +++ b/libavformat/avienc.c
 @@ -49,6 +49,7 @@ typedef struct AVIIentry {
  
  typedef struct AVIIndex {
  int64_t indx_start;
 +int64_t audio_strm_offset;
  int entry;
  int ents_allocated;
  AVIIentry** cluster;
 @@ -91,6 +92,7 @@ static int64_t avi_start_new_riff(AVFormatContext *s, 
 AVIOContext *pb,
  avi-riff_id++;
  for (i = 0; i  s-nb_streams; i++) {
  AVIStream *avist = s-streams[i]-priv_data;
 +avist-indexes.audio_strm_offset = avist-audio_strm_length;
  avist-indexes.entry = 0;
  }
  
 @@ -476,6 +478,7 @@ static int avi_write_ix(AVFormatContext *s)
  for (i = 0; i  s-nb_streams; i++) {
  AVIStream *avist = s-streams[i]-priv_data;
  int64_t ix, pos;
 +int au_byterate, au_ssize, au_scale;
  
  avi_stream2fourcc(tag, i, s-streams[i]-codec-codec_type);
  ix_tag[3] = '0' + i;
 @@ -511,7 +514,11 @@ static int avi_write_ix(AVFormatContext *s)
  avio_skip(pb, 16 * avi-riff_id);
  avio_wl64(pb, ix);/* qwOffset */
  avio_wl32(pb, pos - ix);  /* dwSize */
 -avio_wl32(pb, avist-indexes.entry);  /* dwDuration */
 +ff_parse_specific_params(s-streams[i], au_byterate, au_ssize, 
 au_scale);
 +if (au_ssize == 0)
 +avio_wl32(pb, avist-indexes.entry);  /* dwDuration (packet 
 count) */
 +else

 +avio_wl32(pb, (avist-audio_strm_length - 
 avist-indexes.audio_strm_offset) / au_ssize);  /* dwDuration (sample count) 
 */

the rounding here is wrong
the index to be useable needs to have segments duration summable
as is this would lead to significangt errors if many of these
durations would be summed up
something like
avist-audio_strm_length / au_ssize - avist-indexes.audio_strm_offset / 
au_ssize
might avoid that but ive not tested it

[...]

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Many things microsoft did are stupid, but not doing something just because
microsoft did it is even more stupid. If everything ms did were stupid they
would be bankrupt already.


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/3] avformat/dss: correct sample rate

2015-02-25 Thread Michael Niedermayer

On Tue, Feb 24, 2015 at 03:22:37PM +0100, Michael Niedermayer wrote:
 Signed-off-by: Michael Niedermayer michae...@gmx.at
 ---
  libavformat/dss.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

applied

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When the tyrant has disposed of foreign enemies by conquest or treaty, and
there is nothing more to fear from them, then he is always stirring up
some war or other, in order that the people may require a leader. -- Plato


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH] libavfilter/thumbnail filter, copy buffers instead of reference

2015-02-25 Thread Chris Kennedy

This fixes the buffer handling in the thumbnail filter, it seems that the
referencing buffers it does isn't safe (could be hundreds of them held in
ref) and the main ffmpeg buffer handling can fail.

It now just grabs the video buffers and copies them, frees the frame
immediately, and works just the same now, and doesn't use 1.7 Gig of memory
in my tests with 300 buffers, it uses around 100 meg max now.  There was
obviously something nasty in how it was doing this.



diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c
index 1883154..82b8ebe 100644
--- a/libavfilter/vf_thumbnail.c
+++ b/libavfilter/vf_thumbnail.c
@@ -137,9 +137,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
*frame)
 int *hist = thumb-frames[thumb-n].histogram;
 const uint8_t *p = frame-data[0];

-// keep a reference of each frame
-thumb-frames[thumb-n].buf = frame;
-
 // update current frame RGB histogram
 for (j = 0; j  inlink-h; j++) {
 for (i = 0; i  inlink-w; i++) {
@@ -150,6 +147,17 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
*frame)
 p += frame-linesize[0];
 }

+// Copy orginal frame into buffer(s)
+thumb-frames[thumb-n].buf = ff_get_video_buffer(inlink, inlink-w,
inlink-h);
+if (!thumb-frames[thumb-n].buf) {
+av_frame_free(frame);
+return AVERROR(ENOMEM);
+}
+av_frame_copy_props(thumb-frames[thumb-n].buf, frame);
+
+// Free original frame we have copied
+av_frame_free(frame);
+
 // no selection until the buffer of N frames is filled up
 thumb-n++;
 if (thumb-n  thumb-n_frames)


I previously sent a patch which was wrong, and just avoided the issue for
certain media but hit it on others, this should fix it for good now (since
it also doesn't alter the chosen thumbnail that results)...


Thanks,
Chris
---
Chris Kennedy
Video Engineer
CrunchyRoll - http://www.crunchyroll.com



-- 
---
Chris Kennedy
diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c
index 1883154..a1272a0 100644
--- a/libavfilter/vf_thumbnail.c
+++ b/libavfilter/vf_thumbnail.c
@@ -142,7 +142,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame 
*frame)

 // update current frame RGB histogram
 for (j = 0; j  inlink-h; j++) {
-for (i = 0; i  inlink-w; i++) {
+// last third of image, walk every 3 bytes/pixels reading RGB
+for (i = 0; i  inlink-w/3; i++) {
 hist[0*256 + p[i*3]]++;
 hist[1*256 + p[i*3 + 1]]++;
 hist[2*256 + p[i*3 + 2]]++;

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] avformat/dss: set packet duration

2015-02-25 Thread Michael Niedermayer

On Tue, Feb 24, 2015 at 04:01:22PM +0100, Michael Niedermayer wrote:
 Value taken from the decoder implementation
 
 Signed-off-by: Michael Niedermayer michae...@gmx.at

applied

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavutil: add x86 optimized av_popcount

2015-02-25 Thread James Almer

On 25/02/15 3:29 PM, Ronald S. Bultje wrote:
 Otherwise i think the second version with builtins i sent is good to
  apply. Much
  cleaner than the configure approach to be honest.
 
 OK.
 
 Ronald

Pushed, thanks.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 3/3] avformat/dss: implement seeking

2015-02-25 Thread Michael Niedermayer

On Tue, Feb 24, 2015 at 03:22:39PM +0100, Michael Niedermayer wrote:
 This assumes CBR (which is true for all samples i have)
 
 Signed-off-by: Michael Niedermayer michae...@gmx.at
 ---
  libavformat/dss.c |   39 +++
  1 file changed, 39 insertions(+)

reviewed by Oleksij
some bugs fixed
applied

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The worst form of inequality is to try to make unequal things equal.
-- Aristotle


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 2/3] avformat/dss: set bitrate

2015-02-25 Thread Michael Niedermayer

On Tue, Feb 24, 2015 at 03:22:38PM +0100, Michael Niedermayer wrote:
 Signed-off-by: Michael Niedermayer michae...@gmx.at
 ---
  libavformat/dss.c |8 
  1 file changed, 8 insertions(+)

applied

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Awnsering whenever a program halts or runs forever is
On a turing machine, in general impossible (turings halting problem).
On any real computer, always possible as a real computer has a finite number
of states N, and will either halt in less than N cycles or never halt.


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 10:29:09PM +0100, wm4 wrote:
 On Wed, 25 Feb 2015 22:24:39 +0100
 Michael Niedermayer michae...@gmx.at wrote:
 
  On Wed, Feb 25, 2015 at 08:00:03PM +0100, Nicolas George wrote:
   Le septidi 7 ventôse, an CCXXIII, Michael Niedermayer a écrit :
without pthreads, this seems to fail to build
   
   Forgot to test that, sorry. Fixed.
   
   By the way, what are the remaining supported cases where threads are not
   available?
  
  i think the case i tested had HAVE_W32THREADS set
  also theres HAVE_OS2THREADS
  
  [...]
  
 
 Surely OS/2 support is worth of forcing every single developer who
 touches thread stuff to consider a dead platform that's only used by 2
 persons in Korea (?).

did any mail above sound like it impled that someone has to consider
OS/2 ?

the existing OS/2 support is maintained by the people using OS/2,
AFAIK they never asked others to maintain the support

[...]
-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Opposition brings concord. Out of discord comes the fairest harmony.
-- Heraclitus


signature.asc
Description: Digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] avformat/adxdec: check avctx-channels for invalid values

2015-02-25 Thread Andreas Cadhalpun


On 26.02.2015 00:24, Michael Niedermayer wrote:

On Wed, Feb 25, 2015 at 11:48:33PM +0100, Andreas Cadhalpun wrote:

Hi,

if avctx-channels is 0 in adx_read_packet, size gets set to 0,
av_get_packet sets pkt-data to NULL and then AV_RB16(pkt-data)
results in a null pointer dereference.

Attached patch fixes this.

Best regards,
Andreas



  adxdec.c |5 +
  1 file changed, 5 insertions(+)
7312e6a3be1771c83eac72784496c6fc4692d954  
0001-avformat-adxdec-check-avctx-channels-for-invalid-val.patch
 From 2578976a0a9eec03d168f393795119fd274ee81f Mon Sep 17 00:00:00 2001
From: Andreas Cadhalpun andreas.cadhal...@googlemail.com
Date: Wed, 25 Feb 2015 22:55:44 +0100
Subject: [PATCH] avformat/adxdec: check avctx-channels for invalid values

This avoids a null pointer dereference of pkt-data.

Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com
---
  libavformat/adxdec.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c
index ddaa201..24a8a1f 100644
--- a/libavformat/adxdec.c
+++ b/libavformat/adxdec.c
@@ -40,6 +40,11 @@ static int adx_read_packet(AVFormatContext *s, AVPacket *pkt)
  AVCodecContext *avctx = s-streams[0]-codec;
  int ret, size;

+if (avctx-channels = 0) {
+av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, 
avctx-channels);
+return AVERROR_INVALIDDATA;
+}


the demuxer should extract the channel value in adx_read_header()
and check it there. (if it needs the channels, which it does currently)

its not good for demuxing to depend on a decoder/parser setting this
value between reading the file header and before demuxing the first
packet


You're right about that. Attached is a patch for this.

However it might still be a good idea to apply above patch, because the 
decoder/parser could set avctx-channels to 0, even if the demuxer has set it to 
something positive.


Best regards,
Andreas
From 2a0e342cb095a6bd62dcbd313db9f33346c79f98 Mon Sep 17 00:00:00 2001
From: Andreas Cadhalpun andreas.cadhal...@googlemail.com
Date: Thu, 26 Feb 2015 01:06:57 +0100
Subject: [PATCH 2/2] avformat/adxdec: set avctx-channels in adx_read_header

It is used in adx_read_packet, which currently depends on the decoder/parser setting this value between reading the file header and demuxing the first packet.

Signed-off-by: Andreas Cadhalpun andreas.cadhal...@googlemail.com
---
 libavformat/adxdec.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libavformat/adxdec.c b/libavformat/adxdec.c
index 24a8a1f..e7107ac 100644
--- a/libavformat/adxdec.c
+++ b/libavformat/adxdec.c
@@ -88,8 +88,14 @@ static int adx_read_header(AVFormatContext *s)
 av_log(s, AV_LOG_ERROR, Invalid extradata size.\n);
 return AVERROR_INVALIDDATA;
 }
+avctx-channels= AV_RB8(avctx-extradata + 7);
 avctx-sample_rate = AV_RB32(avctx-extradata + 8);
 
+if (avctx-channels = 0) {
+av_log(s, AV_LOG_ERROR, invalid number of channels %d\n, avctx-channels);
+return AVERROR_INVALIDDATA;
+}
+
 st-codec-codec_type  = AVMEDIA_TYPE_AUDIO;
 st-codec-codec_id= s-iformat-raw_codec_id;
 
-- 
2.1.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] libavfilter/thumbnail filter, copy buffers instead of reference

2015-02-25 Thread Chris Kennedy

I forgot the copy of the original buffer, oddly if I copy it then causes
things to crash at the end when the codec is freed, only for certain .avi
mpeg4 and in the mpeg4 codec where it frees some data there.  I am digging
in to that, but patch at least shows deeper into something odd going on.  I
can just copy the R and G but not B and it doesn't hit the issue for this
test video.  Seems really strange, before it seemed to be having backtraces
in ffmpeg.c and buffer freeing conflicting or some problem, now when
copying the buffers it is mpeg4 codec freeing and only if reading that blue
value into the destination buffer, odd.

diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c
index 186ac8d..925eaaf 100644
--- a/libavfilter/vf_thumbnail.c
+++ b/libavfilter/vf_thumbnail.c
@@ -137,10 +137,19 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
*frame)
 int *hist = thumb-frames[thumb-n].histogram;
 const uint8_t *p = frame-data[0];

+// Copy orginal frame into buffer(s)
+thumb-frames[thumb-n].buf = ff_get_video_buffer(outlink, outlink-w,
outlink-h);
+if (!thumb-frames[thumb-n].buf) {
+av_frame_free(frame);
+return AVERROR(ENOMEM);
+}
+av_frame_copy_props(thumb-frames[thumb-n].buf, frame);
+if (av_frame_copy(thumb-frames[thumb-n].buf, frame))
+return AVERROR(ENOMEM);
+
 // update current frame RGB histogram
 for (j = 0; j  inlink-h; j++) {
@@ -148,14 +157,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
*frame)
 p += frame-linesize[0];
 }

-// Copy orginal frame into buffer(s)
-thumb-frames[thumb-n].buf = ff_get_video_buffer(inlink, inlink-w,
inlink-h);
-if (!thumb-frames[thumb-n].buf) {
-av_frame_free(frame);
-return AVERROR(ENOMEM);
-}
-av_frame_copy_props(thumb-frames[thumb-n].buf, frame);
-
 // Free original frame we have copied
 av_frame_free(frame);


On Wed, Feb 25, 2015 at 3:40 PM, Chris Kennedy bitbyte...@gmail.com wrote:

 This fixes the buffer handling in the thumbnail filter, it seems that the
 referencing buffers it does isn't safe (could be hundreds of them held in
 ref) and the main ffmpeg buffer handling can fail.

 It now just grabs the video buffers and copies them, frees the frame
 immediately, and works just the same now, and doesn't use 1.7 Gig of memory
 in my tests with 300 buffers, it uses around 100 meg max now.  There was
 obviously something nasty in how it was doing this.



 diff --git a/libavfilter/vf_thumbnail.c b/libavfilter/vf_thumbnail.c
 index 1883154..82b8ebe 100644
 --- a/libavfilter/vf_thumbnail.c
 +++ b/libavfilter/vf_thumbnail.c
 @@ -137,9 +137,6 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
 *frame)
  int *hist = thumb-frames[thumb-n].histogram;
  const uint8_t *p = frame-data[0];

 -// keep a reference of each frame
 -thumb-frames[thumb-n].buf = frame;
 -
  // update current frame RGB histogram
  for (j = 0; j  inlink-h; j++) {
  for (i = 0; i  inlink-w; i++) {
 @@ -150,6 +147,17 @@ static int filter_frame(AVFilterLink *inlink, AVFrame
 *frame)
  p += frame-linesize[0];
  }

 +// Copy orginal frame into buffer(s)
 +thumb-frames[thumb-n].buf = ff_get_video_buffer(inlink, inlink-w,
 inlink-h);
 +if (!thumb-frames[thumb-n].buf) {
 +av_frame_free(frame);
 +return AVERROR(ENOMEM);
 +}
 +av_frame_copy_props(thumb-frames[thumb-n].buf, frame);
 +
 +// Free original frame we have copied
 +av_frame_free(frame);
 +
  // no selection until the buffer of N frames is filled up
  thumb-n++;
  if (thumb-n  thumb-n_frames)


 I previously sent a patch which was wrong, and just avoided the issue for
 certain media but hit it on others, this should fix it for good now (since
 it also doesn't alter the chosen thumbnail that results)...


 Thanks,
 Chris
 ---
 Chris Kennedy
 Video Engineer
 CrunchyRoll - http://www.crunchyroll.com



 --
 ---
 Chris Kennedy




-- 
---
Chris Kennedy
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH] Port FFT domain filter.

2015-02-25 Thread Michael Niedermayer

On Wed, Feb 25, 2015 at 11:55:51AM +0530, arwa arif wrote:
 On Tue, Feb 24, 2015 at 4:12 PM, Michael Niedermayer michae...@gmx.at
 wrote:
 
  On Tue, Feb 24, 2015 at 02:27:01PM +0530, arwa arif wrote:
   Hello,
  
   I have written a very primitive code for porting FFT domain filter. It
   accepts only gray8 format images. The output should be a grayscale image,
   but the ouput image is coming out to be a black and white image. Also, I
  am
   getting confused when to do the vertical pass. After taking irdft of the
   horizontal pass or before it?
 
  you can do both rdft first and then the 2 irdft passes, this should
  give more possibilities in filtering but it could be done the other
  way around too
 
 
  
   I have attached the patch.
 
Makefile |1
allfilters.c |1
vf_fftfilt.c |  139
  +++
3 files changed, 141 insertions(+)
   d4b25d6a204534a66400f52c1f5312652e8208af
  0001-Port-FFT-domain-filter.patch
   From 455a261d7e2b3afba767aac2e73448aeee02d159 Mon Sep 17 00:00:00 2001
   From: Arwa Arif arwaarif1...@gmail.com
   Date: Tue, 24 Feb 2015 12:17:30 +0530
   Subject: [PATCH] Port FFT domain filter.
  
   ---
libavfilter/Makefile |1 +
libavfilter/allfilters.c |1 +
libavfilter/vf_fftfilt.c |  139
  ++
3 files changed, 141 insertions(+)
create mode 100644 libavfilter/vf_fftfilt.c
  
   diff --git a/libavfilter/Makefile b/libavfilter/Makefile
   index 289c63b..b184f07 100644
   --- a/libavfilter/Makefile
   +++ b/libavfilter/Makefile
   @@ -120,6 +120,7 @@ OBJS-$(CONFIG_EDGEDETECT_FILTER) +=
  vf_edgedetect.o
OBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
OBJS-$(CONFIG_EXTRACTPLANES_FILTER)  += vf_extractplanes.o
OBJS-$(CONFIG_FADE_FILTER)   += vf_fade.o
   +OBJS-$(CONFIG_FFTFILT_FILTER)+= vf_fftfilt.o
OBJS-$(CONFIG_FIELD_FILTER)  += vf_field.o
OBJS-$(CONFIG_FIELDMATCH_FILTER) += vf_fieldmatch.o
OBJS-$(CONFIG_FIELDORDER_FILTER) += vf_fieldorder.o
   diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
   index 55de154..043ac56 100644
   --- a/libavfilter/allfilters.c
   +++ b/libavfilter/allfilters.c
   @@ -136,6 +136,7 @@ void avfilter_register_all(void)
REGISTER_FILTER(EQ, eq, vf);
REGISTER_FILTER(EXTRACTPLANES,  extractplanes,  vf);
REGISTER_FILTER(FADE,   fade,   vf);
   +REGISTER_FILTER(FFTFILT,fftfilt,vf);
REGISTER_FILTER(FIELD,  field,  vf);
REGISTER_FILTER(FIELDMATCH, fieldmatch, vf);
REGISTER_FILTER(FIELDORDER, fieldorder, vf);
   diff --git a/libavfilter/vf_fftfilt.c b/libavfilter/vf_fftfilt.c
   new file mode 100644
   index 000..753bc8e
   --- /dev/null
   +++ b/libavfilter/vf_fftfilt.c
   @@ -0,0 +1,139 @@
   +/*
   + * Copyright (c) 2015 Arwa Arif arwaarif1...@gmail.com
   + *
   + * This file is part of FFmpeg.
   + *
   + * FFmpeg is free software; you can redistribute it and/or modify
   + * it under the terms of the GNU General Public License as published by
   + * the Free Software Foundation; either version 2 of the License, or
   + * (at your option) any later version.
   + *
   + * FFmpeg is distributed in the hope that it will be useful,
   + * but WITHOUT ANY WARRANTY; without even the implied warranty of
   + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   + * GNU General Public License for more details.
   + *
   + * You should have received a copy of the GNU General Public License
  along
   + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
   + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
   + */
   +
   +/**
   + * @file
   + * FFT domain filtering.
   + */
   +
   +#include libavfilter/internal.h
   +#include libavutil/common.h
   +#include libavutil/imgutils.h
   +#include libavutil/opt.h
   +#include libavutil/pixdesc.h
   +#include libavcodec/avfft.h
   +
   +typedef struct {
   +const AVClass *class;
   +
   +RDFTContext *rdft;
   +int rdft_bits;
   +FFTSample *rdft_data;
   +
   +} FFTFILTContext;
   +
   +static int filter_frame(AVFilterLink *inlink, AVFrame *in)
   +{
   +AVFilterContext *ctx = inlink-dst;
   +AVFilterLink *outlink = inlink-dst-outputs[0];
   +FFTFILTContext *fftfilt = ctx-priv;
   +AVFrame *out;
   +int i, j, k, rdft_bits;
   +size_t rdft_len, w, h;
   +
   +w = inlink-w;
   +h = inlink-h;
   +
   +/* RDFT window size (precision) according to the requested output
  frame height */
   +for (rdft_bits = 1; 1  rdft_bits  2 * w; rdft_bits++);
   +rdft_len = 1  rdft_bits;
   +fftfilt-rdft_data = av_malloc_array(h, rdft_len *
  sizeof(FFTSample));
   +memset(fftfilt-rdft_data, 0, rdft_len * h *

[FFmpeg-devel] [PATCH 2/2] ffmpeg: notify when the thread message queue blocks.

2015-02-25 Thread Nicolas George

This can help finding the source of A-V desync with live input.

Signed-off-by: Nicolas George geo...@nsup.org
---
 ffmpeg.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)


No functional change since the first version, but reworded to point to the
relevant option and give the current value.


diff --git a/ffmpeg.c b/ffmpeg.c
index 54ba0cb..c6f3dcd 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -3356,6 +3356,7 @@ static int check_keyboard_interaction(int64_t cur_time)
 static void *input_thread(void *arg)
 {
 InputFile *f = arg;
+unsigned flags = f-non_blocking ? AV_THREAD_MESSAGE_NONBLOCK : 0;
 int ret = 0;
 
 while (1) {
@@ -3371,7 +3372,15 @@ static void *input_thread(void *arg)
 break;
 }
 av_dup_packet(pkt);
-ret = av_thread_message_queue_send(f-in_thread_queue, pkt, 0);
+ret = av_thread_message_queue_send(f-in_thread_queue, pkt, flags);
+if (flags  ret == AVERROR(EAGAIN)) {
+flags = 0;
+ret = av_thread_message_queue_send(f-in_thread_queue, pkt, 
flags);
+av_log(f-ctx, AV_LOG_WARNING,
+   Thread message queue blocking; consider raising the 
+   thread_queue_size option (current value: %d)\n,
+   f-thread_queue_size);
+}
 if (ret  0) {
 if (ret != AVERROR_EOF)
 av_log(f-ctx, AV_LOG_ERROR,
-- 
2.1.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH 1/2] ffmpeg: allow to set the thread message queue size.

2015-02-25 Thread Nicolas George

Signed-off-by: Nicolas George geo...@nsup.org
---
 doc/ffmpeg.texi | 6 ++
 ffmpeg.c| 2 +-
 ffmpeg.h| 2 ++
 ffmpeg_opt.c| 4 
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/doc/ffmpeg.texi b/doc/ffmpeg.texi
index 6e51c44..6772f2f 100644
--- a/doc/ffmpeg.texi
+++ b/doc/ffmpeg.texi
@@ -1158,6 +1158,12 @@ This option enables or disables accurate seeking in 
input files with the
 transcoding. Use @option{-noaccurate_seek} to disable it, which may be useful
 e.g. when copying some streams and transcoding the others.
 
+@item -thread_message_queue @var{size} (@emph{input})
+This option sets the maximum number of queued packets when reading from the
+file or device. With low latency / high rate live streams, packets may be
+discarded if they are not read in a timely manner; raising this value can
+avoid it.
+
 @item -override_ffserver (@emph{global})
 Overrides the input specifications from @command{ffserver}. Using this
 option you can map any input stream to @command{ffserver} and control
diff --git a/ffmpeg.c b/ffmpeg.c
index becd5df..54ba0cb 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -3420,7 +3420,7 @@ static int init_input_threads(void)
 strcmp(f-ctx-iformat-name, lavfi))
 f-non_blocking = 1;
 ret = av_thread_message_queue_alloc(f-in_thread_queue,
-8, sizeof(AVPacket));
+f-thread_queue_size, 
sizeof(AVPacket));
 if (ret  0)
 return ret;
 
diff --git a/ffmpeg.h b/ffmpeg.h
index d2e0c5c..7c0c22c 100644
--- a/ffmpeg.h
+++ b/ffmpeg.h
@@ -111,6 +111,7 @@ typedef struct OptionsContext {
 int64_t input_ts_offset;
 int rate_emu;
 int accurate_seek;
+int thread_queue_size;
 
 SpecifierOpt *ts_scale;
 intnb_ts_scale;
@@ -350,6 +351,7 @@ typedef struct InputFile {
 pthread_t thread;   /* thread reading from this file */
 int non_blocking;   /* reading packets from the thread should not 
block */
 int joined; /* the thread has been joined */
+int thread_queue_size;  /* maximum number of queued packets */
 #endif
 } InputFile;
 
diff --git a/ffmpeg_opt.c b/ffmpeg_opt.c
index 09e6e33..05b37d4 100644
--- a/ffmpeg_opt.c
+++ b/ffmpeg_opt.c
@@ -951,6 +951,7 @@ static int open_input_file(OptionsContext *o, const char 
*filename)
 f-nb_streams = ic-nb_streams;
 f-rate_emu   = o-rate_emu;
 f-accurate_seek = o-accurate_seek;
+f-thread_queue_size = o-thread_queue_size  0 ? o-thread_queue_size : 8;
 
 /* check if all codec options have been used */
 unused_opts = strip_specifiers(o-g-codec_opts);
@@ -2957,6 +2958,9 @@ const OptionDef options[] = {
 { disposition,OPT_STRING | HAS_ARG | OPT_SPEC |
 OPT_OUTPUT,  { .off = 
OFFSET(disposition) },
 disposition,  },
+{ thread_queue_size, HAS_ARG | OPT_INT | OPT_OFFSET | OPT_EXPERT | 
OPT_INPUT,
+ { .off = 
OFFSET(thread_queue_size) },
+set the maximum number of queued packets from the demuxer },
 
 /* video options */
 { vframes,  OPT_VIDEO | HAS_ARG  | OPT_PERFILE | OPT_OUTPUT, 
  { .func_arg = opt_video_frames },
-- 
2.1.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] DSP function ARM NEON patches for hevc

2015-02-25 Thread Tomperi Seppo

17/02/15 12:44, Michael Niedermayer mich...@niedermayer.cc:

On Tue, Feb 17, 2015 at 07:33:04AM +, Tomperi Seppo wrote:
 
  On 16 Feb 2015, at 19:54, Michael Niedermayer
mich...@niedermayer.cc wrote:
  
  On Mon, Feb 16, 2015 at 12:47:36PM +, Tomperi Seppo wrote:
  More NEON optimizations for testing. fate-hevc passes on Tegra K1,
but these haven't been tested for NEON clobbering.
  
  -Seppo
  
  
  From: Tomperi Seppo
  Sent: Monday, February 16, 2015 1:30 PM
  To: Michael Niedermayer
  Cc: Michael Niedermayer; FFmpeg development discussions and patches;
Mickaël Raulet
  Subject: RE: [FFmpeg-devel]  DSP function ARM NEON patches for hevc
  
  Hi Michael,
  
  Here is a totally shot in a dark fix attempt for NEON register
clobbering for deblocking. Could you test it with qemu and check if it
works.
  
  
  -Seppo
  
  
  From: Michael Niedermayer [mich...@niedermayer.cc]
  Sent: Monday, February 16, 2015 3:28 AM
  To: Tomperi Seppo
  Cc: Michael Niedermayer; FFmpeg development discussions and patches;
Mickaël Raulet
  Subject: Re: [FFmpeg-devel]  DSP function ARM NEON patches for hevc
  
  Hi
  
  On Sun, Feb 15, 2015 at 08:31:32PM +, Tomperi Seppo wrote:
  Hi!
  
  The reason is chroma deblocking which is using q4 without pushing
it to stack. :/
  Unfortunately I am in Geneve this week and don't have ARM linux
board with me so it is not easy to test.
  
  Mickael Raulet: maybe guys at INSA could run tests this week if I
make a fix? Could you ask?
  
  If they cant, then i probably can test it too if its a patch which
  applies cleanly to ffmpeg and testing fate-hevc with
  --enable-neon-clobber-test under qemu is what is needed
  i could test on a arm board too if needed
  
  
  
  I also have SAO, qpel and epel NEON patches for latest FFmpeg. They
pass fate-hevc on Jetson TK1, but should be iOS and clobber checked.
  
  -Seppo
  
  
  
  From: Michael Niedermayer [michae...@gmx.at]
  Sent: Friday, February 13, 2015 5:38 PM
  To: FFmpeg development discussions and patches
  Cc: Tomperi Seppo; Mickaël Raulet
  Subject: Re: [FFmpeg-devel]  DSP function ARM NEON patches for hevc
  
  On Thu, Feb 05, 2015 at 02:22:28PM +0100, Mickaël Raulet wrote:
  Michael,
  
  Please find some commits that can be cherry picked from
  https://github.com/OpenHEVC/FFmpeg/commits/ffmpeg_patch
  
  
  Optimized deblocking filter (8bits only)
  1b9ee47d2f43b0a029a9468233626102eb1473b8
  
  this breaks the neon clobber test see:
  
fate.ffmpeg.org/report.cgi?time=20150211030204slot=armv7l-panda-gcc4.6-c
ortexa8-clobber
  
  [...]
  --
  Michael GnuPG fingerprint:
9FF2128B147EF6730BADF133611EC787040B0FAB
  
  The worst form of inequality is to try to make unequal things equal.
  -- Aristotle
  
  
  --
  Michael GnuPG fingerprint:
9FF2128B147EF6730BADF133611EC787040B0FAB
  
  Opposition brings concord. Out of discord comes the fairest harmony.
  -- Heraclitus
  
  Makefile|3
  hevcdsp_init_neon.c |  159 
  hevcdsp_qpel_neon.S |  999

  3 files changed, 1160 insertions(+), 1 deletion(-)
  9fb0b3c33edf085845b7a0fba3ca77d1ba55dd6c
0001-hevcdsp-ARM-NEON-optimized-qpel-functions.patch
  From ce06cb2bea4b051995608b11651b185e7a825a4c Mon Sep 17 00:00:00
2001
  From: Seppo Tomperi seppo.tomp...@vtt.fi
  Date: Wed, 11 Feb 2015 10:20:26 +
  Subject: [PATCH] hevcdsp: ARM NEON optimized qpel functions
  
  ---
  libavcodec/arm/Makefile|   3 +-
  libavcodec/arm/hevcdsp_init_neon.c | 159 ++
  libavcodec/arm/hevcdsp_qpel_neon.S | 999
+
  3 files changed, 1160 insertions(+), 1 deletion(-)
  create mode 100644 libavcodec/arm/hevcdsp_qpel_neon.S
  
  
  seems to fail building:
  
 libavformat/utils.o
  CC  libavcodec/arm/hevcdsp_init_neon.o
  AS  libavcodec/arm/hevcdsp_qpel_neon.o
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S: Assembler messages:
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: expected } --
`vld1.32 {d0[0]d0[1]d1[0]d1[1]},[r2],r3'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or
quad precision register expected -- `vld1.32 {},[r2],r3'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or
quad precision register expected -- `vld1.32 {},[r2],r3'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or
quad precision register expected -- `vld1.32 {},[r2],r3'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: expected } --
`vst1.32 {d0[0]d0[1]d1[0]d1[1]},[r0],r1'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or
quad precision register expected -- `vst1.32 {},[r0],r1'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or
quad precision register expected -- `vst1.32 {},[r0],r1'
  ffmpeg/libavcodec/arm/hevcdsp_qpel_neon.S:992: Error: Neon double or
quad

44 matches

Mail list logo