Re: [libav-devel] [PATCH] Drop DCTELEM typedef

2013-01-22 Thread Luca Barbato
On 22/01/13 10:53, Diego Biurrun wrote:
 On Mon, Jan 21, 2013 at 06:18:22PM -0800, Ronald S. Bultje wrote:
 On Mon, Jan 21, 2013 at 4:04 PM, Diego Biurrun di...@biurrun.de wrote:
 It does not help as an abstraction and adds dsputil dependencies.

 I like the commit. I do want to add, though, that you're not actually
 practically removing the dsputil dependency from a lot of files (at
 build time), even though the dependency is (in a code-sense) no longer
 there. Examples are in vp3.c or vp8.c, but there's likely more.
 
 Your comment puzzles me.  vp3.c directly uses DSPContext, vp8.c has no
 dependency on dsputil, before or after my patch ...

In any case I'd rather do that on a second patch, this one is large enough.

 What I did do was push dsputil.h #includes out to the leaves of the
 dependency graph.  An example of this is prores.  I dropped the dsputil.h
 #include from proresdsp.h, but added it to proresdsp.c, proresdec.c and
 proresenc.c.  All three .c files directly use symbols from dsputil.h, so
 they relied on dsputil.h being provided to them via proresdsp.h.  Thus
 the real dependency count was not increased by three, but reduced from
 four files to three.

Sounds fair.

lu

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] dnxhdenc: fix invalid reads in dnxhd_mb_var_thread().

2013-01-22 Thread Anton Khirnov
Do not assume that frame dimensions are mod16 (or that height is mod32
for interlaced).
---
 libavcodec/dnxhdenc.c|   27 ---
 tests/ref/vsynth/vsynth1-dnxhd-1080i |4 ++--
 tests/ref/vsynth/vsynth2-dnxhd-1080i |4 ++--
 3 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/libavcodec/dnxhdenc.c b/libavcodec/dnxhdenc.c
index 8531fe0..97e0fed 100644
--- a/libavcodec/dnxhdenc.c
+++ b/libavcodec/dnxhdenc.c
@@ -615,14 +615,35 @@ static void dnxhd_setup_threads_slices(DNXHDEncContext *ctx)
 static int dnxhd_mb_var_thread(AVCodecContext *avctx, void *arg, int jobnr, int threadnr)
 {
     DNXHDEncContext *ctx = avctx->priv_data;
-    int mb_y = jobnr, mb_x;
+    int mb_y = jobnr, mb_x, x, y;
+    int partial_last_row = (mb_y == ctx->m.mb_height - 1) &&
+                           ((avctx->height >> ctx->interlaced) & 0xF);
+
     ctx = ctx->thread[threadnr];
     if (ctx->cid_table->bit_depth == 8) {
         uint8_t *pix = ctx->thread[0]->src[0] + ((mb_y<<4) * ctx->m.linesize);
         for (mb_x = 0; mb_x < ctx->m.mb_width; ++mb_x, pix += 16) {
             unsigned mb  = mb_y * ctx->m.mb_width + mb_x;
-            int sum = ctx->m.dsp.pix_sum(pix, ctx->m.linesize);
-            int varc = (ctx->m.dsp.pix_norm1(pix, ctx->m.linesize) - (((unsigned)sum*sum)>>8)+128)>>8;
+            int sum;
+            int varc;
+
+            if (!partial_last_row && mb_x * 16 <= avctx->width - 16) {
+                sum  = ctx->m.dsp.pix_sum(pix, ctx->m.linesize);
+                varc = ctx->m.dsp.pix_norm1(pix, ctx->m.linesize);
+            } else {
+                int bw = FFMIN(avctx->width - 16 * mb_x, 16);
+                int bh = FFMIN((avctx->height >> ctx->interlaced) - 16 * mb_y, 16);
+                sum = varc = 0;
+                for (y = 0; y < bh; y++) {
+                    for (x = 0; x < bw; x++) {
+                        uint8_t val = pix[x + y * ctx->m.linesize];
+                        sum  += val;
+                        varc += val * val;
+                    }
+                }
+            }
+            varc = (varc - (((unsigned)sum * sum) >> 8) + 128) >> 8;
+
             ctx->mb_cmp[mb].value = varc;
             ctx->mb_cmp[mb].mb = mb;
         }
diff --git a/tests/ref/vsynth/vsynth1-dnxhd-1080i b/tests/ref/vsynth/vsynth1-dnxhd-1080i
index 1eddbf8..3a990c5 100644
--- a/tests/ref/vsynth/vsynth1-dnxhd-1080i
+++ b/tests/ref/vsynth/vsynth1-dnxhd-1080i
@@ -1,4 +1,4 @@
-3cfbe36a7dd5b48859b8a569d626ef77 *tests/data/fate/vsynth1-dnxhd-1080i.mov
+2412f206f5efcbbcc3f2bba0c86b73d4 *tests/data/fate/vsynth1-dnxhd-1080i.mov
 3031875 tests/data/fate/vsynth1-dnxhd-1080i.mov
-0c651e840f860592f0d5b66030d9fa32 *tests/data/fate/vsynth1-dnxhd-1080i.out.rawvideo
+34076f61254997c8157eafed1c916472 *tests/data/fate/vsynth1-dnxhd-1080i.out.rawvideo
 stddev:6.29 PSNR: 32.15 MAXDIFF:   64 bytes:  7603200/   760320
diff --git a/tests/ref/vsynth/vsynth2-dnxhd-1080i b/tests/ref/vsynth/vsynth2-dnxhd-1080i
index 41a8d51..27c79a5 100644
--- a/tests/ref/vsynth/vsynth2-dnxhd-1080i
+++ b/tests/ref/vsynth/vsynth2-dnxhd-1080i
@@ -1,4 +1,4 @@
-19a91b7da35cecf41e5e3cb322485627 *tests/data/fate/vsynth2-dnxhd-1080i.mov
+65ca6385b565b6ea9a2e28150eef1d46 *tests/data/fate/vsynth2-dnxhd-1080i.mov
 3031875 tests/data/fate/vsynth2-dnxhd-1080i.mov
-3c559af629ae0a8fb1a9a0e4b4da7733 *tests/data/fate/vsynth2-dnxhd-1080i.out.rawvideo
+42262a2325441b38b3b3c8a42d888e7d *tests/data/fate/vsynth2-dnxhd-1080i.out.rawvideo
 stddev:1.31 PSNR: 45.77 MAXDIFF:   23 bytes:  7603200/   760320
-- 
1.7.10.4
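
For readers following the thread, here is a minimal standalone sketch of the clipped-block fallback that the patch adds, assuming 8-bit samples and a frame height that has already been adjusted for interlacing. `block_stats` and `block_varc` are hypothetical names used only here, not libav functions; the last expression mirrors the `varc` line of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: accumulate the sum and the sum of squares over the
 * part of a 16x16 macroblock that lies inside a frame_w x frame_h frame.
 * pix points at the top-left sample of macroblock (mb_x, mb_y). */
static void block_stats(const uint8_t *pix, int linesize,
                        int frame_w, int frame_h, int mb_x, int mb_y,
                        int *sum, int *sqsum)
{
    int bw = frame_w - 16 * mb_x;   /* visible width of this block  */
    int bh = frame_h - 16 * mb_y;   /* visible height of this block */
    int x, y;

    assert(bw > 0 && bh > 0);
    if (bw > 16) bw = 16;
    if (bh > 16) bh = 16;

    *sum = *sqsum = 0;
    for (y = 0; y < bh; y++) {
        for (x = 0; x < bw; x++) {
            int val = pix[x + y * linesize];
            *sum   += val;
            *sqsum += val * val;
        }
    }
}

/* Same normalisation as the patched varc expression. */
static int block_varc(int sum, int sqsum)
{
    return (int)((sqsum - (((unsigned)sum * sum) >> 8) + 128) >> 8);
}
```

Only samples inside the frame are read, which is the point of the fix: the full-block `pix_sum`/`pix_norm1` path always touches 16x16 samples, including rows and columns past the frame edge when the dimensions are not multiples of 16.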



Re: [libav-devel] [PATCH] dnxhdenc: fix invalid reads in dnxhd_mb_var_thread().

2013-01-22 Thread Kostya Shishkov
On Tue, Jan 22, 2013 at 12:04:56PM +0100, Anton Khirnov wrote:
 Do not assume that frame dimensions are mod16 (or that height is mod32
 for interlaced).
 ---
  libavcodec/dnxhdenc.c|   27 ---
  tests/ref/vsynth/vsynth1-dnxhd-1080i |4 ++--
  tests/ref/vsynth/vsynth2-dnxhd-1080i |4 ++--
  3 files changed, 28 insertions(+), 7 deletions(-)

probably OK but I suspect Diego needs to review this as well


Re: [libav-devel] [PATCH] dsputil: remove avg_no_rnd_pixels8.

2013-01-22 Thread Diego Biurrun
On Mon, Jan 21, 2013 at 06:02:38PM -0800, Ronald S. Bultje wrote:
 
 --- a/libavcodec/dsputil.h
 +++ b/libavcodec/dsputil.h
 @@ -281,15 +281,15 @@ typedef struct DSPContext {
  
  /**
   * Halfpel motion compensation with no rounding (a+b)>>1.
 - * this is an array[2][4] of motion compensation functions for 2
 - * horizontal blocksizes (8,16) and the 4 halfpel positions<br>
 - * *pixels_tab[ 0->16xH 1->8xH ][ xhalfpel + 2*yhalfpel ]
 + * this is an array[4] of motion compensation functions for 1
 + * horizontal blocksizes (16) and the 4 halfpel positions<br>
 + * *pixels_tab[0][ xhalfpel + 2*yhalfpel ]

one horizontal blocksize_

 -op_pixels_func avg_no_rnd_pixels_tab[4][4];
 +op_pixels_func avg_no_rnd_pixels_tab[1][4];

Why do you keep this array two-dimensional?

Diego


Re: [libav-devel] [PATCH] dsputil: remove avg_no_rnd_pixels8.

2013-01-22 Thread Ronald S. Bultje
Hi,

On Tue, Jan 22, 2013 at 4:00 AM, Diego Biurrun di...@biurrun.de wrote:
 On Mon, Jan 21, 2013 at 06:02:38PM -0800, Ronald S. Bultje wrote:

 --- a/libavcodec/dsputil.h
 +++ b/libavcodec/dsputil.h
 @@ -281,15 +281,15 @@ typedef struct DSPContext {

  /**
   * Halfpel motion compensation with no rounding (a+b)>>1.
 - * this is an array[2][4] of motion compensation functions for 2
 - * horizontal blocksizes (8,16) and the 4 halfpel positions<br>
 - * *pixels_tab[ 0->16xH 1->8xH ][ xhalfpel + 2*yhalfpel ]
 + * this is an array[4] of motion compensation functions for 1
 + * horizontal blocksizes (16) and the 4 halfpel positions<br>
 + * *pixels_tab[0][ xhalfpel + 2*yhalfpel ]

 one horizontal blocksize_

 -op_pixels_func avg_no_rnd_pixels_tab[4][4];
 +op_pixels_func avg_no_rnd_pixels_tab[1][4];

 Why do you keep this array two-dimensional?

This is currently stuck in dsputil's macro mess. I'm looking into ways
of fixing that (while also fixing some other oddities) but I'm not
quite ready with that yet. Basically, it will be fixed in a later
commit.

Ronald
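
The indexing convention in the quoted doxygen comment (`xhalfpel + 2*yhalfpel`) can be illustrated with a small sketch. This is not dsputil code: the 4-sample block width, the function names and the `+ 1` bias in the diagonal case are assumptions made for the example. Only the `op_pixels_func` signature is taken from the patches in this thread.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*op_pixels_func)(uint8_t *block, const uint8_t *pixels,
                               int line_size, int h);

enum { SKETCH_W = 4 }; /* block width used only by this sketch */

/* Full-pel position: plain copy. */
static void put_c(uint8_t *block, const uint8_t *pixels, int line_size, int h)
{
    for (int i = 0; i < h; i++)
        for (int x = 0; x < SKETCH_W; x++)
            block[i * line_size + x] = pixels[i * line_size + x];
}

/* Horizontal half-pel: truncating average with the right neighbour. */
static void put_x2_c(uint8_t *block, const uint8_t *pixels, int line_size, int h)
{
    for (int i = 0; i < h; i++)
        for (int x = 0; x < SKETCH_W; x++) {
            const uint8_t *p = pixels + i * line_size + x;
            block[i * line_size + x] = (uint8_t)((p[0] + p[1]) >> 1);
        }
}

/* Vertical half-pel: truncating average with the sample below. */
static void put_y2_c(uint8_t *block, const uint8_t *pixels, int line_size, int h)
{
    for (int i = 0; i < h; i++)
        for (int x = 0; x < SKETCH_W; x++) {
            const uint8_t *p = pixels + i * line_size + x;
            block[i * line_size + x] = (uint8_t)((p[0] + p[line_size]) >> 1);
        }
}

/* Diagonal half-pel: four-sample average (bias of 1 assumed here). */
static void put_xy2_c(uint8_t *block, const uint8_t *pixels, int line_size, int h)
{
    for (int i = 0; i < h; i++)
        for (int x = 0; x < SKETCH_W; x++) {
            const uint8_t *p = pixels + i * line_size + x;
            block[i * line_size + x] =
                (uint8_t)((p[0] + p[1] + p[line_size] + p[line_size + 1] + 1) >> 2);
        }
}

/* One block size, four half-pel positions: a flat array is enough. */
static const op_pixels_func no_rnd_tab[4] = {
    put_c, put_x2_c, put_y2_c, put_xy2_c
};

static op_pixels_func select_hpel(int xhalfpel, int yhalfpel)
{
    assert(xhalfpel == 0 || xhalfpel == 1);
    assert(yhalfpel == 0 || yhalfpel == 1);
    return no_rnd_tab[xhalfpel + 2 * yhalfpel];
}
```

With a single block size left, the first dimension of `avg_no_rnd_pixels_tab[1][4]` carries no information, which is what Diego's question is about; the sketch uses the flat form.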


Re: [libav-devel] [PATCH] Drop DCTELEM typedef

2013-01-22 Thread Ronald S. Bultje
Hi,

On Tue, Jan 22, 2013 at 1:53 AM, Diego Biurrun di...@biurrun.de wrote:
 On Mon, Jan 21, 2013 at 06:18:22PM -0800, Ronald S. Bultje wrote:
 On Mon, Jan 21, 2013 at 4:04 PM, Diego Biurrun di...@biurrun.de wrote:
  It does not help as an abstraction and adds dsputil dependencies.

 I like the commit. I do want to add, though, that you're not actually
 practically removing the dsputil dependency from a lot of files (at
 build time), even though the dependency is (in a code-sense) no longer
 there. Examples are in vp3.c or vp8.c, but there's likely more.

 Your comment puzzles me.  vp3.c directly uses DSPContext, vp8.c has no
 dependency on dsputil, before or after my patch ...
[..blah..]

$ grep dsputil\.h ../libavcodec/vp*dsp.h
../libavcodec/vp3dsp.h:#include "dsputil.h"
../libavcodec/vp8dsp.h:#include "dsputil.h"

Ronald


Re: [libav-devel] [PATCH] Drop DCTELEM typedef

2013-01-22 Thread Diego Biurrun
On Tue, Jan 22, 2013 at 07:12:21AM -0800, Ronald S. Bultje wrote:
 On Tue, Jan 22, 2013 at 1:53 AM, Diego Biurrun di...@biurrun.de wrote:
  On Mon, Jan 21, 2013 at 06:18:22PM -0800, Ronald S. Bultje wrote:
  On Mon, Jan 21, 2013 at 4:04 PM, Diego Biurrun di...@biurrun.de wrote:
   It does not help as an abstraction and adds dsputil dependencies.
 
  I like the commit. I do want to add, though, that you're not actually
  practically removing the dsputil dependency from a lot of files (at
  build time), even though the dependency is (in a code-sense) no longer
  there. Examples are in vp3.c or vp8.c, but there's likely more.
 
  Your comment puzzles me.  vp3.c directly uses DSPContext, vp8.c has no
  dependency on dsputil, before or after my patch ...
 [..blah..]
 
 $ grep dsputil\.h ../libavcodec/vp*dsp.h
 ../libavcodec/vp3dsp.h:#include "dsputil.h"
 ../libavcodec/vp8dsp.h:#include "dsputil.h"

So it's a game of showing shell output; here's mine:

$ grep dsputil\.h libavcodec/vp*dsp.h
libavcodec/vp3dsp.h:#include "dsputil.h"
libavcodec/vp8dsp.h:#include "dsputil.h"
$ git cherry-pick c2567e6c6771a6a5bd66762e486eae0cd608f7a4
Finished one cherry-pick.
[test a10ebd3] Drop DCTELEM typedef
 163 files changed, 835 insertions(+), 812 deletions(-)
$ grep dsputil\.h libavcodec/vp*dsp.h
$ git log -n 1 --oneline c2567e6c6771a6a5bd66762e486eae0cd608f7a4 | cat
c2567e6 Drop DCTELEM typedef

Diego


Re: [libav-devel] [PATCH] dsputil: remove avg_no_rnd_pixels8.

2013-01-22 Thread Diego Elio Pettenò
On 22/01/2013 03:02, Ronald S. Bultje wrote:
 
 This is never used.

This has a strange effect on the other avg_pixels8_* functions, me and
Luca have been looking into it today — it's not bad, but if we can
stagger this a moment, we might be able to figure it out properly.

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/

Re: [libav-devel] [PATCH] dsputil: remove avg_no_rnd_pixels8.

2013-01-22 Thread Ronald S. Bultje
Hi,

On Tue, Jan 22, 2013 at 8:59 AM, Diego Elio Pettenò
flamee...@flameeyes.eu wrote:
 On 22/01/2013 03:02, Ronald S. Bultje wrote:

 This is never used.

 This has a strange effect on the other avg_pixels8_* functions, me and
 Luca have been looking into it today — it's not bad, but if we can
 stagger this a moment, we might be able to figure it out properly.

You'll probably want to explain what you mean with strange effect?

Ronald


Re: [libav-devel] [PATCH] Drop DCTELEM typedef

2013-01-22 Thread Ronald S. Bultje
Hi,

On Tue, Jan 22, 2013 at 7:49 AM, Diego Biurrun di...@biurrun.de wrote:
 On Tue, Jan 22, 2013 at 07:12:21AM -0800, Ronald S. Bultje wrote:
 On Tue, Jan 22, 2013 at 1:53 AM, Diego Biurrun di...@biurrun.de wrote:
  On Mon, Jan 21, 2013 at 06:18:22PM -0800, Ronald S. Bultje wrote:
  On Mon, Jan 21, 2013 at 4:04 PM, Diego Biurrun di...@biurrun.de wrote:
   It does not help as an abstraction and adds dsputil dependencies.
 
  I like the commit. I do want to add, though, that you're not actually
  practically removing the dsputil dependency from a lot of files (at
  build time), even though the dependency is (in a code-sense) no longer
  there. Examples are in vp3.c or vp8.c, but there's likely more.
 
  Your comment puzzles me.  vp3.c directly uses DSPContext, vp8.c has no
  dependency on dsputil, before or after my patch ...
 [..blah..]

 $ grep dsputil\.h ../libavcodec/vp*dsp.h
 ../libavcodec/vp3dsp.h:#include "dsputil.h"
 ../libavcodec/vp8dsp.h:#include "dsputil.h"

 So it's a game of showing shell output; here's mine:

 $ grep dsputil\.h libavcodec/vp*dsp.h
 libavcodec/vp3dsp.h:#include "dsputil.h"
 libavcodec/vp8dsp.h:#include "dsputil.h"
 $ git cherry-pick c2567e6c6771a6a5bd66762e486eae0cd608f7a4
 Finished one cherry-pick.
 [test a10ebd3] Drop DCTELEM typedef
  163 files changed, 835 insertions(+), 812 deletions(-)
 $ grep dsputil\.h libavcodec/vp*dsp.h
 $ git log -n 1 --oneline c2567e6c6771a6a5bd66762e486eae0cd608f7a4 | cat
 c2567e6 Drop DCTELEM typedef

Very well then.

Ronald


Re: [libav-devel] [PATCH] dsputil: remove avg_no_rnd_pixels8.

2013-01-22 Thread Luca Barbato
On 22/01/13 18:14, Ronald S. Bultje wrote:
 Hi,
 
 On Tue, Jan 22, 2013 at 8:59 AM, Diego Elio Pettenò
 flamee...@flameeyes.eu wrote:
 On 22/01/2013 03:02, Ronald S. Bultje wrote:

 This is never used.

 This has a strange effect on the other avg_pixels8_* functions, me and
 Luca have been looking into it today — it's not bad, but if we can
 stagger this a moment, we might be able to figure it out properly.
 
 You'll probably want to explain what you mean with strange effect?

Basically by removing that code we give enough space for gcc to decide
to inline more code in the 10bit variant of some h264mc functions.

That results in overall more bytes used.

Overall the speed is around the same so it isn't a huge issue.

(tested using 1080p25 content encoded with x264 10bit to exercise all
the interesting paths)

More about it once I'm back on irc =)

I'm not against pushing it since the problem is deep down in the macro
nest and your patch is yet another step to make things more bearable.

lu

[libav-devel] [PATCH] dsputil: remove 9/10 bits hpel functions.

2013-01-22 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

These are never used.
---
 libavcodec/dsputil.c  | 31 -
 libavcodec/dsputil_template.c | 64 ---
 2 files changed, 54 insertions(+), 41 deletions(-)

diff --git a/libavcodec/dsputil.c b/libavcodec/dsputil.c
index 7bead1d..a306583 100644
--- a/libavcodec/dsputil.c
+++ b/libavcodec/dsputil.c
@@ -2689,6 +2689,24 @@ av_cold void ff_dsputil_init(DSPContext* c, AVCodecContext *avctx)
     c->shrink[2]= ff_shrink44;
     c->shrink[3]= ff_shrink88;
 
+#define hpel_funcs(prefix, idx, num) \
+    c->prefix ## _pixels_tab idx [0] = prefix ## _pixels ## num ## _8_c; \
+    c->prefix ## _pixels_tab idx [1] = prefix ## _pixels ## num ## _x2_8_c; \
+    c->prefix ## _pixels_tab idx [2] = prefix ## _pixels ## num ## _y2_8_c; \
+    c->prefix ## _pixels_tab idx [3] = prefix ## _pixels ## num ## _xy2_8_c
+
+    hpel_funcs(put, [0], 16);
+    hpel_funcs(put, [1],  8);
+    hpel_funcs(put, [2],  4);
+    hpel_funcs(put, [3],  2);
+    hpel_funcs(put_no_rnd, [0], 16);
+    hpel_funcs(put_no_rnd, [1],  8);
+    hpel_funcs(avg, [0], 16);
+    hpel_funcs(avg, [1],  8);
+    hpel_funcs(avg, [2],  4);
+    hpel_funcs(avg, [3],  2);
+    hpel_funcs(avg_no_rnd,[0], 16);
+
 #undef FUNC
 #undef FUNCC
 #define FUNC(f, depth) f ## _ ## depth
@@ -2718,7 +2736,6 @@ av_cold void ff_dsputil_init(DSPContext* c, AVCodecContext *avctx)
     c->PFX ## _pixels_tab[IDX][14] = FUNCC(PFX ## NUM ## _mc23, depth);\
     c->PFX ## _pixels_tab[IDX][15] = FUNCC(PFX ## NUM ## _mc33, depth)
 
-
 #define BIT_DEPTH_FUNCS(depth, dct)\
     c->get_pixels= FUNCC(get_pixels   ## dct   , depth);\
     c->draw_edges= FUNCC(draw_edges, depth);\
@@ -2734,18 +2751,6 @@ av_cold void ff_dsputil_init(DSPContext* c, AVCodecContext *avctx)
     c->avg_h264_chroma_pixels_tab[1] = FUNCC(avg_h264_chroma_mc4   , depth);\
     c->avg_h264_chroma_pixels_tab[2] = FUNCC(avg_h264_chroma_mc2   , depth);\
 \
-    dspfunc1(put   , 0, 16, depth);\
-    dspfunc1(put   , 1,  8, depth);\
-    dspfunc1(put   , 2,  4, depth);\
-    dspfunc1(put   , 3,  2, depth);\
-    dspfunc1(put_no_rnd, 0, 16, depth);\
-    dspfunc1(put_no_rnd, 1,  8, depth);\
-    dspfunc1(avg   , 0, 16, depth);\
-    dspfunc1(avg   , 1,  8, depth);\
-    dspfunc1(avg   , 2,  4, depth);\
-    dspfunc1(avg   , 3,  2, depth);\
-    dspfunc1(avg_no_rnd, 0, 16, depth);\
-\
     dspfunc2(put_h264_qpel, 0, 16, depth);\
     dspfunc2(put_h264_qpel, 1,  8, depth);\
     dspfunc2(put_h264_qpel, 2,  4, depth);\
diff --git a/libavcodec/dsputil_template.c b/libavcodec/dsputil_template.c
index bd5c48b..c1199db 100644
--- a/libavcodec/dsputil_template.c
+++ b/libavcodec/dsputil_template.c
@@ -197,15 +197,7 @@ DCTELEM_FUNCS(DCTELEM, _16)
 DCTELEM_FUNCS(dctcoef, _32)
 #endif
 
-#define PIXOP2(OPNAME, OP) \
-static void FUNCC(OPNAME ## _pixels2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){\
-    int i;\
-    for(i=0; i<h; i++){\
-        OP(*((pixel2*)(block  )), AV_RN2P(pixels  ));\
-        pixels+=line_size;\
-        block +=line_size;\
-    }\
-}\
+#define PIXOP3(OPNAME, OP) \
 static void FUNCC(OPNAME ## _pixels4)(uint8_t *block, const uint8_t *pixels, int line_size, int h){\
     int i;\
     for(i=0; i<h; i++){\
@@ -227,20 +219,6 @@ static inline void FUNCC(OPNAME ## _no_rnd_pixels8)(uint8_t *block, const uint8_
     FUNCC(OPNAME ## _pixels8)(block, pixels, line_size, h);\
 }\
 \
-static inline void FUNC(OPNAME ## _no_rnd_pixels8_l2)(uint8_t *dst, const uint8_t *src1, const uint8_t *src2, int dst_stride, \
-                                                int src_stride1, int src_stride2, int h){\
-    int i;\
-    for(i=0; i<h; i++){\
-        pixel4 a,b;\
-        a= AV_RN4P(&src1[i*src_stride1  ]);\
-        b= AV_RN4P(&src2[i*src_stride2  ]);\
-        OP(*((pixel4*)&dst[i*dst_stride  ]), no_rnd_avg_pixel4(a, b));\
-        a= AV_RN4P(&src1[i*src_stride1+4*sizeof(pixel)]);\
-        b= AV_RN4P(&src2[i*src_stride2+4*sizeof(pixel)]);\
-        OP(*((pixel4*)&dst[i*dst_stride+4*sizeof(pixel)]), no_rnd_avg_pixel4(a, b));\
-    }\
-}\
-\
 static inline void FUNC(OPNAME ## _pixels8_l2)(uint8_t *dst, const uint8_t *src1, const uint8_t *src2, int dst_stride, \
                                                 int src_stride1, int src_stride2, int h){\
     int i;\
@@ -283,6 +261,36 @@ static inline void FUNC(OPNAME ## _pixels16_l2)(uint8_t *dst, const uint8_t *src
     FUNC(OPNAME ## _pixels8_l2)(dst+8*sizeof(pixel), src1+8*sizeof(pixel), src2+8*sizeof(pixel), dst_stride, src_stride1, src_stride2, h);\
 }\
 \
+CALL_2X_PIXELS(FUNCC(OPNAME ## _pixels16), FUNCC(OPNAME ## _pixels8), 8*sizeof(pixel))
+
+#define PIXOP4(OPNAME, OP) \
+static void FUNCC(OPNAME ## _pixels2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){\
+    int i;\
+    for(i=0; i<h; i++){\
+        OP(*((pixel2*)(block  )), AV_RN2P(pixels  ));\
+
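
The `hpel_funcs` macro in this patch fills the half-pel tables by token pasting. The following reduced sketch shows the same technique in isolation. `HpelTables`, `STUB`, `STUB4` and `hpel_init` are hypothetical names, the functions are empty stand-ins, and the index is passed as a plain number rather than as `[0]` the way the patch does it.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*op_pixels_func)(uint8_t *block, const uint8_t *pixels,
                               int line_size, int h);

/* Reduced stand-in for the relevant part of DSPContext. */
typedef struct HpelTables {
    op_pixels_func put_pixels_tab[2][4];
    op_pixels_func avg_pixels_tab[2][4];
} HpelTables;

/* Empty stand-ins so that the pasted names resolve to real functions. */
#define STUB(name)                                                   \
static void name(uint8_t *block, const uint8_t *pixels,             \
                 int line_size, int h)                               \
{ (void)block; (void)pixels; (void)line_size; (void)h; }

#define STUB4(prefix, num)                 \
    STUB(prefix ## _pixels ## num ## _8_c)     \
    STUB(prefix ## _pixels ## num ## _x2_8_c)  \
    STUB(prefix ## _pixels ## num ## _y2_8_c)  \
    STUB(prefix ## _pixels ## num ## _xy2_8_c)

STUB4(put, 16)
STUB4(put, 8)
STUB4(avg, 16)
STUB4(avg, 8)

/* prefix selects the table and the function family, num the block width;
 * the four entries are the full-pel, x, y and xy half-pel variants. */
#define hpel_funcs(prefix, idx, num)                                      \
    c->prefix ## _pixels_tab[idx][0] = prefix ## _pixels ## num ## _8_c;     \
    c->prefix ## _pixels_tab[idx][1] = prefix ## _pixels ## num ## _x2_8_c;  \
    c->prefix ## _pixels_tab[idx][2] = prefix ## _pixels ## num ## _y2_8_c;  \
    c->prefix ## _pixels_tab[idx][3] = prefix ## _pixels ## num ## _xy2_8_c

static void hpel_init(HpelTables *c)
{
    hpel_funcs(put, 0, 16);
    hpel_funcs(put, 1,  8);
    hpel_funcs(avg, 0, 16);
    hpel_funcs(avg, 1,  8);
}
```

Because the `_8_c` suffix is spelled out in the macro, only 8-bit functions are ever referenced, which matches the stated goal of dropping the unused 9/10-bit variants.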

Re: [libav-devel] [PATCH 1/2] arm: Add some missing header #includes

2013-01-22 Thread Diego Biurrun
On Mon, Jan 21, 2013 at 10:16:02AM +0100, Diego Biurrun wrote:
 ---
 
 This is a preliminary to make the DCTELEM patch work on ARM.
 
  libavcodec/arm/h264pred_init_arm.c |1 +
  libavcodec/arm/vp3dsp_init_arm.c   |1 +
  libavcodec/arm/vp8dsp_init_arm.c   |1 +
  libavcodec/arm/vp8dsp_init_armv6.c |2 ++
  libavcodec/arm/vp8dsp_init_neon.c  |2 ++
  5 files changed, 7 insertions(+), 0 deletions(-)

OKed on IRC.

Diego


Re: [libav-devel] [PATCH 5/6] mlp: implement support for AVCodecContext.request_channel_layout.

2013-01-22 Thread Justin Ruggles
On 12/31/2012 09:33 AM, Tim Walker wrote:
 Also wrap usage of AVCodecContext.request_channels in FF_API_REQUEST_CHANNELS 
 directives.
 ---
  libavcodec/mlp_parser.c |   29 +++--
  libavcodec/mlpdec.c |   18 ++
  2 files changed, 37 insertions(+), 10 deletions(-)

patch looks ok

-Justin



Re: [libav-devel] [PATCH 6/6] mlp_parser: cosmetics: re-indent.

2013-01-22 Thread Justin Ruggles
On 12/31/2012 09:33 AM, Tim Walker wrote:
 ---
  libavcodec/mlp_parser.h |   20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)

ok

-Justin



[libav-devel] [PATCH] vp3dsp: don't do aligned reads on input.

2013-01-22 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

The input is not guarenteed to be aligned.
---
 libavcodec/vp3dsp.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/vp3dsp.c b/libavcodec/vp3dsp.c
index 1883099..0ce6b81 100644
--- a/libavcodec/vp3dsp.c
+++ b/libavcodec/vp3dsp.c
@@ -282,11 +282,11 @@ static void put_no_rnd_pixels_l2(uint8_t *dst, const uint8_t *src1,
     for (i = 0; i < h; i++) {
         uint32_t a, b;
 
-        a = AV_RN32A(&src1[i * stride]);
-        b = AV_RN32A(&src2[i * stride]);
+        a = AV_RN32(&src1[i * stride]);
+        b = AV_RN32(&src2[i * stride]);
         AV_WN32A(&dst[i * stride], no_rnd_avg32(a, b));
-        a = AV_RN32A(&src1[i * stride + 4]);
-        b = AV_RN32A(&src2[i * stride + 4]);
+        a = AV_RN32(&src1[i * stride + 4]);
+        b = AV_RN32(&src2[i * stride + 4]);
         AV_WN32A(&dst[i * stride + 4], no_rnd_avg32(a, b));
     }
 }
-- 
1.8.0
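
As an illustration of why the aligned read macros are unsafe here, the sketch below does the same eight-sample truncating average using `memcpy` for every 32-bit access, which is valid C for any address. The helper names are hypothetical and this is not how `AV_RN32` is implemented in libavutil; unlike the patch, which keeps the aligned `AV_WN32A` store for `dst`, the sketch uses an alignment-agnostic store as well.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Alignment-agnostic 32-bit load: memcpy makes no alignment assumption. */
static uint32_t read32_unaligned(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Alignment-agnostic 32-bit store. */
static void write32_unaligned(uint8_t *p, uint32_t v)
{
    memcpy(p, &v, sizeof(v));
}

/* Per-byte (a + b) >> 1 on four packed bytes. The low bit of each byte of
 * a ^ b is masked off before the shift so nothing leaks between bytes. */
static uint32_t no_rnd_avg32_sketch(uint32_t a, uint32_t b)
{
    return (a & b) + (((a ^ b) & 0xFEFEFEFEu) >> 1);
}

/* Average eight samples from two sources that may sit at any address. */
static void avg8_no_rnd(uint8_t *dst, const uint8_t *src1, const uint8_t *src2)
{
    for (int off = 0; off < 8; off += 4) {
        uint32_t a = read32_unaligned(src1 + off);
        uint32_t b = read32_unaligned(src2 + off);
        write32_unaligned(dst + off, no_rnd_avg32_sketch(a, b));
    }
}
```

The arithmetic is bytewise, so the result does not depend on endianness. A source pointer such as `buf + 1` works here, whereas a load that assumes 4-byte alignment may fault on architectures without unaligned access.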



[libav-devel] [PATCH 1/2] flac: don't check the number of channels before setting the channel layout.

2013-01-22 Thread Tim Walker
This is unnecessary, as ff_flac_set_channel_layout can handle any number of 
channels.
---
 libavcodec/flac_parser.c |2 +-
 libavcodec/flacdec.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/flac_parser.c b/libavcodec/flac_parser.c
index 3d8e17f..ee92ee3 100644
--- a/libavcodec/flac_parser.c
+++ b/libavcodec/flac_parser.c
@@ -458,7 +458,7 @@ static int get_best_header(FLACParseContext* fpc, const uint8_t **poutbuf,
     }
 
     if (header->fi.channels != fpc->avctx->channels ||
-        (!fpc->avctx->channel_layout && header->fi.channels <= 6)) {
+        !fpc->avctx->channel_layout) {
         fpc->avctx->channels = header->fi.channels;
         ff_flac_set_channel_layout(fpc->avctx);
     }
diff --git a/libavcodec/flacdec.c b/libavcodec/flacdec.c
index 51fd196..f273d14 100644
--- a/libavcodec/flacdec.c
+++ b/libavcodec/flacdec.c
@@ -426,7 +426,7 @@ static int decode_frame(FLACContext *s)
         return ret;
     }
     s->channels = s->avctx->channels = fi.channels;
-    if (!s->avctx->channel_layout && s->channels <= 6)
+    if (!s->avctx->channel_layout)
         ff_flac_set_channel_layout(s->avctx);
     s->ch_mode = fi.ch_mode;
 
-- 
1.7.10.2 (Apple Git-33)



[libav-devel] [PATCH 2/2] flac: add channel layout masks for streams with 7 or 8 channels.

2013-01-22 Thread Tim Walker
They were added to the latest FLAC specification:
https://git.xiph.org/?p=flac-website.git;a=commit;h=65c199a2
---
 libavcodec/flac.c|6 --
 libavcodec/version.h |2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/libavcodec/flac.c b/libavcodec/flac.c
index 32b28d0..aa322b4 100644
--- a/libavcodec/flac.c
+++ b/libavcodec/flac.c
@@ -29,13 +29,15 @@
 
 static const int8_t sample_size_table[] = { 0, 8, 12, 0, 16, 20, 24, 0 };
 
-static const int64_t flac_channel_layouts[6] = {
+static const uint64_t flac_channel_layouts[8] = {
 AV_CH_LAYOUT_MONO,
 AV_CH_LAYOUT_STEREO,
 AV_CH_LAYOUT_SURROUND,
 AV_CH_LAYOUT_QUAD,
 AV_CH_LAYOUT_5POINT0,
-AV_CH_LAYOUT_5POINT1
+AV_CH_LAYOUT_5POINT1,
+AV_CH_LAYOUT_6POINT1,
+AV_CH_LAYOUT_7POINT1
 };
 
 static int64_t get_utf8(GetBitContext *gb)
diff --git a/libavcodec/version.h b/libavcodec/version.h
index 62f2bcd..1b1c403 100644
--- a/libavcodec/version.h
+++ b/libavcodec/version.h
@@ -28,7 +28,7 @@
 
 #define LIBAVCODEC_VERSION_MAJOR 54
 #define LIBAVCODEC_VERSION_MINOR 40
-#define LIBAVCODEC_VERSION_MICRO  0
+#define LIBAVCODEC_VERSION_MICRO  1
 
 #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
LIBAVCODEC_VERSION_MINOR, \
-- 
1.7.10.2 (Apple Git-33)
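
Patch 1/2 relies on `ff_flac_set_channel_layout` coping with any channel count, and this patch extends the table it consults. The body of that function is not shown in the thread, so the following is only a guess at its shape: a bounds-checked table lookup that falls back to 0 (no layout) for counts the table does not cover. `layout_for_channels` and the placeholder masks are hypothetical and are not the real `AV_CH_LAYOUT_*` values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder masks with n low bits set for n channels.
 * These are NOT the real AV_CH_LAYOUT_* constants. */
#define LAYOUT_PLACEHOLDER(n) ((UINT64_C(1) << (n)) - 1)

static const uint64_t channel_layouts[8] = {
    LAYOUT_PLACEHOLDER(1), LAYOUT_PLACEHOLDER(2),
    LAYOUT_PLACEHOLDER(3), LAYOUT_PLACEHOLDER(4),
    LAYOUT_PLACEHOLDER(5), LAYOUT_PLACEHOLDER(6),
    LAYOUT_PLACEHOLDER(7), LAYOUT_PLACEHOLDER(8)
};

/* Return the default layout for a channel count, or 0 when the table has
 * no entry, so callers never index out of bounds. */
static uint64_t layout_for_channels(int channels)
{
    size_t n = sizeof(channel_layouts) / sizeof(channel_layouts[0]);

    if (channels >= 1 && (size_t)channels <= n)
        return channel_layouts[channels - 1];
    return 0;
}
```

With the bounds check inside the lookup, callers no longer need their own guard on the channel count, and growing the table from 6 to 8 entries needs no change at the call sites.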



Re: [libav-devel] [PATCH 1/2] flac: don't check the number of channels before setting the channel layout.

2013-01-22 Thread Justin Ruggles
On 01/22/2013 03:53 PM, Tim Walker wrote:
 This is unnecessary, as ff_flac_set_channel_layout can handle any number of 
 channels.
 ---
  libavcodec/flac_parser.c |2 +-
  libavcodec/flacdec.c |2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/libavcodec/flac_parser.c b/libavcodec/flac_parser.c
 index 3d8e17f..ee92ee3 100644
 --- a/libavcodec/flac_parser.c
 +++ b/libavcodec/flac_parser.c
 @@ -458,7 +458,7 @@ static int get_best_header(FLACParseContext* fpc, const uint8_t **poutbuf,
      }
  
      if (header->fi.channels != fpc->avctx->channels ||
 -        (!fpc->avctx->channel_layout && header->fi.channels <= 6)) {
 +        !fpc->avctx->channel_layout) {
          fpc->avctx->channels = header->fi.channels;
          ff_flac_set_channel_layout(fpc->avctx);
      }
 diff --git a/libavcodec/flacdec.c b/libavcodec/flacdec.c
 index 51fd196..f273d14 100644
 --- a/libavcodec/flacdec.c
 +++ b/libavcodec/flacdec.c
 @@ -426,7 +426,7 @@ static int decode_frame(FLACContext *s)
          return ret;
      }
      s->channels = s->avctx->channels = fi.channels;
 -    if (!s->avctx->channel_layout && s->channels <= 6)
 +    if (!s->avctx->channel_layout)
          ff_flac_set_channel_layout(s->avctx);
      s->ch_mode = fi.ch_mode;
  

LGTM

-Justin



Re: [libav-devel] [PATCH 2/2] flac: add channel layout masks for streams with 7 or 8 channels.

2013-01-22 Thread Justin Ruggles
On 01/22/2013 03:53 PM, Tim Walker wrote:
 They were added to the latest FLAC specification:
 https://git.xiph.org/?p=flac-website.git;a=commit;h=65c199a2
 ---
  libavcodec/flac.c|6 --
  libavcodec/version.h |2 +-
  2 files changed, 5 insertions(+), 3 deletions(-)
 
 diff --git a/libavcodec/flac.c b/libavcodec/flac.c
 index 32b28d0..aa322b4 100644
 --- a/libavcodec/flac.c
 +++ b/libavcodec/flac.c
 @@ -29,13 +29,15 @@
  
  static const int8_t sample_size_table[] = { 0, 8, 12, 0, 16, 20, 24, 0 };
  
 -static const int64_t flac_channel_layouts[6] = {
 +static const uint64_t flac_channel_layouts[8] = {
  AV_CH_LAYOUT_MONO,
  AV_CH_LAYOUT_STEREO,
  AV_CH_LAYOUT_SURROUND,
  AV_CH_LAYOUT_QUAD,
  AV_CH_LAYOUT_5POINT0,
 -AV_CH_LAYOUT_5POINT1
 +AV_CH_LAYOUT_5POINT1,
 +AV_CH_LAYOUT_6POINT1,
 +AV_CH_LAYOUT_7POINT1
  };
  
  static int64_t get_utf8(GetBitContext *gb)
 diff --git a/libavcodec/version.h b/libavcodec/version.h
 index 62f2bcd..1b1c403 100644
 --- a/libavcodec/version.h
 +++ b/libavcodec/version.h
 @@ -28,7 +28,7 @@
  
  #define LIBAVCODEC_VERSION_MAJOR 54
  #define LIBAVCODEC_VERSION_MINOR 40
 -#define LIBAVCODEC_VERSION_MICRO  0
 +#define LIBAVCODEC_VERSION_MICRO  1
  
  #define LIBAVCODEC_VERSION_INT  AV_VERSION_INT(LIBAVCODEC_VERSION_MAJOR, \
 LIBAVCODEC_VERSION_MINOR, \

LGTM

-Justin



Re: [libav-devel] [PATCH] dsputil: x86: Convert some inline asm to yasm

2013-01-22 Thread Diego Biurrun
On Tue, Jan 22, 2013 at 04:40:34PM -0500, Daniel Kang wrote:
 --- a/libavcodec/x86/dsputil_avg_template.c
 +++ b/libavcodec/x86/dsputil_avg_template.c
 @@ -24,781 +24,32 @@
  
  //FIXME the following could be optimized too ...
 +static void DEF(ff_put_no_rnd_pixels16_x2)(uint8_t *block, const uint8_t 
 *pixels, int line_size, int h){
 +DEF(ff_put_no_rnd_pixels8_x2)(block  , pixels  , line_size, h);
 +DEF(ff_put_no_rnd_pixels8_x2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_put_pixels16_y2)(uint8_t *block, const uint8_t *pixels, 
 int line_size, int h){
 +DEF(ff_put_pixels8_y2)(block  , pixels  , line_size, h);
 +DEF(ff_put_pixels8_y2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_put_no_rnd_pixels16_y2)(uint8_t *block, const uint8_t 
 *pixels, int line_size, int h){
 +DEF(ff_put_no_rnd_pixels8_y2)(block  , pixels  , line_size, h);
 +DEF(ff_put_no_rnd_pixels8_y2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16)(uint8_t *block, const uint8_t *pixels, int 
 line_size, int h){
 +DEF(ff_avg_pixels8)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16_x2)(uint8_t *block, const uint8_t *pixels, 
 int line_size, int h){
 +DEF(ff_avg_pixels8_x2)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8_x2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16_y2)(uint8_t *block, const uint8_t *pixels, 
 int line_size, int h){
 +DEF(ff_avg_pixels8_y2)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8_y2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16_xy2)(uint8_t *block, const uint8_t *pixels, 
 int line_size, int h){
 +DEF(ff_avg_pixels8_xy2)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8_xy2)(block+8, pixels+8, line_size, h);
  }

Moving this to a macro and deleting the file seems saner to me.
Maybe there are other opinions though...
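
A sketch of the macro approach suggested above: one macro that builds a 16-wide function from two calls to an 8-wide one, so that the hand-written wrappers in the template file become unnecessary. The name `CALL_2X_PIXELS` and its three-argument form appear in the dsputil_template.c patch earlier in this digest, but its definition is not shown there, so the body below is an assumption; the `_sketch` functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Define function a as two calls to b, the second shifted by n bytes. */
#define CALL_2X_PIXELS(a, b, n)                                            \
static void a(uint8_t *block, const uint8_t *pixels, int line_size, int h) \
{                                                                          \
    b(block,       pixels,       line_size, h);                            \
    b(block + (n), pixels + (n), line_size, h);                            \
}

/* Stand-in for an optimised 8-wide routine: copies 8 samples per row. */
static void put_pixels8_sketch(uint8_t *block, const uint8_t *pixels,
                               int line_size, int h)
{
    for (int i = 0; i < h; i++)
        memcpy(block + i * line_size, pixels + i * line_size, 8);
}

/* Expands to a 16-wide function built from the 8-wide one. */
CALL_2X_PIXELS(put_pixels16_sketch, put_pixels8_sketch, 8)
```

Each wrapper in the quoted template would then collapse to a single macro invocation per function pair.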

 --- a/libavcodec/x86/dsputil_mmx.c
 +++ b/libavcodec/x86/dsputil_mmx.c
 @@ -83,6 +83,147 @@ DECLARE_ALIGNED(16, const xmm_reg,  ff_pb_FE)   = { 
 0xFEFEFEFEFEFEFEFEULL, 0xFEF
  
 +#if HAVE_YASM
 +/* VC-1-specific */
 +#define ff_put_pixels8_mmx ff_put_pixels8_mmxext
 +void ff_put_vc1_mspel_mc00_mmx(uint8_t *dst, const uint8_t *src,
 +   int stride, int rnd)
 +{
 +ff_put_pixels8_mmx(dst, src, stride, 8);
 +}
 +
 +void ff_avg_vc1_mspel_mc00_mmxext(uint8_t *dst, const uint8_t *src,
 +  int stride, int rnd)
 +{
 +ff_avg_pixels8_mmxext(dst, src, stride, 8);
 +}

Is this used outside of VC-1?  If no, this should be split out and moved
to a VC-1-specific file.

 +/***/
 +/* 3Dnow specific */
 +
 +#define DEF(x) x ## _3dnow
 +
 +#include "dsputil_avg_template.c"
 +
 +#undef DEF
 +
 +/***/
 +/* MMXEXT specific */
 +
 +#define DEF(x) x ## _mmxext
 +
 +#include "dsputil_avg_template.c"
 +
 +#undef DEF
 +
 +
 +
 +#endif /* HAVE_YASM */
 +
 +
 +
 +
  #if HAVE_INLINE_ASM

nit: stray large amount of empty lines

 --- a/libavcodec/x86/dsputil.asm
 +++ b/libavcodec/x86/dsputil.asm
 @@ -879,3 +884,986 @@ cglobal avg_pixels16, 4,5,4
  lea  r0, [r0+r2*4]
  jnz   .loop
  REP_RET
 +
 +
 +
 +
 +; HPEL mmxext
 +%macro PAVGB_OP 2

nit: 4 empty lines looks slightly weird; in that file 2 empty lines
between unrelated blocks seem to be the norm.

Diego


Re: [libav-devel] [PATCH] vp3dsp: don't do aligned reads on input.

2013-01-22 Thread Martin Storsjö

On Tue, 22 Jan 2013, Ronald S. Bultje wrote:


From: Ronald S. Bultje rsbul...@gmail.com

The input is not guarenteed to be aligned.
---
libavcodec/vp3dsp.c | 8 
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/vp3dsp.c b/libavcodec/vp3dsp.c
index 1883099..0ce6b81 100644
--- a/libavcodec/vp3dsp.c
+++ b/libavcodec/vp3dsp.c
@@ -282,11 +282,11 @@ static void put_no_rnd_pixels_l2(uint8_t *dst, const uint8_t *src1,
     for (i = 0; i < h; i++) {
         uint32_t a, b;

-        a = AV_RN32A(&src1[i * stride]);
-        b = AV_RN32A(&src2[i * stride]);
+        a = AV_RN32(&src1[i * stride]);
+        b = AV_RN32(&src2[i * stride]);
         AV_WN32A(&dst[i * stride], no_rnd_avg32(a, b));
-        a = AV_RN32A(&src1[i * stride + 4]);
-        b = AV_RN32A(&src2[i * stride + 4]);
+        a = AV_RN32(&src1[i * stride + 4]);
+        b = AV_RN32(&src2[i * stride + 4]);
         AV_WN32A(&dst[i * stride + 4], no_rnd_avg32(a, b));
     }
 }
--
1.8.0


Looks about right, I guess this will fix the fate failures on archs that 
don't support unaligned reads.


// Martin


Re: [libav-devel] [PATCH 2/2] arm: Add mathops.h to ARCH_HEADERS list

2013-01-22 Thread Luca Barbato
On 21/01/13 10:16, Diego Biurrun wrote:
 It is an arch-specific header not suitable for standalone compilation.
 ---
 
 This fixes make checkheaders on ARM.
 
  libavcodec/arm/Makefile |2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)

Ok.



Re: [libav-devel] [PATCH] vp3dsp: don't do aligned reads on input.

2013-01-22 Thread Luca Barbato
On 22/01/13 21:45, Ronald S. Bultje wrote:
 From: Ronald S. Bultje rsbul...@gmail.com

 The input is not guarenteed to be aligned.

guaranteed

 ---
  libavcodec/vp3dsp.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/libavcodec/vp3dsp.c b/libavcodec/vp3dsp.c
 index 1883099..0ce6b81 100644
 --- a/libavcodec/vp3dsp.c
 +++ b/libavcodec/vp3dsp.c
 @@ -282,11 +282,11 @@ static void put_no_rnd_pixels_l2(uint8_t *dst, const uint8_t *src1,
      for (i = 0; i < h; i++) {
          uint32_t a, b;
  
 -        a = AV_RN32A(&src1[i * stride]);
 -        b = AV_RN32A(&src2[i * stride]);
 +        a = AV_RN32(&src1[i * stride]);
 +        b = AV_RN32(&src2[i * stride]);
          AV_WN32A(&dst[i * stride], no_rnd_avg32(a, b));
 -        a = AV_RN32A(&src1[i * stride + 4]);
 -        b = AV_RN32A(&src2[i * stride + 4]);
 +        a = AV_RN32(&src1[i * stride + 4]);
 +        b = AV_RN32(&src2[i * stride + 4]);
          AV_WN32A(&dst[i * stride + 4], no_rnd_avg32(a, b));
      }
  }
 

Patch ok.


Re: [libav-devel] [PATCH] dsputil: x86: Convert some inline asm to yasm

2013-01-22 Thread Luca Barbato
On 22/01/13 22:40, Daniel Kang wrote:
 Specifically dsputil_avg_template.c and mpeg4 qpel

dsputil: x86: Convert mpeg4 qpel and dsputil avg to yasm

Maybe?

 ---
 Remove some cosmetic changes
 ---
  libavcodec/x86/dsputil.asm            |  988 +
  libavcodec/x86/dsputil_avg_template.c |  791 +-
  libavcodec/x86/dsputil_mmx.c          |  927 ---
  libavcodec/x86/h264_qpel.c            |   22 -
  libavcodec/x86/vc1dsp_mmx.c           |    4 +
  5 files changed, 1357 insertions(+), 1375 deletions(-)
 

Looks ok to me if it looks ok to Loren.



Re: [libav-devel] [PATCH] vorbisdsp: convert x86 simd functions from inline asm to yasm.

2013-01-22 Thread Loren Merritt
On Mon, 21 Jan 2013, Ronald S. Bultje wrote:

 From: Ronald S. Bultje rsbul...@gmail.com

 ---
  libavcodec/x86/Makefile         |  1 +
  libavcodec/x86/dsputil_mmx.c    |  3 --
  libavcodec/x86/dsputil_mmx.h    |  2 -
  libavcodec/x86/vorbisdsp.asm    | 83 +
  libavcodec/x86/vorbisdsp_init.c | 77 --
  5 files changed, 92 insertions(+), 74 deletions(-)
  create mode 100644 libavcodec/x86/vorbisdsp.asm

LGTM.

--Loren Merritt


Re: [libav-devel] [PATCH 2/2] Separate h264 qpel from dsputil

2013-01-22 Thread Ronald S. Bultje
Hi,

On Fri, Jan 18, 2013 at 2:37 PM, Diego Biurrun di...@biurrun.de wrote:
[..]

This patch doesn't convert sh4 and ppc. I can do ppc, I don't have
access to a sh4 cross-compilation environment.

Ronald


Re: [libav-devel] [PATCH 2/2] Separate h264 qpel from dsputil

2013-01-22 Thread Ronald S. Bultje
Hi,

On Tue, Jan 22, 2013 at 8:49 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Fri, Jan 18, 2013 at 2:37 PM, Diego Biurrun di...@biurrun.de wrote:
 [..]

 This patch doesn't convert sh4 and ppc. I can do ppc, I don't have
 access to a sh4 cross-compilation environment.

And arm also. I've just fixed ppc, I'll fix arm in a little while. I don't
know what to do with sh4. Someone appears to be hosting a qemu-based
sh4 fate instance, so it is possible to test it without owning the
proper hardware. Anyone fancy trying to make that one work?

Ronald


Re: [libav-devel] [PATCH 2/2] Separate h264 qpel from dsputil

2013-01-22 Thread Ronald S. Bultje
Hi,

On Tue, Jan 22, 2013 at 10:26 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Tue, Jan 22, 2013 at 8:49 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Fri, Jan 18, 2013 at 2:37 PM, Diego Biurrun di...@biurrun.de wrote:
 [..]

 This patch doesn't convert sh4 and ppc. I can do ppc, I don't have
 access to a sh4 cross-compilation environment.

 And arm also. I've just fixed ppc, I'll fix arm in a little while. I don't
 know what to do with sh4. Someone appears to be hosting a qemu-based
 sh4 fate instance, so it is possible to test it without owning the
 proper hardware. Anyone fancy trying to make that one work?

Top patch in https://github.com/rbultje/ffmpeg/commits/wmv2dsp has ppc
(runtime-tested w/ and w/o altivec) and arm (compiletime-tested w/ and
w/o neon), and of course also tested on x86-32/64.

As for sh4, I had a look, and I don't get it. It's an almost literal
copy of some ages-old copy of the qpel C functions with some slight
modifications to do aligned reads and minor other tricks. Doesn't the
C code do some of this itself nowadays (AV_RN32A vs AV_RN32)? Some
code in sh4/qpel.c even still has _c suffixes (such as, no really,
gmc1_c, some mspel functions, etc.).

I guess what I'm saying is, it can be made to work, but I can't test
it and I'm not sure I see the point.

Ronald


Re: [libav-devel] [PATCH] dsputil: x86: Convert some inline asm to yasm

2013-01-22 Thread Daniel Kang
On Tue, Jan 22, 2013 at 5:10 PM, Diego Biurrun di...@biurrun.de wrote:
 On Tue, Jan 22, 2013 at 04:40:34PM -0500, Daniel Kang wrote:
 --- a/libavcodec/x86/dsputil_avg_template.c
 +++ b/libavcodec/x86/dsputil_avg_template.c
 @@ -24,781 +24,32 @@

  //FIXME the following could be optimized too ...
 +static void DEF(ff_put_no_rnd_pixels16_x2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_put_no_rnd_pixels8_x2)(block  , pixels  , line_size, h);
 +DEF(ff_put_no_rnd_pixels8_x2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_put_pixels16_y2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_put_pixels8_y2)(block  , pixels  , line_size, h);
 +DEF(ff_put_pixels8_y2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_put_no_rnd_pixels16_y2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_put_no_rnd_pixels8_y2)(block  , pixels  , line_size, h);
 +DEF(ff_put_no_rnd_pixels8_y2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_avg_pixels8)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16_x2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_avg_pixels8_x2)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8_x2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16_y2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_avg_pixels8_y2)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8_y2)(block+8, pixels+8, line_size, h);
  }
 +static void DEF(ff_avg_pixels16_xy2)(uint8_t *block, const uint8_t *pixels, int line_size, int h){
 +DEF(ff_avg_pixels8_xy2)(block  , pixels  , line_size, h);
 +DEF(ff_avg_pixels8_xy2)(block+8, pixels+8, line_size, h);
  }

 Moving this to a macro and deleting the file seems saner to me.
 Maybe there are other opinions though...

I was trying to avoid more macro hell in dsputil. Suggestions appreciated.
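
To make the suggestion concrete, a rough sketch of what the macro variant
could look like, assuming one generator macro replaces the template file.
PIXELS16 and put_pixels8_demo are hypothetical names invented for this
example; the 8-wide kernel below is a plain C copy standing in for the yasm
function and is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-pixel-wide kernel standing in for a yasm function:
 * copies h rows of 8 bytes each. */
static void put_pixels8_demo(uint8_t *block, const uint8_t *pixels,
                             int line_size, int h)
{
    assert(line_size >= 8 && h >= 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < 8; x++)
            block[y * line_size + x] = pixels[y * line_size + x];
}

/* Token pasting builds NAME16SUFFIX out of two calls to NAME8SUFFIX,
 * which is what each wrapper in the template file does by hand. */
#define PIXELS16(NAME, SUFFIX)                                          \
static void NAME ## 16 ## SUFFIX(uint8_t *block, const uint8_t *pixels, \
                                 int line_size, int h)                  \
{                                                                       \
    NAME ## 8 ## SUFFIX(block,     pixels,     line_size, h);           \
    NAME ## 8 ## SUFFIX(block + 8, pixels + 8, line_size, h);           \
}

/* Expands to: static void put_pixels16_demo(...) */
PIXELS16(put_pixels, _demo)
```

One invocation per function and CPU suffix would then replace the repeated
#define DEF / #include / #undef DEF blocks, at the cost of one more macro
in dsputil_mmx.c.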

 --- a/libavcodec/x86/dsputil_mmx.c
 +++ b/libavcodec/x86/dsputil_mmx.c
 @@ -83,6 +83,147 @@ DECLARE_ALIGNED(16, const xmm_reg,  ff_pb_FE)   = { 
 0xFEFEFEFEFEFEFEFEULL, 0xFEF

 +#if HAVE_YASM
 +/* VC-1-specific */
 +#define ff_put_pixels8_mmx ff_put_pixels8_mmxext
 +void ff_put_vc1_mspel_mc00_mmx(uint8_t *dst, const uint8_t *src,
 +   int stride, int rnd)
 +{
 +ff_put_pixels8_mmx(dst, src, stride, 8);
 +}
 +
 +void ff_avg_vc1_mspel_mc00_mmxext(uint8_t *dst, const uint8_t *src,
 +  int stride, int rnd)
 +{
 +ff_avg_pixels8_mmxext(dst, src, stride, 8);
 +}

 Is this used outside of VC-1?  If no, this should be split out and moved
 to a VC-1-specific file.

The avg and put pixels functions are. I am fairly confident the others aren't.

 +/***/
 +/* 3Dnow specific */
 +
 +#define DEF(x) x ## _3dnow
 +
 +#include "dsputil_avg_template.c"
 +
 +#undef DEF
 +
 +/***/
 +/* MMXEXT specific */
 +
 +#define DEF(x) x ## _mmxext
 +
 +#include "dsputil_avg_template.c"
 +
 +#undef DEF
 +
 +
 +
 +#endif /* HAVE_YASM */
 +
 +
 +
 +
  #if HAVE_INLINE_ASM

 nit: stray large amount of empty lines

Fixed.

 --- a/libavcodec/x86/dsputil.asm
 +++ b/libavcodec/x86/dsputil.asm
 @@ -879,3 +884,986 @@ cglobal avg_pixels16, 4,5,4
  lea  r0, [r0+r2*4]
  jnz   .loop
  REP_RET
 +
 +
 +
 +
 +; HPEL mmxext
 +%macro PAVGB_OP 2

 nit: 4 empty lines looks slightly weird; in that file 2 empty lines
 between unrelated blocks seem to be the norm.

Fixed.