Hi,

On Mon, Jul 4, 2011 at 10:09 AM, Daniel Kang <[email protected]> wrote:
> +INIT_MMX
> +PRED8x8_DC mmxext
> +PRED8x8_DC sse2

You seem to do this "keep in mmx regs for sse2" for all DCs. Why? It seems
you can save several instructions, plus the overhead of moving from mmx to
xmm, by doing everything in xmm registers...

> +    lea r2, [r1+r1*2]

If you do this one a little earlier (where you do the vertical DC), you can
use it instead of having to do lea r0, [r0+r1*2].

> +cglobal pred8x8_dc_10_%1, 2,4
[..]
> +    mov r0, r4

That seems wrong: you declare that you use 4 registers, but you also use
r4. I think on x86-64 you can use r10/r11, and on x86-32 you can avoid the
mov and just restore the value from r0m, if you want.

> +INIT_MMX
> +PRED8x8_TOP_DC mmxext
> +PRED8x8_TOP_DC sse2

Same here as above (mmx->xmm moves).

> +;-----------------------------------------------------------------------------
> +;void pred8x8l_dc(pixel *src, int has_topleft, int has_topright, int stride)
> +;-----------------------------------------------------------------------------
> +%macro PRED8x8L_DC 1
> +cglobal pred8x8l_dc_10_%1, 4,5,8
> +    sub r0, r3
> +    lea r4, [r0+r3*2]
> +    mova m0, [r0+r3*1-16]
> +    punpckhwd m0, [r0+r3*0-16]

When I measured, SIMD vertical DC was never faster than doing this part in
scalar, as you do for horizontal. Did you measure this and compare it to a
scalar implementation of vertical DC?

> +%if mmsize==16
> +    mova m0, [r0+ 0]
> +    mova m1, [r0+16]
> +%else
> +    movq m0, [r0+ 0]
> +    movq m1, [r0+ 8]
> +    movq m2, [r0+16]
> +    movq m3, [r0+24]
> +%endif

    mova m0, [r0+0]
    mova m1, [r0+mmsize]
%if mmsize==8
    mova m2, [r0+16]
    mova m3, [r0+24]
%endif

> +;-----------------------------------------------------------------------------
> +; void pred16x16_horizontal(pixel *src, int stride)
> +;-----------------------------------------------------------------------------
> +%macro PRED16x16_HORIZONTAL 1
> +cglobal pred16x16_horizontal_10_%1, 2,3
> +    sub r0, r1
[..]
> +    movd m0, [r0+r1*1-4]
> +    movd m1, [r0+r1*2-4]

Why the sub?
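
As a rough illustration of the scalar alternative raised for the DC sums
above, here is a minimal C sketch, not code from the patch. It assumes that
"this part" refers to summing the edge pixels next to the block, that 10-bit
samples are stored as uint16_t, and that stride is counted in pixels rather
than bytes. The function names are hypothetical, and the sketch deliberately
skips the edge filtering and has_topleft/has_topright handling that
pred8x8l_dc needs; it only shows the plain (top + left + 8) >> 4 DC over an
8x8 block.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t pixel; /* assumption: 10-bit samples held in 16 bits */

/* Hypothetical helper: scalar sum of the 8 pixels left of the block.
 * stride is in pixels here (an assumption; the asm uses byte strides). */
static unsigned sum_left8(const pixel *src, ptrdiff_t stride)
{
    unsigned sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y * stride - 1];
    return sum;
}

/* Hypothetical helper: scalar sum of the 8 pixels above the block. */
static unsigned sum_top8(const pixel *src, ptrdiff_t stride)
{
    unsigned sum = 0;
    for (int x = 0; x < 8; x++)
        sum += src[x - stride];
    return sum;
}

/* Unfiltered 8x8 DC prediction: average the 16 neighbours with rounding
 * and write the result to every pixel of the block. */
static void pred8x8_dc_scalar_sketch(pixel *src, ptrdiff_t stride)
{
    unsigned dc = (sum_left8(src, stride) + sum_top8(src, stride) + 8) >> 4;

    for (int y = 0; y < 8; y++)
        for (int x = 0; x < 8; x++)
            src[y * stride + x] = (pixel)dc;
}
```

Whether scalar sums like these beat the SIMD loads is exactly what would
need to be benchmarked; the sketch only makes the comparison concrete.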

Ronald
