Hello,

2012/11/30 Loren Merritt <[email protected]>:
> If you increment an index into W and z rather than the pointers
> themselves, then you can eliminate an add and a cmp.

I had already tested that, and have now redone it:

cglobal sbr_qmf_post_shuffle, 2,4,3,W,z
    mov       r3q, 32*4
    lea       r2q, [zq + (64-4)*4]
    add       zq, r3q
    lea       Wq, [Wq + 2*r3q]
    neg       r3q
.loop:
    mova      m0, [r2q]
    mova      m1, [zq + r3q]
    xorps     m0, [ps_neg]
    shufps    m0, m0, 0x1B
    mova      m2, m0
    unpcklps  m0, m1
    unpckhps  m2, m1
    mova  [Wq + 2*r3q +  0], m0
    mova  [Wq + 2*r3q + 16], m2
    sub       r2q, 16
    add       r3q, 16
    jl        .loop
    REP_RET

It's 2 cycles slower on Penryn/Win64 (154 vs 152).

> 4 space tabs.

OK, I was a bit puzzled and was looking for trailing whitespace and the
like. You mean a style change, then. It is a bit cumbersome to redo all
the patches because of that.

--
Christophe
