Hello,

2012/11/30 Loren Merritt <[email protected]>:
> If you increment an index into W and z rather than the pointers
> themselves, then you can eliminate an add and a cmp.

I had already tested that, and I have now redone it:
cglobal sbr_qmf_post_shuffle, 2,4,3,W,z
  mov       r3q, 32*4
  lea       r2q, [zq + (64-4)*4]
  add        zq, r3q
  lea        Wq, [Wq + 2*r3q]
  neg       r3q
.loop:
  mova       m0, [r2q]
  mova       m1, [zq  + r3q]
  xorps      m0, [ps_neg]
  shufps     m0, m0, 0x1B
  mova       m2, m0
  unpcklps   m0, m1
  unpckhps   m2, m1
  mova  [Wq + 2*r3q +  0], m0
  mova  [Wq + 2*r3q + 16], m2
  sub       r2q, 16
  add       r3q, 16
  jl      .loop
  REP_RET

This version is 2 cycles slower on Penryn/Win64 (154 vs 152).
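
For anyone following along without reading the asm, here is a rough scalar C
sketch of the same indexing idea: bias the base pointers past the end, then
run a single negative index up to zero so the loop branch needs no separate
compare. The function name is made up for illustration and is not part of the
patch, and I am assuming ps_neg flips the sign of all four lanes.

```c
#include <stddef.h>

/* Scalar sketch (hypothetical name, not part of the patch).
 * Computes W[k][0] = -z[63 - k] and W[k][1] = z[k] for k = 0..31,
 * assuming ps_neg negates every lane. */
static void qmf_post_shuffle_sketch(float W[32][2], const float *z)
{
    const float *z_lo = z + 32;   /* biased base, like "add zq, r3q" */
    float (*W_end)[2] = W + 32;   /* biased base, like "lea Wq, [Wq + 2*r3q]" */
    const float *z_hi = z + 63;   /* walks downwards, like r2q */
    ptrdiff_t i;

    /* i runs from -32 up to 0, mirroring "add r3q, 16" followed by "jl" */
    for (i = -32; i < 0; i++) {
        W_end[i][0] = -*z_hi--;   /* negated, reversed upper half of z */
        W_end[i][1] = z_lo[i];    /* lower half of z, in order */
    }
}
```

The asm handles 4 floats per iteration and indexes in bytes, so its counter
runs from -128 to 0 in steps of 16 instead.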

> 4 space tabs.

OK, I was a bit puzzled and was looking for trailing whitespace and the
like. So you mean a style change, then.
It is a bit cumbersome to redo all the patches just for that.

-- 
Christophe
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
