2012/1/10 Jason Garrett-Glaser <[email protected]>:
> You don't need separate MMX and SSE macros. Just use movh.

OK.

> This code is needlessly redundant right now.

Do you mean I could avoid duplicating that code elsewhere, or within that code (which would be achieved by using movh)?

> Can you just scale m2/m3 up by <<7 to begin with, or are they too
> large for this?

They are too big, as they are FP0.14 values: (dist0 << 14) / refdist

> +%if %1 == 0
>
> You need comments explaining these arguments, right now they're opaque.

Can I name them (I'm only beginning with x86asm stuff), or do you mean just leaving a comment at the start of the macro?

> + cmp r6, 0 ; are both multiple of 2^9?
>
> Does this happen often? Is it worth optimizing for? Comment about this.

It did happen fairly often on my test sequences, but maybe that was luck. In the above formula, refdist is:

    int refdist = GET_PTS_DIFF(r->next_pts, r->last_pts)

so it is not obvious that it is always a power of 2. However, it was worth it: the *_TIMER results went down by (very approximately) 10%. I guess I can indeed add this to the commit message to make it more obvious.

> Consider whether the pmac* instructions might be usable here (xop).

I have never worked with them, and I have no means to test them. From this:
http://support.amd.com/us/Embedded_TechDocs/43479.pdf
it seems those vpmac* instructions may indeed improve things, though I have no further clue.

Best regards,

Christophe
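
For reference, here is a minimal C sketch of the FP0.14 weight formula quoted above and of the "are both multiples of 2^9?" test. The helper names fp14_weight and both_multiple_of_512 are hypothetical and only for illustration; this is not the actual decoder or asm code, just the arithmetic under the assumption that the distances are non-negative and refdist is positive.

```c
#include <assert.h>

/* Hypothetical helper (not from the patch): FP0.14 weight for a
 * distance ratio, i.e. (dist << 14) / refdist as in the formula above.
 * Assumes refdist > 0 and dist >= 0. */
static int fp14_weight(int dist, int refdist)
{
    assert(refdist > 0 && dist >= 0);
    return (dist << 14) / refdist;
}

/* Hypothetical helper: returns 1 when both weights are multiples of
 * 2^9, i.e. when the low 9 bits of both values are clear. */
static int both_multiple_of_512(int w1, int w2)
{
    return ((w1 | w2) & 511) == 0;
}
```

For example, with refdist = 8, dist0 = 3 and dist1 = 5 the weights are 6144 and 10240, both multiples of 512, so the check succeeds. With refdist = 3, dist0 = 1 and dist1 = 2 they are 5461 and 10922, and the check fails, which is why the fast path cannot be taken for granted.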
