On Wed, 07 Aug 2013 14:30:09 +0100, Hendrik Leppkes <[email protected]> wrote:
Personally I just think we should pursue getting an x86 version as soon as possible, and then avoid any added complexity. That should then cover the grand majority of all systems.
Well, I like to think I'm a competent ARM coder, but I'm afraid I can't oblige with x86, so I would need the help of another volunteer.

For what it's worth, my ARM implementation scans for 16-bit aligned 16-bit words which (interpreted big-endian) have value 0x0000 or 0x0001 - it can test two of these in one 32-bit word easily enough - and then winds back one byte further if the word was preceded by a 0x00 byte. It uses prefetch, since the source buffers are typically much larger than the L1 cache, so the data must be mostly uncached.

Anything cleverer (such as trying to special-case the emulation-prevention 0x00 0x00 0x03 sequences) didn't turn out to be worth doing, according to my profiling results. Of course, results may vary on different architectures (or even on more modern ARMs than the one I've been using, which has no NEON coprocessor).

Ben
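
For readers without the ARM source to hand, the scan described above can be sketched in portable C. This is only an illustration, not the actual ARM implementation: the function names find_candidate and find_start_code are hypothetical, the prefetch and the two-halfwords-per-32-bit-load trick are left out, and "aligned" is taken to mean an even offset from the start of the buffer passed in. The idea it relies on is that every 0x00 0x00 0x01 start code covers an aligned 16-bit word whose big-endian value is 0x0000 (even start offset) or 0x0001 (odd start offset).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first start-code
 * candidate in buf, or size if there is none.  A candidate is an
 * aligned (even-offset) 16-bit word whose big-endian value is 0x0000
 * or 0x0001, wound back by one byte if the preceding byte is 0x00. */
static size_t find_candidate(const uint8_t *buf, size_t size)
{
    for (size_t i = 0; i + 1 < size; i += 2) {
        unsigned w = ((unsigned)buf[i] << 8) | buf[i + 1];
        if (w <= 1) {
            if (i > 0 && buf[i - 1] == 0)
                return i - 1;   /* start code may begin one byte earlier */
            return i;
        }
    }
    return size;
}

/* Hypothetical caller: confirm candidates byte-wise and return the
 * offset of the first 0x00 0x00 0x01 sequence, or size if none. */
static size_t find_start_code(const uint8_t *buf, size_t size)
{
    size_t pos = 0;
    while (pos + 3 <= size) {
        /* Skip ahead to the next candidate; no start code can begin
         * before it. */
        pos += find_candidate(buf + pos, size - pos);
        if (pos + 3 > size)
            break;
        if (buf[pos] == 0 && buf[pos + 1] == 0 && buf[pos + 2] == 1)
            return pos;
        pos++;                  /* false positive: resume after it */
    }
    return size;
}
```

The split into a cheap candidate scan plus a byte-wise confirmation is a choice made for this sketch; it keeps the fast path to one comparison per 16-bit word and treats emulation-prevention sequences as ordinary false positives rather than special-casing them.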
