I looked at different memcpy routines for transcode and found that the optimized memcpy presented in AMD's Optimization guide was easily twice as fast as the libc version. I'd be willing to bet that 3-4x would be possible on Athlon64s (because they have twice as many SSE registers). It works on aligned and unaligned data too, so it is a drop-in replacement. I've never looked at the renderer implementation in MythTV; would this be a worthwhile optimization?

Has anybody tried the fast_memcpy routines in MythTV?
I am interested in trying these routines.
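
For anyone who wants something concrete to benchmark, here is a rough sketch of the general idea: an SSE2 copy loop that handles both aligned and unaligned buffers. This is not the routine from AMD's guide and not the fast_memcpy code mentioned above; the name sketch_sse2_memcpy, the 64-byte threshold, and the unroll factor are all my own assumptions. It aligns the destination first, then moves 64 bytes per iteration with unaligned loads and aligned stores, and falls back to plain memcpy when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __SSE2__
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical illustration only: not AMD's routine, not MythTV code. */
static void *sketch_sse2_memcpy(void *dst, const void *src, size_t n)
{
#ifdef __SSE2__
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Small copies: setup cost outweighs any gain (threshold is a guess). */
    if (n < 64)
        return memcpy(dst, src, n);

    /* Copy 0..15 leading bytes so the destination is 16-byte aligned. */
    size_t head = (16 - ((uintptr_t)d & 15)) & 15;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;
    assert(((uintptr_t)d & 15) == 0);

    /* Main loop: 64 bytes per pass. The source may still be unaligned,
       so use unaligned loads; the stores are aligned. */
    while (n >= 64) {
        __m128i r0 = _mm_loadu_si128((const __m128i *)(s + 0));
        __m128i r1 = _mm_loadu_si128((const __m128i *)(s + 16));
        __m128i r2 = _mm_loadu_si128((const __m128i *)(s + 32));
        __m128i r3 = _mm_loadu_si128((const __m128i *)(s + 48));
        _mm_store_si128((__m128i *)(d + 0), r0);
        _mm_store_si128((__m128i *)(d + 16), r1);
        _mm_store_si128((__m128i *)(d + 32), r2);
        _mm_store_si128((__m128i *)(d + 48), r3);
        s += 64;
        d += 64;
        n -= 64;
    }

    /* Remaining 0..63 bytes. */
    memcpy(d, s, n);
    return dst;
#else
    /* No SSE2 at compile time: just use the libc version. */
    return memcpy(dst, src, n);
#endif
}
```

Whether this beats a modern libc memcpy is something you would have to measure on your own hardware; the speedups quoted above are for the AMD routine, not for this sketch.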
_______________________________________________
mythtv-dev mailing list
http://mythtv.org/cgi-bin/mailman/listinfo/mythtv-dev
