Hi,

On Tue, May 24, 2011 at 10:45 AM, Kostya <[email protected]> wrote:
> On Tue, May 24, 2011 at 10:38:10AM -0400, Ronald S. Bultje wrote:
>> This code is not faster, therefore it just needlessly complicates
>> the source without providing any apparent benefit.
>>
>> Profiling YUV-to-RGB32 conversion:
>> mmx:
>> 6041918 dezicycles in x, 8192 runs, 0 skips
>> 6084080 dezicycles in x, 8192 runs, 0 skips
>> 6051309 dezicycles in x, 8192 runs, 0 skips
>> 6073144 dezicycles in x, 8192 runs, 0 skips
>> (6062613 avg)
>> mmx2:
>> 6071750 dezicycles in x, 8192 runs, 0 skips
>> 6050798 dezicycles in x, 8192 runs, 0 skips
>> 6070844 dezicycles in x, 8192 runs, 0 skips
>> 6057621 dezicycles in x, 8192 runs, 0 skips
>> (6062753 avg)
>> ---
>>  libswscale/x86/swscale_template.c |   10 ----------
>>  1 files changed, 0 insertions(+), 10 deletions(-)
>
> In general it may differ on different x86 implementations (AMD K6, Via C3,
> Intel Atom and Celeron II) but in general looks ok

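For what it's worth, the averages quoted above can be double-checked with a few lines of C. This is only a minimal sketch: the helper names (mean, spread, within_noise) are made up for illustration, and comparing the difference of the means against the min-to-max spread of each set is a crude assumption about noise, not a proper statistical test.

```c
#include <assert.h>
#include <stddef.h>

/* Timings (in dezicycles) copied from the quoted profiling output. */
static const long mmx_runs[4]  = { 6041918, 6084080, 6051309, 6073144 };
static const long mmx2_runs[4] = { 6071750, 6050798, 6070844, 6057621 };

/* Arithmetic mean of n timing samples. */
static double mean(const long *v, size_t n)
{
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += (double)v[i];
    return sum / (double)n;
}

/* Max minus min of n samples; a rough stand-in for run-to-run noise. */
static long spread(const long *v, size_t n)
{
    long lo, hi;
    assert(n > 0);
    lo = hi = v[0];
    for (size_t i = 1; i < n; i++) {
        if (v[i] < lo) lo = v[i];
        if (v[i] > hi) hi = v[i];
    }
    return hi - lo;
}

/*
 * Returns 1 if the two means differ by less than the smaller of the
 * two spreads (hypothetical criterion, chosen only for this sketch).
 */
static int within_noise(const long *a, const long *b, size_t n)
{
    double diff = mean(a, n) - mean(b, n);
    long sa = spread(a, n), sb = spread(b, n);
    long smin = sa < sb ? sa : sb;
    if (diff < 0)
        diff = -diff;
    return diff < (double)smin;
}
```

Calling within_noise(mmx_runs, mmx2_runs, 4) on these figures compares a difference in means of about 140 dezicycles against spreads in the tens of thousands, which is why the two variants are indistinguishable on this machine.
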
Jason has similar concerns, so I'm fine with not committing this at
this point. I would like to see evidence of when this _does_ help.
Right now some conversion functions use movntq and others don't, and
there isn't really any pattern to it other than random()ness. I don't
like that.

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
