Hi,

On Tue, May 24, 2011 at 10:45 AM, Kostya <[email protected]> wrote:
> On Tue, May 24, 2011 at 10:38:10AM -0400, Ronald S. Bultje wrote:
>> This code is not faster, therefore it just needlessly complicates
>> the source without providing any apparent benefit.
>>
>> Profiling YUV-to-RGB32 conversion:
>> mmx:
>> 6041918 dezicycles in x, 8192 runs, 0 skips
>> 6084080 dezicycles in x, 8192 runs, 0 skips
>> 6051309 dezicycles in x, 8192 runs, 0 skips
>> 6073144 dezicycles in x, 8192 runs, 0 skips
>> (6062613 avg)
>> mmx2:
>> 6071750 dezicycles in x, 8192 runs, 0 skips
>> 6050798 dezicycles in x, 8192 runs, 0 skips
>> 6070844 dezicycles in x, 8192 runs, 0 skips
>> 6057621 dezicycles in x, 8192 runs, 0 skips
>> (6062753 avg)
>> ---
>>  libswscale/x86/swscale_template.c |   10 ----------
>>  1 files changed, 0 insertions(+), 10 deletions(-)
>
> In general it may differ on different x86 implementations (AMD K6, Via C3,
> Intel Atom and Celeron II) but in general looks ok

Jason has similar concerns, so I'm fine with not committing this at
this point. I would like to see evidence of when this _does_ help,
though. Right now some conversion functions use movntq and others
don't, and there isn't really any pattern to it other than
random()ness. I don't like that.
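
For reference, the averages quoted above can be rechecked with a short
script. This is only a sketch: the variable and function names are
illustrative, and the numbers are copied from the profiling output in
the patch description. It also compares the difference between the two
averages with the run-to-run spread inside each group.

```python
from statistics import mean

# Dezicycle counts copied from the quoted profiling output (8192 runs each).
mmx = [6041918, 6084080, 6051309, 6073144]
mmx2 = [6071750, 6050798, 6070844, 6057621]


def summarize(runs):
    """Return (average, max - min) for one group of measurements."""
    return mean(runs), max(runs) - min(runs)


mmx_avg, mmx_spread = summarize(mmx)
mmx2_avg, mmx2_spread = summarize(mmx2)

# Difference between the two averages, absolute and relative to mmx.
delta = mmx2_avg - mmx_avg
relative = delta / mmx_avg

print(f"mmx  avg {mmx_avg:.2f}, spread {mmx_spread}")
print(f"mmx2 avg {mmx2_avg:.2f}, spread {mmx2_spread}")
print(f"delta {delta:.2f} dezicycles ({relative:.4%})")
```

The gap between the two averages is far smaller than the spread within
either group, so on this machine the two variants are indistinguishable.
It says nothing about other CPUs.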

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
