On 04/22/2011 08:10 PM, Loren Merritt wrote:
> On Fri, 22 Apr 2011, Justin Ruggles wrote:
>
>> Could someone test the patch below with a modern Intel CPU other than Atom?
>> I'm getting slower results for the SSE version than the MMX version on
>> Athlon64, but SSE is faster on Atom. I'm guessing it's another Athlon issue,
>> but it could be something else...
>
> penryn:
> 3656 +/- 3 6ch c
> 1475 +/- 3 6ch mmx
> 1516 +/- 1 6ch sse
> 1091 +/- 5 6ch sse patched
> 908 +/- 1 2ch c
> 468 +/- 2 2ch mmx
> 314 +/- 1 2ch sse
>
> conroe:
> 3221 +/- 6 6ch c
> 1295 +/- 5 6ch mmx
> 1672 +/- 3 6ch sse
> 1010 +/- 5 6ch sse patched
> 818 +/- 1 2ch c
> 403 +/- 2 2ch mmx
> 296 +/- 3 2ch sse
>
> I don't know why conroe got lower totals than penryn, when every single
> instruction is equal or slower in isolation.

Awesome. Thanks for the patch too! Here are my Athlon64 and Atom benchmarks
with your patch included:

athlon64:
5391 6ch c
2095 6ch mmx
2106 6ch sse
1067 2ch c
643 2ch mmx
636 2ch sse

atom:
7288 6ch c
4568 6ch mmx
2796 6ch sse
2884 2ch c
1413 2ch mmx
715 2ch sse

With the patch, the Athlon64 timings for the MMX and SSE versions are close
enough that we don't really need a special case, and SSE is significantly
faster on Intel.

Thanks,
Justin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
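
For anyone who wants to check the "close enough" and "significantly faster" claims against the numbers in this thread, here is a rough Python sketch that computes the MMX/SSE timing ratio per CPU. The table layout and the helper name `sse_speedup` are made up for illustration and are not part of the patch; for penryn and conroe the 6ch SSE entry uses the "sse patched" row, and the 2ch SSE entry is the only 2ch SSE row that was reported.

```python
# Timings copied from the benchmarks in this thread (lower is faster).
# Only ratios are computed, so the timer units do not matter here.
# For penryn/conroe, the 6ch "sse" value is the "6ch sse patched" row.
TIMINGS = {
    "penryn":   {"6ch": {"mmx": 1475, "sse": 1091}, "2ch": {"mmx": 468, "sse": 314}},
    "conroe":   {"6ch": {"mmx": 1295, "sse": 1010}, "2ch": {"mmx": 403, "sse": 296}},
    "athlon64": {"6ch": {"mmx": 2095, "sse": 2106}, "2ch": {"mmx": 643, "sse": 636}},
    "atom":     {"6ch": {"mmx": 4568, "sse": 2796}, "2ch": {"mmx": 1413, "sse": 715}},
}


def sse_speedup(cpu, layout):
    """Return mmx_time / sse_time; values above 1.0 mean SSE is faster."""
    row = TIMINGS[cpu][layout]
    return row["mmx"] / row["sse"]


if __name__ == "__main__":
    for cpu, layouts in TIMINGS.items():
        for layout in layouts:
            print(f"{cpu:9s} {layout}: sse is {sse_speedup(cpu, layout):.2f}x mmx")
```

A ratio near 1.0 (as on Athlon64) means the two versions are effectively tied, which is the case for dropping the special case; ratios well above 1.0 are where SSE wins.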
