On Wed, Aug 31, 2011 at 8:12 AM, Soeren Sandmann <[email protected]> wrote:
> Matt Turner <[email protected]> writes:
>
>> I've been trying to figure out if the ARM iwmmxt inline assembly makes
>> any difference at all. I think the conclusion is that it does not.
>> Updated code is here:
>> http://cgit.freedesktop.org/~mattst88/pixman/log/?h=iwmmxt-optimizations
>
> You mean that inline assembly in mmx_fill() doesn't make a difference?

That, and mmx_blt().

>> See
>> http://people.freedesktop.org/~mattst88/pixman-iwmmxt-benchdata.txt
>
> The lowlevel-blt benchmark doesn't hit the fill and blt routines at all,
> so this data doesn't support the conclusion that inline assembly in
> mmx_fill() and mmx_blt() makes no difference.

Well, that explains it.

>> Never does using inline assembly seem to make any sort of meaningful
>> difference over simply compiling pixman-mmx.c for ARM/iwmmxt. I tried
>> checking the alignment in the 'wip' commit in the blt function to
>> avoid a lot of unnecessary walign instructions, but as you can see
>> from the benchmark results, it doesn't help anything.
>
> The cairo-trace tests are better benchmarks to use in general because
> they reflect real-world use. lowlevel-blt-bench really should only be
> used for the case where you are optimizing a specific compositing
> routine.

OK, I'll run cairo-trace to determine the effect of the inline
assembly. I think the addition of inline assembly could go in as
follow-on patches though, right?

Matt
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
