Matt Turner <[email protected]> writes:

>> - add_8_8: Here too you also only do the optimization in the edge
>>   cases.
>
> I can't (at least with lowlevel-blt-bench) measure any performance
> improvements by doing the optimization in the middle of scanlines. I
> suppose it's because the optimization would only skip a block of 4
> pixels when all are zero. That might be an uncommon case in a
> synthetic benchmark like lowlevel-blt-bench. Another thing is that
> you've got to wait until the src is loaded before you can know whether
> you need to load dst, which potentially doubles the amount of time
> you're waiting for memory.
>
> I think because of the extra memory latency it might not be worth
> doing the optimization.
>
> On the other hand, as you can see it does help when doing the
> optimization on the edge cases.

Is there a reasonable explanation for why it would be a win for the edge
cases and not for the middle cases?

If you really think this patch is an improvement, I probably won't
complain too much, since you are the one with the hardware and the one
doing the work. But I have to say that data-dependent optimizations
that are only shown to improve lowlevel-blt-bench, and only in
particular special cases, feel a bit too much like voodoo for my taste.

> I'll plan to try to find some cairo traces that exercise the in_*
> functions. It doesn't look like they are very important, since we don't
> have NEON fast paths for them! :)

They used to be very important for Evince, but it may be that newer
versions of cairo end up taking different paths.

> With the in_8_8 patch dropped and the other comments addressed, good to
> commit?

I guess.


Thanks,
Soren
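
For reference, the kind of data-dependent skip being discussed can be
sketched in plain scalar C. This is only an illustration under stated
assumptions, not the code from the patch: the names add_sat_u8 and
add_8_8_scanline_sketch are hypothetical, and the real fast path works
on SIMD registers rather than bytes. The sketch reads 4 source pixels
first and touches dst only when at least one of them is non-zero, which
is also why the dst access has to wait on the src load.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Saturating add of two 8-bit alpha values. */
static uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > 255u ? 255u : (uint8_t)sum;
}

/*
 * Hypothetical scalar stand-in for one scanline of an a8 ADD a8
 * operation. Returns the number of 4-pixel blocks that were skipped
 * because every source pixel in the block was zero.
 */
static size_t add_8_8_scanline_sketch(uint8_t *dst, const uint8_t *src,
                                      size_t width)
{
    size_t skipped = 0;
    size_t i = 0;

    while (width - i >= 4) {
        uint32_t s;

        /* Load 4 source pixels; memcpy avoids unaligned-access issues. */
        memcpy(&s, src + i, sizeof s);

        if (s == 0) {
            /* Adding zero leaves dst unchanged, so dst is never touched. */
            skipped++;
        } else {
            for (size_t k = 0; k < 4; k++)
                dst[i + k] = add_sat_u8(dst[i + k], src[i + k]);
        }
        i += 4;
    }

    /* Trailing pixels that do not fill a whole block. */
    for (; i < width; i++) {
        if (src[i] != 0)
            dst[i] = add_sat_u8(dst[i], src[i]);
    }

    return skipped;
}
```

The returned skip count is there only to make the behaviour easy to
observe; whether skipping pays off depends on how often whole blocks of
source pixels are zero and on the memory latency mentioned above.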
