On Tue, Jul 12, 2011 at 5:05 PM, Soeren Sandmann <[email protected]> wrote:
> Taekyun Kim <[email protected]> writes:
>
>> On 07/11/2011 09:18 PM, Soeren Sandmann wrote:
>>> This performance regression was introduced when the "simple repeat" code
>>> was removed. But I'm not sure hacking it into the ARM backend is the
>>> right plan. See this mail for a different approach:
>>>
>>> http://lists.freedesktop.org/archives/pixman/2010-December/000815.html
>>>
>>> I have a branch with a start on doing it that way here:
>>>
>>> http://cgit.freedesktop.org/~sandmann/pixman/log/?h=simple-repeat
>>>
>>> which may or may not be useful as a starting point. (I'd be interested
>>> in seeing what the benchmark results of that branch are).
>>
>> It seems to be the right place where we can put simple repeat codes.
>> It can handle simple repeat for both sse2 and ARM at common place.
>>
>> I'm a bit worried that tiling does not give us good memory access patterns
>> causing cache overhead. 1 x n source images would be as slow as 90 degree
>> rotation. Memory buffer will be accessed in vertical order.
>
> Yeah, that is a problem, and that was in fact one of the reasons the
> original 'simple repeat' code was deleted. It's memory access pattern
> for 1xn images was really bad. It may be that adding this support to
> the ARM backend, as you did, is the better way.

Actually, I agree with your older comment. Adding this support exclusively
to the ARM backend is not so great. The slow paths taken for normal repeat
were also spotted by Mozilla [1], and I guess they are interested in getting
this issue fixed for all platforms, not just ARM.

So far, reverting the deletion of the simple repeat code has also seemed to
be a fairly usable solution. And the problem with 1xn source images is a bit
overrated, even though it would surely be nice to get it fixed. A more
cache-efficient memory access pattern, obtained by extending the source
images, can probably be applied to the 'fast_composite_tiled_repeat'
function.

I would also like to see benchmark results using cairo traces (for the
current pixman master, for the proposed patches, and maybe also with the
simple repeat code deletion reverted), just to be sure that we don't get any
unexpected performance regressions. After all, adding support for normal
repeat to the standard fast paths touches frequently used parts of the code.

1. https://bugzilla.mozilla.org/show_bug.cgi?id=640250#c5

--
Best regards,
Siarhei Siamashka
