On Tuesday 16 March 2010, Alexander Larsson wrote:
> On Tue, 2010-03-16 at 09:49 +0200, Siarhei Siamashka wrote:
> > On Monday 15 March 2010, Alexander Larsson wrote:
> > > On Mon, 2010-03-15 at 17:05 +0200, Siarhei Siamashka wrote:
> > > > Really good performance improvements for bilinear scaling are
> > > > going to come from SIMD optimizations. To make it happen, scaler
> > > > core needs to be isolated into a small simple function with a
> > > > minimal number of checks and branches.
> > >
> > > http://cgit.freedesktop.org/~alexl/pixman/commit/?h=fast-bilinear&id=a79daa8453560b4193b848e51b4942dcdcd74c8d
> > >
> > > The "/* Main columns: */" part there is probably a good start for
> > > that.
> >
> > Yes, I know :) Just a switch on different repeat cases needs to be
> > eliminated and it should become quite easy to vectorize. Removal of
> > this switch may also improve performance of generic C code a bit more.
>
> Switch? The main (i.e. non-border) pixels have straight linear code:
>
> http://cgit.freedesktop.org/~alexl/pixman/tree/pixman/pixman-fast-path.c?h=fast-bilinear#n1854
>
> There is no switch (or other repeat handling) in there, because we're
> using FAST_PATH_SAMPLES_COVERS_CLIP.

Yeah, looked at the wrong loop, sorry.

This one would benefit from knowing the exact number of iterations to run
beforehand (and thus eliminating the 'vx < max_vx - pixman_fixed_1' part of
the condition), so that this loop could be unrolled easily. And isolating this
inner loop into a separate function with clearly defined input/output
arguments would help.

But this is a natural next step for optimization. The current step needs to be
finalized and committed first.

-- 
Best regards,
Siarhei Siamashka
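
A rough sketch of the idea discussed above, not the code from the
fast-bilinear branch. It assumes the other half of the loop condition is a
plain pixel counter, that unit_x is positive and vx is non-negative, and it
handles a single 8-bit channel only. The names fixed_16_16, FIXED_1,
count_main_columns and scale_main_columns are hypothetical stand-ins invented
for this illustration (the first two play the role of pixman_fixed_t and
pixman_fixed_1). The first function computes up front how many iterations the
'vx < max_vx - pixman_fixed_1' test would allow, so the second one can be a
branch-free counted loop with clearly defined inputs and outputs:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for pixman_fixed_t / pixman_fixed_1 (16.16 fixed point). */
typedef int32_t fixed_16_16;
#define FIXED_1 ((fixed_16_16) 0x10000)

/* How many times a loop of the form
 *     while (width_left-- > 0 && vx < max_vx - FIXED_1) { ...; vx += unit_x; }
 * would run, computed without iterating. */
int
count_main_columns (fixed_16_16 vx, fixed_16_16 unit_x,
                    fixed_16_16 max_vx, int width_left)
{
    fixed_16_16 limit = max_vx - FIXED_1;
    int64_t n;

    assert (unit_x > 0);
    if (vx >= limit)
        return 0;

    /* ceil ((limit - vx) / unit_x), done in 64 bits to avoid overflow */
    n = ((int64_t) limit - vx + unit_x - 1) / unit_x;
    return n < width_left ? (int) n : width_left;
}

/* Isolated inner loop: exactly n iterations, no bounds checks inside.
 * top/bottom are the two source rows, wy is the vertical weight (0..256).
 * The caller guarantees that x + 1 stays inside both rows, which is what
 * the removed 'vx < max_vx - FIXED_1' test used to ensure. */
void
scale_main_columns (uint8_t *dst, const uint8_t *top, const uint8_t *bottom,
                    int wy, fixed_16_16 vx, fixed_16_16 unit_x, int n)
{
    int i;

    for (i = 0; i < n; i++)
    {
        int x  = vx >> 16;           /* integer source column */
        int wx = (vx >> 8) & 0xff;   /* 8-bit horizontal weight */
        int t  = top[x] * (256 - wx) + top[x + 1] * wx;
        int b  = bottom[x] * (256 - wx) + bottom[x + 1] * wx;

        dst[i] = (uint8_t) ((t * (256 - wy) + b * wy) >> 16);
        vx += unit_x;
    }
}
```

A caller would first ask count_main_columns for n and then pass it to
scale_main_columns. With the trip count known before the loop starts, the
loop body is a candidate for unrolling or for a SIMD version with the same
arguments.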
