Re: [Pixman] Image resampling [PATCH 0/6]

2012-11-27 Thread Søren Sandmann
Søren Sandmann sandm...@cs.au.dk writes:

 There is some additional work that could be done:

 - Performance improvements. Low-hanging fruit includes adding new fast
   path iterators that assume the source is a8r8g8b8 or r5g6b5.

I went ahead and wrote some affine fast paths. There is still room for
improvement though.


Søren



___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman


Re: [Pixman] Image resampling [PATCH 0/6]

2012-11-27 Thread Siarhei Siamashka
On Sat, 24 Nov 2012 04:57:22 +0100
sandm...@cs.au.dk (Søren Sandmann) wrote:

 Hi,
 
 Reviewing the supersampling patch here:
 
http://cgit.freedesktop.org/~ajohnson/pixman/log/?h=supersampling
 
 I wasn't happy with either the performance or the image quality, and I
 realized that the whole supersampling approach just isn't going to
 fly. Since I told people to do it that way, I apologize for that. The
 approach advocated by Bill Spitzak in the various downsampling threads
 of computing a convolution kernel up front, is the much better way to
 go. To make up for being misleading, the following patches implement
 comprehensive support for high-quality image scaling filters.
 
 Pixman already has a convolution filter, but since it only allows one
 sample per pixel of the filter, it is limited in the quality that it can
 support, so the following patches (to be applied on top of the three
 rounding patches) add a new filter type
 
 PIXMAN_FILTER_SEPARABLE_CONVOLUTION
 
 that supports multiple different convolution matrices that are chosen
 between based on the subpixel source location. The matrices are
 specified as tensor products of x/y vectors, which makes them
 separable by definition.

I like this approach. This is a clean design, which looks simple enough
to optimize for better performance with SIMD.

 -=- Further work and examples
 
 There is some additional work that could be done:
 
 - Performance improvements. Low-hanging fruit includes adding new fast
   path iterators that assume the source is a8r8g8b8 or r5g6b5. Higher
   hanging fruit is SIMD optimizations and implementations that take
   advantage of separability. It may also be interesting to speed up
   pixman_filter_create_separable_convolution() by tabularizing some of
   the trigonometric functions etc.

From what I see, the separable convolution filter shares a lot of
similarities with the existing pixman SIMD code for bilinear scaling,
which could be extended with relatively little effort.

Bilinear scaling uses a weighted average of 2 pixels (in one direction),
with weights calculated on the fly. Separable convolution uses a weighted
average of N pixels, with weights obtained by table lookups. Both use
subpixel positions (7 phase bits, i.e. 128 phases, in the current bilinear
implementation) to look up or calculate the weights. The bilinear filter is
naturally a special case of separable convolution.
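
As a rough illustration of the comparison above (not taken from the pixman
sources), the following Python sketch resamples one scanline using a table of
weights indexed by subpixel phase. The names build_table, linear and sample,
the tap placement and the edge clamping are assumptions made for this example
only. With a width of 2 and a tent kernel it reduces to linear interpolation.

```python
import math

PHASE_BITS = 7                 # 128 subpixel phases, as mentioned in the mail
N_PHASES = 1 << PHASE_BITS

def linear(t):
    """Tent kernel; with width 2 it reproduces linear interpolation."""
    return max(0.0, 1.0 - abs(t))

def build_table(kernel, width):
    """Precompute `width` normalized weights for every subpixel phase."""
    table = []
    for p in range(N_PHASES):
        frac = p / N_PHASES
        # Tap i sits at offset i - (width // 2 - 1) from the base pixel.
        w = [kernel(i - (width // 2 - 1) - frac) for i in range(width)]
        total = sum(w)
        table.append([x / total for x in w])
    return table

def sample(row, pos, table, width):
    """Weighted average of `width` pixels around source position `pos`."""
    base = math.floor(pos)
    phase = int((pos - base) * N_PHASES)   # table lookup, no per-pixel math
    first = base - (width // 2 - 1)
    acc = 0.0
    for i, w in enumerate(table[phase]):
        # Clamp out-of-range taps to the edge (a PAD-style repeat).
        x = min(max(first + i, 0), len(row) - 1)
        acc += w * row[x]
    return acc
```

Only the table and the width change between a 2-tap linear filter and a wider
kernel; the sampling loop and the edge handling stay the same.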

The biggest challenge for optimized bilinear scaling (compared to
nearest) had been properly implementing the different types of repeat on
image edges, because some pixels are sampled outside of the image
boundaries. But sampling many pixels for the separable convolution filter
is not much different from sampling just two for the (bi)linear filter in
this respect, so this should already work fine for separable
convolution too, with just minor tweaks.

Regarding SIMD optimized iterators (for both bilinear and separable
convolution), they are rather simple to implement as well. Just have a
look at my old OpenMP proof of concept patch:
http://lists.freedesktop.org/archives/pixman/2012-June/002071.html
The OMP_BILINEAR_PARALLEL_FOR define lists all the variables which
fully represent the state of the iterator, needed to walk over the
source image and scale it (only the loop counter 'i' is different for
each scanline). This state can be calculated when creating an
iterator, and then we can simply pull one scaled scanline at a time.
The fact that the current bilinear code is not separable is a minor
implementation detail. We could just as well have a 2x1 sized filter
instead of 2x2 and then do the vertical scaling separately with the help
of a temporary buffer in L1 cache, as you tried before in the
separable-bilinear branch:
http://lists.freedesktop.org/archives/pixman/2012-June/002140.html
I also tried a similar two-pass approach, with the intermediate data kept
in L1 cache, for a YUV-RGB scaling code prototype much earlier, with
reasonably good results:
https://bugzilla.mozilla.org/show_bug.cgi?id=634557#c53
So this is expected to provide good performance.

Affine transformations (still to be implemented with SIMD) are a bit
different, and iterators are not a very good choice for them because of
their less than perfect memory access pattern.

-- 
Best regards,
Siarhei Siamashka


Re: [Pixman] Image resampling [PATCH 0/6]

2012-11-27 Thread Krzysztof Kosiński
2012/11/27 Bill Spitzak spit...@gmail.com:
 Cairo may want to pre-scale the source surface by an integer factor using a
 box filter so that the sampling filters are not so large. It would then have
 to keep this scaled image around until either the source image is dirtied or
 a transform requiring a different scale is used. I would think this could
 speed up repeated drawing of a much-scaled-down image considerably. It does
 seem to me that such a step should be done by cairo rather than pixman. The
 alternative is for client programs to do this with extra Cairo surfaces for
 the scaled images, but that seems not to be in keeping with the easy-to-use
 API intentions of Cairo.

Why not go a little further and implement mipmaps? This would
dramatically speed up the case where an image is repeatedly drawn with
very different transformations, e.g. as is the case when transforming
a bitmap in Inkscape. I guess this would need some support from
Pixman, however.

Regards, Krzysztof


Re: [Pixman] Image resampling [PATCH 0/6]

2012-11-26 Thread Krzysztof Kosiński
2012/11/24 Søren Sandmann sandm...@cs.au.dk:
 Hi,

 Reviewing the supersampling patch here:

http://cgit.freedesktop.org/~ajohnson/pixman/log/?h=supersampling

 I wasn't happy with either the performance or the image quality, and I
 realized that the whole supersampling approach just isn't going to
 fly. Since I told people to do it that way, I apologize for that. The
 approach advocated by Bill Spitzak in the various downsampling threads
 of computing a convolution kernel up front, is the much better way to
 go. To make up for being misleading, the following patches implement
 comprehensive support for high-quality image scaling filters.

This is extremely awesome.

Regards, Krzysztof


Re: [Pixman] Image resampling [PATCH 0/6]

2012-11-26 Thread Bill Spitzak



Søren Sandmann wrote:

 Hi,

 Reviewing the supersampling patch here:

    http://cgit.freedesktop.org/~ajohnson/pixman/log/?h=supersampling

 I wasn't happy with either the performance or the image quality, and I
 realized that the whole supersampling approach just isn't going to
 fly. Since I told people to do it that way, I apologize for that. The
 approach advocated by Bill Spitzak in the various downsampling threads
 of computing a convolution kernel up front, is the much better way to
 go. To make up for being misleading, the following patches implement
 comprehensive support for high-quality image scaling filters.


This is great news!


 -=- Adding support to cairo and further work

 Once these patches have landed in Pixman, support will have to be
 added to cairo to make use of them. How to do that exactly requires
 figuring out what new API to offer, and how the tradeoffs between
 performance and quality should be made. This is not something that I
 personally plan to work on anytime soon, except to make three notes:


Cairo may want to pre-scale the source surface by an integer factor 
using a box filter so that the sampling filters are not so large. It 
would then have to keep this scaled image around until either the source 
image is dirtied or a transform requiring a different scale is used. I 
would think this could speed up repeated drawing of a much-scaled-down 
image considerably. It does seem to me that such a step should be done 
by cairo rather than pixman. The alternative is for client programs to 
do this with extra Cairo surfaces for the scaled images, but that seems
not to be in keeping with the easy-to-use API intentions of Cairo.


The scale would be chosen so the intermediate image is no smaller than
2x the final result. This seems to hide any problems with the box
filter. It kind of makes sense because we are reusing the sampling
filter at intervals that are less than 1/2, which is below the Nyquist
frequency for the sinc.
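
One possible reading of the rule above, sketched in Python for a single
dimension. The function names are made up for this example, and the choice to
drop a trailing partial group of pixels is an assumption, not something
specified in the mail.

```python
def prescale_factor(src_size, dst_size):
    """Largest integer n with src_size / n >= 2 * dst_size (at least 1)."""
    return max(1, src_size // (2 * dst_size))

def box_downscale_row(row, n):
    """Box filter: average each group of n pixels.

    A trailing group with fewer than n pixels is dropped (an assumption
    made to keep the sketch short).
    """
    return [sum(row[i:i + n]) / n for i in range(0, len(row) - n + 1, n)]
```

For a 1290-pixel source drawn at 100 pixels, this picks a factor of 6, giving
a 215-pixel intermediate image that the sampling filter then reduces by only
2.15x instead of 12.9x.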



   - While transformations that are not pure scalings will not
 generally result in a separable filter, OK-looking results for
 non-scalings can be achieved by using scaling factors based on the
 bounding box of a transformation


Having recently wasted some time on this, I discovered that in fact
these sampling filters *rely* on the separable property. The Lanczos and
sinc filters do not work if, when they are centered on a pixel, the
neighboring pixels are not at the zeros of the filter. The only way to
achieve this is for the filter to be a product of two 1-D filters that
are aligned with the source pixel grid. If they were at an angle, the
zeros would be at an angle and thus could not all hit pixel centers.
Trying to make it circular (as I was doing) also fails for the same
reason. An obvious error is that the identity transformation results in
ringing and sharpening being added to the picture. With aligned separable
filters and a filter that is zero at all integers other than 0, the
result is an identity, which is much more likely what users want.
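
The zero-crossing property can be checked in one dimension with a small
Python sketch. The kernel below is the usual textbook Lanczos definition; the
helper filter_at and its edge clamping are assumptions for this example and
are not pixman code.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def lanczos(x, a=3):
    """Lanczos kernel with `a` lobes; zero at every non-zero integer."""
    return sinc(x) * sinc(x / a) if abs(x) < a else 0.0

def filter_at(row, center, offset, a=3):
    """Filter `row` with the kernel centered at position center + offset."""
    acc = total = 0.0
    for i in range(center - a, center + a + 1):
        w = lanczos(i - (center + offset), a)
        total += w
        # Clamp out-of-range taps to the nearest edge pixel.
        acc += w * row[min(max(i, 0), len(row) - 1)]
    return acc / total
```

With offset 0 every neighboring pixel lands on a zero of the kernel, so the
output equals the input pixel up to floating-point error. With a non-zero
offset the neighbors get non-zero (partly negative) weights, which is where
ringing near edges comes from.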


Therefore it looks like you have already gotten most of what is needed.

The two sizes are fairly easy to figure out from the inverse
transformation matrix (i.e. the inverse of the matrix a Cairo user sets,
namely the one needed to find the source image pixel given an output
position) (WARNING: I may have b/c swapped from what the Cairo/pixman
code uses):


          [ a b 0 ]   [ X ]
  inxy =  [ c d 0 ] * [ Y ]
          [ x y 1 ]   [ 1 ]

The horizontal size is hypot(a,b) and the vertical size is hypot(c,d).

This is the width/height of the bounding box of the ellipse you get if a 
.5-radius circle is transformed by this matrix.
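
A small Python check of the formula above, assuming the convention that
(a, b) produces the source x coordinate and (c, d) the source y coordinate.
As the warning says, b and c may be swapped in the actual Cairo/pixman
convention. Both function names are made up for this example.

```python
import math

def filter_sizes(a, b, c, d):
    """Horizontal and vertical filter sizes: hypot(a, b) and hypot(c, d)."""
    return math.hypot(a, b), math.hypot(c, d)

def bbox_of_transformed_circle(a, b, c, d, steps=3600):
    """Numerically measure the bounding box of a transformed 0.5-radius circle."""
    xs, ys = [], []
    for k in range(steps):
        t = 2 * math.pi * k / steps
        u, v = 0.5 * math.cos(t), 0.5 * math.sin(t)
        xs.append(a * u + b * v)   # translation is ignored: it does not
        ys.append(c * u + d * v)   # change the size of the bounding box
    return max(xs) - min(xs), max(ys) - min(ys)
```

The closed form follows because a*cos(t) + b*sin(t) has amplitude
hypot(a, b), so a circle of radius 0.5 spans exactly hypot(a, b) horizontally.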



Re: [Pixman] Image resampling [PATCH 0/6]

2012-11-23 Thread Behdad Esfahbod
Hi Søren,

This is a very well done patchset with great commentary.  Great job!

Cheers,
behdad

On 12-11-23 10:57 PM, Søren Sandmann wrote:
 Hi,
 
 Reviewing the supersampling patch here:
 
http://cgit.freedesktop.org/~ajohnson/pixman/log/?h=supersampling
 
 I wasn't happy with either the performance or the image quality, and I
 realized that the whole supersampling approach just isn't going to
 fly. Since I told people to do it that way, I apologize for that. The
 approach advocated by Bill Spitzak in the various downsampling threads
 of computing a convolution kernel up front, is the much better way to
 go. To make up for being misleading, the following patches implement
 comprehensive support for high-quality image scaling filters.
 
 Pixman already has a convolution filter, but since it only allows one
 sample per pixel of the filter, it is limited in the quality that it can
 support, so the following patches (to be applied on top of the three
 rounding patches) add a new filter type
 
 PIXMAN_FILTER_SEPARABLE_CONVOLUTION
 
 that supports multiple different convolution matrices that are chosen
 between based on the subpixel source location. The matrices are
 specified as tensor products of x/y vectors, which makes them
 separable by definition.
 
 The patches also add a helper function
 
 pixman_filter_create_separable_convolution()
 
 that will create the parameters for the filter based on scaling
 factors, filter kernels and subsampling resolution. Currently the
 supported kernels are impulse, box, linear, cubic
 (Mitchell-Netravali), lanczos2, lanczos3, lanczos3_stretched
 (aka. Blinn's 'Nice' filter), and Gaussian.
 
 There is also a new demo program demos/scale that shows how
 the new API can be used.
 
 For some useful math regarding image transformations, see
 http://people.redhat.com/otaylor/gtk/pixbuf-transform-math.ps . For
 some information about how to compute the convolution matrices, see
 the additions to rounding.txt in the second patch.
 
 
 -=- Adding support to cairo and further work
 
 Once these patches have landed in Pixman, support will have to be
 added to cairo to make use of them. How to do that exactly requires
 figuring out what new API to offer, and how the tradeoffs between
 performance and quality should be made. This is not something that I
 personally plan to work on anytime soon, except to make three notes:
 
   - While transformations that are not pure scalings will not
 generally result in a separable filter, OK-looking results for
 non-scalings can be achieved by using scaling factors based on the
 bounding box of a transformation 
 
   - For equivalent quality to GdkPixbuf do this: In each direction
 compute the scaling factors and then, if the scaling factor is
 less than 1 (ie., a downscaling), use PIXMAN_KERNEL_BOX for both
 reconstruction and sampling, and if it's greater than one, use
 PIXMAN_KERNEL_LINEAR for reconstruction and PIXMAN_KERNEL_IMPULSE
 for sampling.
 
   - If PIXMAN_KERNEL_GAUSSIAN is used with large downscaling factors
 and the resulting filter is then used with an identity transform,
 the result is a Gaussian blur, which is a feature that has
 sometimes been requested.
 
 The code in demos/scale.c may be useful as an example.
 
 
 -=- Further work and examples
 
 There is some additional work that could be done:
 
 - Performance improvements. Low-hanging fruit includes adding new fast
   path iterators that assume the source is a8r8g8b8 or r5g6b5. Higher
   hanging fruit is SIMD optimizations and implementations that take
   advantage of separability. It may also be interesting to speed up
   pixman_filter_create_separable_convolution() by tabularizing some of
   the trigonometric functions etc.
 
 - A non-separable, but subsampled, convolution filter type could be
   interesting to allow correct filters for non-scaling transformations
   and non-separable filters in general.
 
 
 As a reward for reading this entire mail, here are some images:
 
 Original (2.6 MB):
 
  http://www.daimi.au.dk/~sandmann/house.jpg
 
 Scaled down 12.9 times in each dimension:
 
 - With a box filter:
 
http://www.daimi.au.dk/~sandmann/house-box.png
 
 - With Lanczos3:
 
http://www.daimi.au.dk/~sandmann/house-lanczos3.png
 
 - With stretched Lanczos3:
 
http://www.daimi.au.dk/~sandmann/house-nice.png
 
 For more examples, try demos/scale.
 
 The patch series is also available in this repository:
 
 http://cgit.freedesktop.org/~sandmann/pixman/log/?h=separable
 
 
 Soren
 

-- 
behdad
http://behdad.org/