======== Abstract ======== Eric Daoust and Adam Turcotte are about to wrap up a pair of Google Summer of Code 2009 projects having to do with new samplers. In this document, I explain what the samplers do, how they (may) fit within the GEGL library, and where (most likely) they are going.
I also indicate what will need to be done when non-affine transformations (perspective, warp) are implemented in GEGL so that they use the samplers tuned for downsampling with maximal benefit. ============= QUICK SUMMARY ============= (Warning: I wrote this document in a hurry: Hopefully, it contains no significant inaccuracy.) In total, twelve new samplers were programmed: six samplers tuned for transformations in which good downsampling is more important than good upsampling (for example, thumbnail production): downsize, downsharp, downsmooth and their simplified/faster versions downsizefast, downsharpfast and downsmoothfast and six samplers tuned for transformations in which good upsampling is more important than good downsampling (for example, image enlargement and rotation): upsize, upsharp, upsmooth and their simplified/faster versions upsizefast, upsharpfast and upsmoothfast (Note: Perspective transformations often are both upsampling and downsampling.) The naming convention which Adam, Eric and I used (which could be changed) is as follows: "up*" samplers are tuned for upsampling, "down*" samplers are tuned for downsampling, "*sharp*" samplers maximise sharpness at the expense of increased aliasing ("jaggies"); *sharp* samplers behave more like nearest neighbour than the other methods. "*smooth*" samplers minimise jaggies at the expense of additional blur. "*size*" samplers are a compromise between the two. "*fast are simplified/faster versions. If "fast" is omitted, the corresponding method is the higher quality/slower version. As implemented for GEGL, none of the above methods are parameterised. ================ Relevant patches ================ http://bugzilla.gnome.org/show_bug.cgi?id=588336 (remove the yafr sampler) http://bugzilla.gnome.org/show_bug.cgi?id=588180 (remove the sharp sampler, and add upsharp) http://bugzilla.gnome.org/show_bug.cgi?id=588193 (add upsharpfast, upsizefast and upsmoothfast) http://bugzilla.gnome.org/show_bug.cgi?id=592498 (add upsize) http://bugzilla.gnome.org/show_bug.cgi?id=592515 (add upsmooth) http://bugzilla.gnome.org/show_bug.cgi?id=588016 (add downsharp, downsize, downsmooth) http://bugzilla.gnome.org/show_bug.cgi?id=592349 (add downsharpfast, downsizefast and downsmoothfast) (Non-sampler patches which Eric and Adam produced are not included above.) ==================== Where this is going? ==================== (Feedback welcome and actually sought.) If GEGL does not need "pretty good fast samplers," the "*fast" methods could simply not be integrated into GEGL. If and when GEGL has "quality levels," one could write a driver which selects the fast method when the chosen quality level is low. In any case, programming the "*fast" methods was not a waste: The s/nohalo family of methods is multi-stage, hence the "cheaper" versions are used as stages for the "non-fast" ones; in addition, the improved fast versions programmed for GSoC will be ported to VIPS. This being said, it would be nice if they could be merged into GEGL trunk and then removed, so that the git repository have a record of them. Keeping the "*fast" methods private (or not merging them at all) would reduce the total number of "public" GEGL methods from twelve to six. The following would further reduce the total number of methods to four (or eight if "*fast" samplers are kept). The downsize and downsharp samplers are very similar. For this reason, it would make sense to only keep the downsize and downsmooth samplers (same with downsizefast and downsmoothfast). The upsize and upsharp samplers are very different from each other, but research performed by Chantal Racette as part of her Laurentian University Honours Thesis during the Summer suggests that the method underlying upsharp should be used as a final stage for the multi-stage methods upsize and upsmooth, yielding superior methods which would replace the current upsize and upsmooth and make the current upsharp obsolete. So, in the end, there could only be four new additional GEGL samplers: Two samplers tuned for downsampling: downsize and downsmooth (with fast versions downsizefast and downsmoothfast if desired) Two samplers tuned for upsampling: upsize and upsmooth (with fast versions upsizefast and upsmoothfast) Every one is a good general purpose sampler. The downsize method, in particular, will be found to give acceptable results in "all" situations (generally better than bilinear when upsampling, and much much better than bilinear when downsampling). For this reason, I think that downsize may be a good candidate for default GEGL sampler. =================== DETAILED DISCUSSION =================== =========== Terminology =========== "Samplers" are methods like nearest neighbour, bilinear, bicubic, lanczos, Catmull-Rom etc. This is what they do: Given a bitmap image (which, at least in principle, specifies pixel values at pixel locations only), samplers are used to obtain pixel values at arbitrary locations, which is required when resizing, rotating, applying perspective transformations etc. Many key sampler properties are easier to explain in terms of the "reconstructed surface," which is the surface obtained by "plotting" the pixel values at all possible locations (not only the original pixel locations). An "interpolatory sampler" is one such that if a value is requested at an original pixel location, the pixel value returned is the original, unchanged pixel value. In other words, the reconstructed surface goes through the original image data. Nearest neighbour, bilinear, bicubic, lanczos and Catmull-Rom are interpolatory; smoothing bicubic spline (the current GEGL bicubic default) is not. Many non-interpolatory samplers are smoothing. In addition, many exact-area samplers are non-interpolatory. A co-monotone sampler has the following property: If the data (pixel values) is non-increasing or non-decreasing (in some direction, generally parallel to the axes), then the reconstructed surface has the same property between the corresponding locations. A co-monotone sampler does not introduce "haloing." Nearest neighbour, bilinear and smoothing bicubic splines are co-monotone; Lanczos and Catmull-Rom are not co-monotone (and they often introduce significant haloing). A strongly bounded sampler has the following property: The resampled value is always between between the values at the four nearest input pixel locations. If a sampler is monotone, then it is strongly bounded; the converse is not true. A strongly bounded sampler can only add haloing at sub-pixel scale, and the amount of haloing is strongly limited. Nearest neighbour, bilinear and smoothing bicubic splines are strongly bounded; Lanczos and Catmull-Rom are not. A sampler is "exact on linears" if the reconstructed surface is a plane (far enough from the boundary) whenever the input image's pixel values all lie on a plane. Bilinear, smoothing bicubic splines and Catmull-Rom are exact on linears. Nearest neighbour and lanczos are not. ================================== Description of the twelve samplers ================================== See the source code for details and additional comments. =============== Upsharp sampler =============== "Upsharp" is an implementation of the symmetrized MP-Quadratic method (Monotonicity Preserving Hermite interpolation with derivative estimated by fitting a parabola (Quadratic polynomial)). Basically, it is Catmull-Rom with derivatives clamped so as to ensure monotonicity. 1D MP-quadratic (for curve, not surface, interpolation) is described in Accurate Monotone Cubic Interpolation, by Hung T. Huynh, published in the SIAM Journal on Numerical Analysis, Volume 30, Issue 1 (February 1993), pages 57-100, 1993. ISSN:0036-1429. and in NASA technical memorandum 103789, which can be downloaded from ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19910011517_1991011517.pdf In order to ensure reflexion symmetry about diagonal lines, 1D MP-quadratic is performed two different ways---horizontally then vertically, and vertically then horizontally---and averaged. (Symmetry about 45 degree lines is not automatically respected because MP-quadratic is a nonlinear method: interpolating horizontally then vertically does not necessarily give the same as interpolating vertically then horizontally.) Upsharp is interpolatory, co-monotone and exact on linears. Future improvements: This method is likely to replace the final stage of the upsize and upsmooth methods, which will make them smoother and, at least in the case of upsize, less aliased. In addition, Chantal Racette has recently figured out how to implement this method using one third less minmod function evaluations, which should make it run considerably faster: specifically, a much simplified version of http://en.wikipedia.org/wiki/Bicubic_interpolation can be used instead of the current approach. Finally, current research suggests that the co-monotonicity enforcing derivative clamping is overkill and that the result would be less aliased without a noticeable increase in haloing if it only enforced strong boundedness. =================== Upsharpfast sampler =================== Upsharpfast is an efficient (noticeably faster than the current GEGL bicubic) implementation of Catmull-Rom: http://en.wikipedia.org/wiki/Cubic_Hermite_spline Upsharpfast is interpolatory and exact on linears. ============== Upsize sampler ============== Upsize is an implementation of Nohalo level 2 ("Nohalo" = "no halo"), which is obtained by performing two diagonal straightening subdivisions of the input image, finishing with bilinear interpolation. Nohalo level 2 is an unpublished, state of the art method. Upsize is interpolatory, co-monotone and exact on linears. It is especially suited for enlarging colour photographs (which typically are sub-critical, that is, have smallest relevant details/transitions which are several pixel wide). Future improvements: Research by Chantal Racette suggests that one level of subdivision (that is, Nohalo level 1) followed by a variant of symmetrized MP-quadratic instead of bilinear may yield a better method than Nohalo level 2. ================== Upsizefast sampler ================== Upsizefast is an implementation of Nohalo level 1, which is described in CPU, SMP and GPU implementations of Nohalo level 1, a fast co-convex antialiasing image resampler, by Nicolas Robidoux, Minglun Gong, John Cupitt, Adam Turcotte and Kirk Martinez, published in the Proceedings of the Canadian Conference on Computer Science & Software Engineering, C3S2E 2009, Montreal, Quebec, Canada, May 19-21, 2009, ACM International Conference Proceedings Series, pages 185-195. http://doi.acm.org/10.1145/1557626.1557657 Nohalo level 1 consists of one diagonal straightening subdivision followed by bilinear interpolation. Upsizefast is interpolatory, co-monotone and exact on linears. ================ Upsmooth sampler ================ Upsmooth is an implementation of (the simplest version of) Snohalo level 2 ("Snohalo" = "smooth no halo"), which consists of Nohalo level 2 preceded by a de-noising and antialiasing smoothing convolution tuned so that a sharp black on white diagonal line or interface is straight following the first subdivision. Snohalo level 2 is an unpublished, state of the art method. Upsmooth is co-monotone and exact on linears. Upsmooth is strongly anti-aliasing. It is especially suited for enlarging text, "old school" game characters, and heavily (destructively) compressed images and photographs (jpg with heavy compression, for example). Future improvements: It is possible that applying smoothing between every level instead of only initially, or doing one level of subdivision and then using a relative of symmetrized MP-quadratic as a finishing scheme, will yield better results. ==================== Upsmoothfast sampler ==================== Upsmoothfast is an implementation of Snohalo level 1, which consists of Nohalo level 1 preceded by the antialiasing blur. Snohalo level 1 is another unpublished, state-of-the-art method. Upsmoothfast is co-monotone and exact on linears. ================= Downsharp sampler ================= Downsharp is an exact area method which is similar to exact area box filtering with an averaging rectangle that has sides 1/sqrt(2) of those of the smallest rectangle (with sides aligned with untransformed coordinate axes) which contains the approximate inverse image of an "output" pixel area (a square centered at the sampling location in transformed space, with sides, of length one, aligned with the axes). The inverse image is approximated using the inverse Jacobian matrix of the transformation. The averaging is done over the "minmod surface" which is implicit to the S/Nohalo methods. The 1/sqrt(2) factor ensures that when the transformation does not downsample in any direction, the sampler is interpolatory. It also implies that, of the six downsamplers, it is the one which behaves most like nearest neighbour (although it is very different from it). Downsharp is yet another unpublished method. Downsharp is co-monotone and exact on linears. When upsampling, it is also interpolatory. Downsharp is well-suited for producing thumbnails and other transformations (perspective?) in which large areas become a single pixel. Future improvements: Fix things so that downsampling ratios larger than about 100 are better handled. The simplest way would be to have the downsamplers use a larger buffer than the current 64x64; otherwise, use branching and multiple "get pointer to input buffer" calls if necessary. And, as mentioned above, possibly get rid of it. ================ Downsize sampler ================ Downsize is like downsharp except that the rectangle always fully contains the approximate inverse image of a pixel area. Downsize is yet another unpublished method. It was inspired somewhat by the PDL::Transform method of Craig DeForest. Downsize is co-monotone. When upsizing (no rotation or warp), it is interpolatory and exact on linears. Downsize is well-suited for producing thumbnails etc. Future improvements: Like downsharp, better handling of very large downsampling ratios. It is possible that averaging over the actual approximate inverse image, or over a related disk or ellipse, give better results than over the smallest containing rectangle (programming this well is a non-trivial programming task). ================== Downsmooth sampler ================== Downsmooth is like downsize except that the rectangle is twice the size required to contain the approximate inverse image of a pixel area. Downsmooth is yet another unpublished method. Downsmooth is co-monotone and exact on linears. Downsmooth is well-suited for producing thumbnails etc. Future improvements: Like downsize, better large downsampling ratio handling. Also: The rectangular averaging box is not optimal: Current research suggests that a "cross" would produce thumbnails with both less blur and less aliasing. The current version is too blurry for my taste. ============================================== Downsharpfast, Downsizefast and Downsmoothfast ============================================== They are like their "slow" counterparts except that the reconstructed surface which is averaged is the nearest neighbour one (the piecewise constant reconstruction) instead of the minmod surface. They are co-monotone. ================================================================= When perspective and more general warping are implemented in GEGL ================================================================= In order to adjust the size of the averaging rectangle, the downsamplers need information about the transformation which uses them. We have implemented the samplers so that they use inverse Jacobian matrix information. This is trivial to obtain in GEGL because affine transformations already compute the inverse of their linear part. When non-affine transformations get implemented, similar information will have to be passed on to the down methods. The inverse Jacobian matrix does not need to be exact: Any halfway reasonable approximation will do (of course, for perspective transformations, the exact inverse Jacobian matrix is easy to compute). Note that when the transformation upsamples (in one or both directions), the corresponding side lengths (which are, in principle, small, since a pixel area in the transformed image "comes" from a very small area of the input image) are clamped up to 1; for this reason, it is not very important to program an inverse Jacobian matrix which is accurate where the transformation "blows up." Adam Turcotte can provide additional details regarding implementation. Also see the source code. _______________________________________________ Gegl-developer mailing list Gegl-developer@lists.XCF.Berkeley.EDU https://lists.XCF.Berkeley.EDU/mailman/listinfo/gegl-developer