[Gimp-developer] [patch] Major speedup for whirlpinch plugin
Hi,

I don't know who's currently "responsible" for the whirlpinch plugin, so I am posting my patch to this list.

I have modified whirlpinch slightly to use "blocking", i.e. doing all calculations in small squares (32*32). With this technique, which is very common in numerical computing, the CPU caches and (for GIMP) the tile cache have a much higher hit rate.

The boost is quite spectacular: on an Athlon-600, the original whirlpinch needs 30s to complete on a larger image (1400*1400); with my patch it needs only 6.5s. That's a speedup by a factor of 4.5 without any change in the algorithm itself!

The changes are relatively small (effectively about 10 lines) and mostly affect clipping. I have found no side effects of the patch...

Blocking can IMHO easily be used for a lot of other filters, and should give a large speedup for most of GIMP's filters.

Please try out the patch and apply it to the source tree if you like it ;-)

-- 
         Georg Acher, [EMAIL PROTECTED]
         http://www.in.tum.de/~acher/
          "Oh no, not again !" The bowl of petunias

--- whirlpinch.c.org	Thu Apr  5 17:47:17 2001
+++ whirlpinch.c	Thu Apr  5 18:49:09 2001
@@ -22,6 +22,14 @@
  * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  */
 
+/* Version 2.10:
+ *
+ * Major Speedup by use of "blocking", ie. doing the calcualations
+ * in small squares, thus gaining a performance boost from CPU caches
+ * and the tile cache.
+ *
+ * Georg Acher, [EMAIL PROTECTED]
+ */
 
 /* Version 2.09:
  *
@@ -63,7 +71,7 @@
 
 
 #define PLUG_IN_NAME    "plug_in_whirl_pinch"
-#define PLUG_IN_VERSION "May 1997, 2.09"
+#define PLUG_IN_VERSION "April 2001, 2.10"
 
 /* Magic numbers */
 
@@ -71,6 +79,10 @@
 #define SCALE_WIDTH  200
 #define ENTRY_WIDTH  60
 
+/* blocking size, 32*32pixels is a good compromise for all CPUs */
+
+#define BLOCKING 32
+
 /* Types */
 
 typedef struct
@@ -366,12 +378,13 @@
   guchar           *top_row, *bot_row;
   guchar           *top_p, *bot_p;
   gint              row, col;
+  gint              row1,col1;
   guchar            pixel[4][4];
   guchar            values[4];
   double            whirl;
   double            cx, cy;
   int               ix, iy;
-  int               i;
+  int               i,n;
   guchar            bg_color[4];
   pixel_fetcher_t  *pft, *pfb;
 
@@ -406,112 +419,133 @@
   whirl   = wpvals.whirl * G_PI / 180;
   radius2 = radius * radius * wpvals.radius;
 
-  for (row = sel_y1; row <= ((sel_y1 + sel_y2) / 2); row++)
+  /* WhirlPinch in small squares to benefit from cache effects
+     (tile cache, CPU cache)
+     20010405 GA
+  */
+  for (row1 = sel_y1; row1 <= ((sel_y1 + sel_y2) / 2); row1+=BLOCKING)
     {
-      top_p = top_row;
-      bot_p = bot_row + img_bpp * (sel_width - 1);
-
-      for (col = sel_x1; col < sel_x2; col++)
-        {
-          if (calc_undistorted_coords (col, row, whirl, wpvals.pinch, &cx, &cy))
-            {
-              /* We are inside the distortion area */
-
-              /* Top */
-
-              if (cx >= 0.0)
-                ix = (int) cx;
-              else
-                ix = -((int) -cx + 1);
-
-              if (cy >= 0.0)
-                iy = (int) cy;
-              else
-                iy = -((int) -cy + 1);
-
-              pixel_fetcher_get_pixel (pft, ix,     iy,     pixel[0]);
-              pixel_fetcher_get_pixel (pft, ix + 1, iy,     pixel[1]);
-              pixel_fetcher_get_pixel (pft, ix,     iy + 1, pixel[2]);
-              pixel_fetcher_get_pixel (pft, ix + 1, iy + 1, pixel[3]);
-
-              for (i = 0; i < img_bpp; i++)
-                {
-                  values[0] = pixel[0][i];
-                  values[1] = pixel[1][i];
-                  values[2] = pixel[2][i];
-                  values[3] = pixel[3][i];
-
-                  *top_p++ = bilinear (cx, cy, values);
-                }
-
-              /* Bottom */
-
-              cx = cen_x + (cen_x - cx);
-              cy = cen_y + (cen_y - cy);
-
-              if (cx >= 0.0)
-                ix = (int) cx;
-              else
-                ix = -((int) -cx + 1);
-
-              if (cy >= 0.0)
-                iy = (int) cy;
-              else
-                iy = -((int) -cy + 1);
-
-              pixel_fetcher_get_pixel (pfb, ix,     iy,     pixel[0]);
-              pixel_fetcher_get_pixel (pfb, ix + 1, iy,     pixel[1]);
-              pixel_fetcher_get_pixel (pfb, ix,     iy + 1, pixel[2]);
-              pixel_fetcher_get_pixel (pfb, ix + 1, iy + 1, pixel[3]);
-
-              for (i = 0; i < img_bpp; i++)
-                {
-                  values[0] = pixel[0][i];
-                  values[1] = pixel[1][i];
-                  values[2] = pixel[2][i];
-                  values[3] = pixel[3][i];
-
-                  *bot_p++ = bilinear (cx, cy, values);
-                }
-
-              bot_p -= 2 * img_bpp; /* We move backwards! */
-            }
-          else
-            {
-
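
The blocking idea in the patch above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the patched whirlpinch code: for_each_pixel_blocked, pixel_fn, count_visit and count_blocked_visits are hypothetical names, the callback merely stands in for the whirl/pinch math, and the sketch walks the whole rectangle, whereas the plugin only iterates down to the middle row because it handles the top and bottom halves together. Only BLOCKING = 32 and the row1/col1 naming are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCKING 32   /* block edge in pixels, the value used in the patch */

/* Hypothetical per-pixel callback; stands in for the real filter math. */
typedef void (*pixel_fn) (int col, int row, void *user_data);

/* Visit every pixel of [x1, x2) x [y1, y2) one BLOCKING x BLOCKING
 * square at a time instead of one full image row at a time.          */
static void
for_each_pixel_blocked (int x1, int y1, int x2, int y2,
                        pixel_fn fn, void *user_data)
{
  int row1, col1, row, col;

  assert (fn != NULL);

  for (row1 = y1; row1 < y2; row1 += BLOCKING)
    for (col1 = x1; col1 < x2; col1 += BLOCKING)
      {
        /* Clip the block against the bottom and right edges. */
        int row_end = (row1 + BLOCKING < y2) ? row1 + BLOCKING : y2;
        int col_end = (col1 + BLOCKING < x2) ? col1 + BLOCKING : x2;

        for (row = row1; row < row_end; row++)
          for (col = col1; col < col_end; col++)
            fn (col, row, user_data);
      }
}

/* Example callback: counts visits so that coverage can be checked. */
static void
count_visit (int col, int row, void *user_data)
{
  (void) col;
  (void) row;
  ++*(long *) user_data;
}

/* Hypothetical helper: number of pixels visited by the blocked walk. */
static long
count_blocked_visits (int x1, int y1, int x2, int y2)
{
  long n = 0;

  for_each_pixel_blocked (x1, y1, x2, y2, count_visit, &n);
  return n;
}
```

For a 1400*1400 selection, count_blocked_visits (0, 0, 1400, 1400) touches the same set of pixels as a plain row-by-row loop; only the order changes, which is why the algorithm itself stays untouched.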
[Gimp-developer] Re: [patch] Major speedup for whirlpinch plugin
[EMAIL PROTECTED] (2001-04-05 at 1904.10 +0200):
> The blocking can IMHO easily be used for a lot of other filters, and should
> give a large speedup for most of GIMP's filters.

I do not have the power to decide whether it gets accepted, but if speeding up GIMP in some areas is just a matter of things like that, then please, people in charge (aka plugin coders), take note. Maybe you could do more reviews or write a small guideline doc for this.

And thanks for the patch.

GSR

___
Gimp-developer mailing list
[EMAIL PROTECTED]
http://lists.xcf.berkeley.edu/mailman/listinfo/gimp-developer
[Gimp-developer] [patch] Major speedup for whirlpinch plugin
On Thu, 5 Apr 2001 12:36:05 -0500, Kelly Martin [EMAIL PROTECTED] said:

>> I have modified whirlpinch slightly to use "blocking", ie. doing all
>> calculations in small squares (32*32). With that technique very common
>> in numerical computing, the CPU caches (and for GIMP) the tile cache
>> have a much higher hit rate. The boost is quite spectacular: The
>> original whirlpinch on a larger image (1400*1400) needs on a Athlon-600
>> 30s to complete, with my patch only 6.5s. That's a speedup by a factor
>> of 4.5 without any change in the algorithm itself!

> I was under the impression that whirlpinch used tile regions, which
> should do blocking automatically. It's been a couple years since I
> looked at the code, though.

Hm, it does not. The issue with whirlpinch is that there's only a weak locality relationship between destination pixels (which are iterated across the image) and source pixels (which are fetched with the pixel fetcher). I haven't looked too closely at your blocking patch, but I suspect that much the same improvement would be had by using a pixel region (which respects tiles) to iterate across the destination region.

Whirlpinch also has an optimization to do the top and bottom whirls simultaneously to save calculations. This might actually be more of a loss because of the locality hit on the tile cache.

If I get some spare time, I might look at this more closely.

Kelly
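
The point about tile locality on the destination side can be illustrated with a toy model. This is a sketch under stated assumptions rather than anything taken from GIMP: tile_index and count_tile_switches are hypothetical names, the "cache" is a single remembered tile, and the tile edge is just a parameter (64 pixels in the example call is an assumption for illustration). It only models the order in which destination pixels are visited; the source pixels fetched by the pixel fetcher follow the distortion and are not modelled.

```c
#include <assert.h>

/* Index of the tile that contains pixel (x, y); tiles are square with
 * edge `tile`, and the last tile in a row may be partial.             */
static int
tile_index (int x, int y, int width, int tile)
{
  int tiles_per_row = (width + tile - 1) / tile;

  return (y / tile) * tiles_per_row + (x / tile);
}

/* Walk a width x height image in blocks of block_w x block_h pixels and
 * count how often the walk moves to a different tile than the one it
 * touched last (a one-entry tile cache).  A row-by-row walk is the
 * special case block_w = width, block_h = 1.                           */
static long
count_tile_switches (int width, int height, int tile,
                     int block_w, int block_h)
{
  long switches = 0;
  int  last = -1;
  int  by, bx, y, x;

  assert (width > 0 && height > 0 && tile > 0);
  assert (block_w > 0 && block_h > 0);

  for (by = 0; by < height; by += block_h)
    for (bx = 0; bx < width; bx += block_w)
      for (y = by; y < by + block_h && y < height; y++)
        for (x = bx; x < bx + block_w && x < width; x++)
          {
            int t = tile_index (x, y, width, tile);

            if (t != last)
              {
                switches++;
                last = t;
              }
          }

  return switches;
}
```

On a 256*256 image with 64-pixel tiles, count_tile_switches (256, 256, 64, 256, 1) models the row-by-row walk and count_tile_switches (256, 256, 64, 32, 32) the 32*32 blocking; iterating exactly tile by tile corresponds to block_w = block_h = 64, which is roughly what a tile-respecting iteration over the destination would give in this model.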
Re: [Gimp-developer] [patch] Major speedup for whirlpinch plugin
On Thu, Apr 05, 2001 at 12:45:52PM -0500, Kelly Martin wrote:

> Hm, it does not. The issue with whirlpinch is that there's only a weak
> locality relationship between destination pixels (which are iterated
> across the image) and source pixels (which are fetched with the pixel
> fetcher). I haven't looked too closely at your blocking

That is right, but destination and source by themselves have good locality (i.e. the next pixel isn't 500 pixels away from the last). I have tried whirlpinch with huge distortion values, and the worst case was 8s instead of 6.5s, still better than 30s ;-)

> patch, but I suspect that much the same improvement would be had by
> using a pixel region (which respects tiles) to iterate across the
> destination region.

That is possible... Is there a filter that definitely uses the pixel region stuff? Most filters I have seen only work on one row at a time, which may not be "local" enough, since such a loop uses only one pixel but has to fill a whole cacheline (4/8 pixels).

I will see whether I can speed up bumpmap as well, since it also takes the magical 30s and is used much more often in scripts than the whirlpinch module.

I don't know how large a tile is, but since IMHO the major impact of blocking seems to come from the CPU cache, I suspect that a tile is too big for older CPUs. I did the whirlpinch blocking thing about three years ago (and forgot to send the patch), and tried it on an Alpha 21164 and a P5. Both have only very small L1 caches (the Alpha has 8K) and relatively slow L2 caches (+L3 for the Alpha). The best blocking factor for those CPUs was between 64 and 128. My Athlon now also tolerates 512; 32 is only slightly (2%) slower, but should work fairly well on all CPUs.

-- 
         Georg Acher, [EMAIL PROTECTED]
         http://www.in.tum.de/~acher/
          "Oh no, not again !" The bowl of petunias
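
The cache sizes mentioned above can be put next to the block sizes with some simple arithmetic. This is only a back-of-the-envelope sketch: block_bytes, pixels_per_cacheline and block_fits are hypothetical helper names, and the 4 bytes per pixel and 16/32-byte cache lines in the example calls are assumptions, not values from the thread. It says nothing about the measured timings, which depend on the slower cache levels and the tile cache as well.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by one square block of block x block pixels. */
static size_t
block_bytes (size_t block, size_t bpp)
{
  assert (bpp > 0);
  return block * block * bpp;
}

/* Number of whole pixels that share one cache line. */
static size_t
pixels_per_cacheline (size_t line_bytes, size_t bpp)
{
  assert (bpp > 0);
  return line_bytes / bpp;
}

/* Nonzero if a single block fits into a cache of cache_bytes bytes. */
static int
block_fits (size_t block, size_t bpp, size_t cache_bytes)
{
  return block_bytes (block, bpp) <= cache_bytes;
}
```

Under these assumptions a 32*32 block is block_bytes (32, 4) = 4096 bytes, which fits into an 8K L1 cache, while 64*64 and 128*128 blocks come to 16384 and 65536 bytes. The best factors reported for the Alpha and the P5 are therefore larger than the L1 cache alone, which suggests the L1 footprint is not the whole story.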