I've now implemented something based on Hain's research and it seems to be measurably faster than previous FT approaches. I have used Hain's paper (now available from http://tinyurl.com/HainBez) to provide some sensible heuristics rather than implement all his stuff in detail, and done so without using square roots or even any divisions.
First, here are the trace results: OLD NEW HAIN average line segs per arc 13.5 11.3 2.1 min line segs per arc 2 1 1 max line segs per arc 32 133 16 average deviation per line seg 0.29 0.44 6.5 min deviation per line seg 0 0 0 max deviation per line seg 22.2 15.8 15.7 By using reasonably accurate heuristics when deciding whether to split the curve, we create 5.5 x fewer line segments. This cuts down the number of calls to split_cubic and the number of iterations within render_cubic. And now the performance results: In gray_convert_glyph, the time is distributed as follows: OLD NEW HAIN render_line 20% 15% 12% render_cubic 15% 33% 9% render_scanline 14% 10% 10% split_cubic 6% 9% 2% The time spent in these functions has been significantly reduced as a fraction of processing time. Including children, we have the following actual times per call for handling cubic curves: OLD NEW HAIN render_cubic 142us 220us 61us render_cubic is now more than twice as fast as it ever has been. The effect of the speed-up is even measurable as a 5-10% speed-up of my font rasterisation program (which is reading and writing data on top of using FT to do the actual rendering). These tests are with the same Unicode font as before. I'll run some more test with Latin-only fonts, though previous testing didn't show any significant performance differences between Latin and CJK. CJK glyphs just have more cubic Bezier curves on average, but a Bezier curve is a Bezier curve wherever it comes from. The code is below. I hope I've tried to follow Werner's coding standards as far as I know what they are. Thanks. David %^> static void gray_render_cubic( RAS_ARG_ const FT_Vector* control1, const FT_Vector* control2, const FT_Vector* to ) { FT_Vector* arc; arc = ras.bez_stack; arc[0].x = UPSCALE( to->x ); arc[0].y = UPSCALE( to->y ); arc[1].x = UPSCALE( control2->x ); arc[1].y = UPSCALE( control2->y ); arc[2].x = UPSCALE( control1->x ); arc[2].y = UPSCALE( control1->y ); arc[3].x = ras.x; arc[3].y = ras.y; for (;;) { /* Check that the arc crosses the current band. */ TPos min, max, y; min = max = arc[0].y; y = arc[1].y; if ( y < min ) min = y; if ( y > max ) max = y; y = arc[2].y; if ( y < min ) min = y; if ( y > max ) max = y; y = arc[3].y; if ( y < min ) min = y; if ( y > max ) max = y; if ( TRUNC( min ) >= ras.max_ey || TRUNC( max ) < 0 ) goto Draw; /* Decide whether to split or draw */ /* See Hain's paper at http://tinyurl.com/HainBez for more info */ { TPos dx, dy, L, dx1, dy1, dx2, dy2, s1, s2; /* dx and dy are x- and y- components of the P0-P3 chord vector */ dx = arc[3].x - arc[0].x; dy = arc[3].y - arc[0].y; /* L is an (under)estimate of the Euclidean distance P0-P3 */ L = ( 236 * FT_MAX(labs(dx), labs(dy)) + 97 * FT_MIN(labs(dx), labs(dy))) >> 8; /* avoid possible arithmetic overflow below by splitting */ if (L > 32767) goto Split; /* s1 is L * the perpendicular distance from P1 to the line P0-P3 */ s1 = labs( dy * (dx1 = arc[1].x - arc[0].x) - dx * (dy1 = arc[1].y - arc[0].y)); /* max deviation is at least (s1 / L) * sqrt(3)/6 (if v = -1) */ if (s1 > L * (TPos)(FT_MAX_CURVE_DEVIATION / 0.288675)) goto Split; /* s2 is L * the perpendicular distance from P2 to the line P0-P3 */ s2 = labs( dy * (dx2 = arc[2].x - arc[0].x) - dx * (dy2 = arc[2].y - arc[0].y)); /* max deviation may be as much as (max(s1,s2)/L) * 3/4 (if v = 1) */ if (FT_MAX(s1, s2) > L * (TPos)(FT_MAX_CURVE_DEVIATION / 0.75)) goto Split; /* if P1 or P2 is outside P0-P3, split */ if ( dy * dy1 + dx * dx1 < 0 || dy * dy2 + dx * dx2 < 0 || dy * (arc[3].y - arc[1].y) + dx * (arc[3].x - arc[1].x) < 0 || dy * (arc[3].y - arc[2].y) + dx * (arc[3].x - arc[2].x) < 0 ) goto Split; /* no reason to split */ goto Draw; } Split: gray_split_cubic( arc ); arc += 3; continue; Draw: gray_render_line( RAS_VAR_ arc[0].x, arc[0].y ); if (arc == ras.bez_stack) return; arc -= 3; } } _______________________________________________ Freetype-devel mailing list Freetype-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/freetype-devel