Re: [ft-devel] latest patch file for spline flattening
David,

I've tested your code with CartoType, my map rendering library, and it produces a small speedup there too. (The smallness of the speedup is to do with the fact that CartoType's curves are nearly always approximations to arcs of circles, which don't behave too badly with my previous optimisation; I can see why your code works better with font glyphs.) Therefore, combined with the fact that it significantly speeds up font rendering, and solves the bug that prompted this latest burst of work, I recommend that it goes into FreeType.

Werner, can you give a ruling on whether assignments in conditionals are allowed in FreeType code?

David, what value are you using for FT_MAX_CURVE_DEVIATION? I assume 16, which is a good conservative value.

The full patch also requires getting rid of the now-unneeded cubic-level and conic-level variables, and making minor changes to the conic splitting routine. However, we could perhaps make the change to cubic splitting independent of any changes to conic splitting, because the latter doesn't suffer from the original bug ('if it ain't broke, don't fix it').

Best regards,

Graham

On 08/09/2010 08:50, David Bevan wrote:

Graham,

Here's a final revision. It produces exactly the same results as the previous one, but the code is a bit faster (and also easier to understand - the previous code was trying to be a bit too clever).

David %^

  static void
  gray_render_cubic( RAS_ARG_ const FT_Vector*  control1,
                              const FT_Vector*  control2,
                              const FT_Vector*  to )
  {
    FT_Vector*  arc;


    arc      = ras.bez_stack;
    arc[0].x = UPSCALE( to->x );
    arc[0].y = UPSCALE( to->y );
    arc[1].x = UPSCALE( control2->x );
    arc[1].y = UPSCALE( control2->y );
    arc[2].x = UPSCALE( control1->x );
    arc[2].y = UPSCALE( control1->y );
    arc[3].x = ras.x;
    arc[3].y = ras.y;

    for (;;)
    {
      /* Check that the arc crosses the current band. */
      TPos  min, max, y;


      min = max = arc[0].y;

      y = arc[1].y;
      if ( y < min ) min = y;
      if ( y > max ) max = y;

      y = arc[2].y;
      if ( y < min ) min = y;
      if ( y > max ) max = y;

      y = arc[3].y;
      if ( y < min ) min = y;
      if ( y > max ) max = y;

      if ( TRUNC( min ) >= ras.max_ey || TRUNC( max ) < 0 )
        goto Draw;

      /* Decide whether to split or draw. */
      /* See Hain's paper at http://tinyurl.com/HainBez for more info. */
      {
        TPos  dx, dy, L, dx1, dy1, dx2, dy2, s, s_limit;


        /* dx and dy are x- and y-components of the P0-P3 chord vector */
        dx = arc[3].x - arc[0].x;
        dy = arc[3].y - arc[0].y;

        /* L is an (under)estimate of the Euclidean distance P0-P3 */
        L = ( 236 * FT_MAX( labs( dx ), labs( dy ) ) +
               97 * FT_MIN( labs( dx ), labs( dy ) ) ) >> 8;

        /* avoid possible arithmetic overflow below by splitting */
        if ( L > 32767 )
          goto Split;

        /* s is L * the perpendicular distance from P1 to the line P0-P3 */
        s = labs( dy * ( dx1 = arc[1].x - arc[0].x ) -
                  dx * ( dy1 = arc[1].y - arc[0].y ) );

        /* max deviation may be as much as (s/L) * 3/4 (if Hain's v = 1) */
        if ( s > ( s_limit = L * (TPos)( FT_MAX_CURVE_DEVIATION / 0.75 ) ) )
          goto Split;

        /* s is L * the perpendicular distance from P2 to the line P0-P3 */
        s = labs( dy * ( dx2 = arc[2].x - arc[0].x ) -
                  dx * ( dy2 = arc[2].y - arc[0].y ) );

        /* max deviation may be as much as (s/L) * 3/4 (if Hain's v = 1) */
        if ( s > s_limit )
          goto Split;

        /* if P1 or P2 is outside P0-P3, split */
        if ( dy * dy1 + dx * dx1 < 0 ||
             dy * dy2 + dx * dx2 < 0 ||
             dy * ( arc[3].y - arc[1].y ) + dx * ( arc[3].x - arc[1].x ) < 0 ||
             dy * ( arc[3].y - arc[2].y ) + dx * ( arc[3].x - arc[2].x ) < 0 )
          goto Split;

        /* no reason to split */
        goto Draw;
      }

    Split:
      gray_split_cubic( arc );
      arc += 3;
      continue;

    Draw:
      gray_render_line( RAS_VAR_ arc[0].x, arc[0].y );

      if ( arc == ras.bez_stack )
        return;

      arc -= 3;
    }
  }

___
Freetype-devel mailing list
Freetype-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/freetype-devel
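[Editorial note: the `( 236 * FT_MAX(...) + 97 * FT_MIN(...) ) >> 8` line in the code above is an instance of the classic alpha*max + beta*min approximation to Euclidean length. A minimal stand-alone sketch (the function name is mine, not FreeType's) showing that these particular coefficients always underestimate the true chord length:]

```c
#include <stdlib.h>

/* Chord-length underestimate used in the patch above:                 */
/* ( 236 * max(|dx|,|dy|) + 97 * min(|dx|,|dy|) ) >> 8.                */
/* Since sqrt( (236/256)^2 + (97/256)^2 ) ~= 0.997 < 1, the result     */
/* never exceeds the true length sqrt( dx^2 + dy^2 ), and it falls     */
/* short by at most about 8%.                                          */
static long
chord_length_estimate( long  dx,
                       long  dy )
{
  long  ax = labs( dx );
  long  ay = labs( dy );
  long  hi = ax > ay ? ax : ay;
  long  lo = ax > ay ? ay : ax;


  return ( 236 * hi + 97 * lo ) >> 8;
}
```

For the 3-4-5 triangle (dx, dy) = (300, 400) this gives 482 against a true length of 500, an underestimate of about 3.6%.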
Re: [ft-devel] latest patch file for spline flattening
No, forget that, I was getting confused.

Graham

On 12/09/2010 11:16, Graham Asher wrote:

On 08/09/2010 08:50, David Bevan wrote:

  if ( s > ( s_limit = L * (TPos)( FT_MAX_CURVE_DEVIATION / 0.75 ) ) )
    goto Split;

Surely this should be

  if ( s > ( s_limit = L * (TPos)( FT_MAX_CURVE_DEVIATION * 0.75 ) ) )
    goto Split;

because the limit is 3/4 of the deviation, times L?

Graham
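[Editorial note: the retraction is correct. The deviation from the chord is bounded by (3/4) * (s/L), so the curve must be split when (3/4) * (s/L) > FT_MAX_CURVE_DEVIATION, i.e. when s > L * (FT_MAX_CURVE_DEVIATION / 0.75). A tiny sketch of this logic, assuming the value 16 for FT_MAX_CURVE_DEVIATION as discussed earlier in the thread (the function name is mine):]

```c
#define FT_MAX_CURVE_DEVIATION  16   /* value assumed from the thread */

/* The split test from the patch: the deviation of the curve from its  */
/* chord is at most ( s / L ) * 3/4, so splitting is needed exactly    */
/* when s > L * ( FT_MAX_CURVE_DEVIATION / 0.75 ).  Multiplying by     */
/* 0.75 instead would lower the threshold from 16 / 0.75 ~= 21 to      */
/* 16 * 0.75 = 12, forcing splits on curves that are already flat      */
/* enough.                                                             */
static int
needs_split( long  s,
             long  L )
{
  return s > L * (long)( FT_MAX_CURVE_DEVIATION / 0.75 );
}
```

With L = 100, s = 2000 the deviation bound is 15 (no split needed); with s = 2200 it is 16.5 (split needed).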
Re: [ft-devel] latest patch file for spline flattening
> Therefore, combined with the fact that it significantly speeds up
> font rendering, and solves the bug that prompted this latest burst
> of work, I recommend that it goes into FreeType.

Great! Again, thanks a lot, David and Graham, for your hard work on improving this crucial part of FreeType.

> Werner, can you give a ruling on whether assignments in conditionals
> are allowed in FreeType code?

If it is allowed in C89, then everything's fine. Please post the final version; I shall commit it (I assume that your git access still doesn't work...).

Werner
Re: [ft-devel] latest patch file for spline flattening
On 09/12/10 09:36, Werner LEMBERG wrote:

>> Werner, can you give a ruling on whether assignments in conditionals
>> are allowed in FreeType code?
>
> If it is allowed in C89, then everything's fine. Please post the
> final version; I shall commit it (I assume that your git access
> still doesn't work...).

I'd rather assignments were moved outside the expressions. It makes the code more maintainable.

My 0.02CAD,

behdad
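[Editorial note: to illustrate the style point being debated, here are two equivalent ways of writing the patch's limit test. This is illustrative code only, not taken from the patch; the limit computation is simplified to an integer multiply.]

```c
/* Both functions compute the same thing; the second hoists the        */
/* assignment out of the conditional, which is the style preferred     */
/* above.  (Illustrative code, not from the patch.)                    */
static int
split_test_embedded( long  s,
                     long  L )
{
  long  s_limit;


  if ( s > ( s_limit = L * 21 ) )   /* assignment inside the condition */
    return 1;
  return 0;
}

static int
split_test_hoisted( long  s,
                    long  L )
{
  long  s_limit = L * 21;           /* assignment on its own line */


  if ( s > s_limit )
    return 1;
  return 0;
}
```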
RE: [ft-devel] latest patch file for spline flattening
Graham,

Here are the results of my performance testing. I was a bit surprised by the results.

In gray_convert_glyph, the time is distributed as follows:

                     OLD    NEW
  render_line        20%    15%
  render_cubic       15%    33%
  render_scanline    14%    10%
  split_cubic         6%     9%

OLD is the pre-2.4.0 code; NEW is the latest patch from you. These percentages are the fraction of time spent in the specific function (excluding children).

Including children, we have the following actual times per call for handling cubic curves:

                     OLD    NEW
  render_cubic     142us  220us

I wasn't expecting your new code to be slower. So I ran my trace code on it, with the following results:

                                     OLD    NEW
  average line segs per arc         13.5   11.3
  min line segs per arc                2      1
  max line segs per arc               32    133
  average deviation per line seg    0.29   0.44
  min deviation per line seg           0      0
  max deviation per line seg        22.2   15.8

Some arcs are creating a very large number of line segments. I expect (though I haven't verified) that it is this that is causing the slow-down. Below is the data for one curve that gets broken down into many tiny line segments.
David %^ 4604,0 2080,0 40,2020 40,4496 40,4496 - 40,4436 40,4436 - 40,4379 40,4379 - 41,4321 41,4321 - 44,4264 44,4264 - 47,4206 47,4206 - 51,4149 51,4149 - 56,4092 56,4092 - 62,4036 62,4036 - 68,3979 68,3979 - 74,3922 74,3922 - 81,3865 81,3865 - 90,3811 90,3811 - 99,3754 99,3754 - 109,3700 109,3700 - 119,3645 119,3645 - 131,3591 131,3591 - 142,3535 142,3535 - 154,3481 154,3481 - 166,3427 166,3427 - 181,3373 181,3373 - 195,3319 195,3319 - 210,3265 210,3265 - 225,3211 225,3211 - 243,3160 243,3160 - 259,3106 259,3106 - 277,3055 277,3055 - 295,3002 295,3002 - 314,2951 314,2951 - 333,2900 333,2900 - 354,2849 354,2849 - 375,2798 375,2798 - 397,2748 397,2748 - 418,2697 418,2697 - 440,2646 440,2646 - 463,2595 463,2595 - 487,2547 487,2547 - 536,2450 536,2450 - 588,2354 588,2354 - 641,2258 641,2258 - 697,2165 697,2165 - 756,2073 756,2073 - 817,1984 817,1984 - 879,1894 879,1894 - 943,1807 943,1807 - 1009,1720 1009,1720 - 1079,1637 1079,1637 - 1149,1554 1149,1554 - 1222,1474 1222,1474 - 1297,1395 1297,1395 - 1375,1319 1375,1319 - 1452,1243 1452,1243 - 1533,1169 1533,1169 - 1614,1097 1614,1097 - 1698,1028 1698,1028 - 1782,959 1782,959 - 1869,894 1869,894 - 1958,830 1958,830 - 2049,769 2049,769 - 2140,708 2140,708 - 2233,651 2233,651 - 2328,595 2328,595 - 2425,543 2425,543 - 2522,491 2522,491 - 2570,467 2570,467 - 2621,443 2621,443 - 2671,419 2671,419 - 2722,397 2722,397 - 2773,375 2773,375 - 2825,354 2825,354 - 2876,332 2876,332 - 2927,311 2927,311 - 2978,290 2978,290 - 3031,272 3031,272 - 3082,253 3082,253 - 3136,235 3136,235 - 3190,217 3190,217 - 3244,202 3244,202 - 3297,184 3297,184 - 3351,169 3351,169 - 3405,154 3405,154 - 3460,140 3460,140 - 3514,126 3514,126 - 3570,114 3570,114 - 3625,102 3625,102 - 3682,91 3682,91 - 3736,79 3736,79 - 3793,69 3793,69 - 3849,59 3849,59 - 3906,50 3906,50 - 3963,41 3963,41 - 4020,34 4020,34 - 4077,28 4077,28 - 4136,22 4136,22 - 4193,16 4193,16 - 4251,11 4251,11 - 4308,7 4308,7 - 4368,4 4368,4 - 4425,1 4425,1 - 4485,0 4485,0 - 4544,0 4544,0 
- 4604,0
Re: [ft-devel] latest patch file for spline flattening
That is very interesting and very useful - in fact I think the more surprising a test is, the more useful it is. I'll have to look into your test case carefully as well. I might not be able to do it for a day or two, though.

Where does your test data come from? Actual fonts, cooked-up data, or a mixture of both?

Best regards,

Graham

- Original Message
From: David Bevan david.be...@pb.com
To: Graham Asher graham.as...@btinternet.com
Cc: freetype-devel freetype-devel@nongnu.org
Sent: Tuesday, 7 September, 2010 12:40:21
Subject: RE: [ft-devel] latest patch file for spline flattening
RE: [ft-devel] latest patch file for spline flattening
After trying various other fonts, I settled on using a single large (14,000 glyphs; 800,000 Bezier curves) CID-keyed Type 1 font, which seemed to show pretty average behaviour.

I'm working on an implementation of something like Hain's algorithm now. It'll be interesting to see how it compares.

David %^

-Original Message-
From: GRAHAM ASHER [mailto:graham.as...@btinternet.com]
Sent: 07 September 2010 13:46
To: David Bevan
Cc: freetype-devel
Subject: Re: [ft-devel] latest patch file for spline flattening
Re: [ft-devel] latest patch file for spline flattening
With 14,000 glyphs, I imagine that's a CJK font. I think there might be different characteristics from a typical Latin font. I think we also ought to try out a similar number of Latin glyphs, which could be done by rasterizing all the glyphs in a Latin font at varying sizes and rotations. In some ways a CJK font is probably a more stressful test, because strokes occur at a greater number of angles and sizes; but Latin characters should be part of the benchmark.

I am not trying to foist more work on you; just musing.

Graham

- Original Message
From: David Bevan david.be...@pb.com
To: GRAHAM ASHER graham.as...@btinternet.com
Cc: freetype-devel freetype-devel@nongnu.org
Sent: Tuesday, 7 September, 2010 13:52:35
Subject: RE: [ft-devel] latest patch file for spline flattening
RE: [ft-devel] latest patch file for spline flattening
I've now implemented something based on Hain's research, and it seems to be measurably faster than previous FT approaches. I have used Hain's paper (now available from http://tinyurl.com/HainBez) to provide some sensible heuristics rather than implement all his stuff in detail, and done so without using square roots or even any divisions.

First, here are the trace results:

                                     OLD    NEW   HAIN
  average line segs per arc         13.5   11.3    2.1
  min line segs per arc                2      1      1
  max line segs per arc               32    133     16
  average deviation per line seg    0.29   0.44    6.5
  min deviation per line seg           0      0      0
  max deviation per line seg        22.2   15.8   15.7

By using reasonably accurate heuristics when deciding whether to split the curve, we create 5.5x fewer line segments. This cuts down the number of calls to split_cubic and the number of iterations within render_cubic.

And now the performance results. In gray_convert_glyph, the time is distributed as follows:

                     OLD    NEW   HAIN
  render_line        20%    15%    12%
  render_cubic       15%    33%     9%
  render_scanline    14%    10%    10%
  split_cubic         6%     9%     2%

The time spent in these functions has been significantly reduced as a fraction of processing time.

Including children, we have the following actual times per call for handling cubic curves:

                     OLD    NEW   HAIN
  render_cubic     142us  220us   61us

render_cubic is now more than twice as fast as it ever has been. The effect of the speed-up is even measurable as a 5-10% speed-up of my font rasterisation program (which is reading and writing data on top of using FT to do the actual rendering).

These tests are with the same Unicode font as before. I'll run some more tests with Latin-only fonts, though previous testing didn't show any significant performance differences between Latin and CJK. CJK glyphs just have more cubic Bezier curves on average, but a Bezier curve is a Bezier curve wherever it comes from.

The code is below. I hope I've followed Werner's coding standards, as far as I know what they are. Thanks.
David %^

  static void
  gray_render_cubic( RAS_ARG_ const FT_Vector*  control1,
                              const FT_Vector*  control2,
                              const FT_Vector*  to )
  {
    FT_Vector*  arc;


    arc      = ras.bez_stack;
    arc[0].x = UPSCALE( to->x );
    arc[0].y = UPSCALE( to->y );
    arc[1].x = UPSCALE( control2->x );
    arc[1].y = UPSCALE( control2->y );
    arc[2].x = UPSCALE( control1->x );
    arc[2].y = UPSCALE( control1->y );
    arc[3].x = ras.x;
    arc[3].y = ras.y;

    for (;;)
    {
      /* Check that the arc crosses the current band. */
      TPos  min, max, y;


      min = max = arc[0].y;

      y = arc[1].y;
      if ( y < min ) min = y;
      if ( y > max ) max = y;

      y = arc[2].y;
      if ( y < min ) min = y;
      if ( y > max ) max = y;

      y = arc[3].y;
      if ( y < min ) min = y;
      if ( y > max ) max = y;

      if ( TRUNC( min ) >= ras.max_ey || TRUNC( max ) < 0 )
        goto Draw;

      /* Decide whether to split or draw. */
      /* See Hain's paper at http://tinyurl.com/HainBez for more info. */
      {
        TPos  dx, dy, L, dx1, dy1, dx2, dy2, s1, s2;


        /* dx and dy are x- and y-components of the P0-P3 chord vector */
        dx = arc[3].x - arc[0].x;
        dy = arc[3].y - arc[0].y;

        /* L is an (under)estimate of the Euclidean distance P0-P3 */
        L = ( 236 * FT_MAX( labs( dx ), labs( dy ) ) +
               97 * FT_MIN( labs( dx ), labs( dy ) ) ) >> 8;

        /* avoid possible arithmetic overflow below by splitting */
        if ( L > 32767 )
          goto Split;

        /* s1 is L * the perpendicular distance from P1 to the line P0-P3 */
        s1 = labs( dy * ( dx1 = arc[1].x - arc[0].x ) -
                   dx * ( dy1 = arc[1].y - arc[0].y ) );

        /* max deviation is at least (s1/L) * sqrt(3)/6 (if v = -1) */
        if ( s1 > L * (TPos)( FT_MAX_CURVE_DEVIATION / 0.288675 ) )
          goto Split;

        /* s2 is L * the perpendicular distance from P2 to the line P0-P3 */
        s2 = labs( dy * ( dx2 = arc[2].x - arc[0].x ) -
                   dx * ( dy2 = arc[2].y - arc[0].y ) );

        /* max deviation may be as much as (max(s1,s2)/L) * 3/4 (if v = 1) */
        if ( FT_MAX( s1, s2 ) > L * (TPos)( FT_MAX_CURVE_DEVIATION / 0.75 ) )
          goto Split;

        /* if P1 or P2 is outside P0-P3, split */
        if ( dy * dy1 + dx * dx1 < 0 ||
             dy * dy2 + dx * dx2 < 0 ||
             dy * ( arc[3].y - arc[1].y ) + dx * ( arc[3].x - arc[1].x ) < 0 ||
             dy * ( arc[3].y - arc[2].y ) + dx * ( arc[3].x - arc[2].x ) < 0 )
          goto Split;

        /* no reason to split */
        goto Draw;
      }

    Split:
      gray_split_cubic( arc );
      arc += 3;
      continue;

    Draw:
      gray_render_line( RAS_VAR_ arc[0].x, arc[0].y );

      if ( arc == ras.bez_stack )
        return;

      arc -= 3;
    }
  }
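[Editorial note: the loop above delegates the actual halving to gray_split_cubic. A stand-alone sketch of that subdivision follows, simplified to double precision (the real FreeType routine works in place on fixed-point TPos values, with the curve stored end-first on ras.bez_stack); the struct and function names here are mine.]

```c
typedef struct  Vec_
{
  double  x, y;

} Vec;


/* De Casteljau subdivision of a cubic at t = 1/2 (a sketch, not the   */
/* FreeType source).  On entry base[0..3] holds the curve from start   */
/* to end; on return base[0..3] and base[3..6] hold the two halves,    */
/* sharing base[3], which lies on the original curve at t = 1/2.       */
static void
split_cubic( Vec*  base )
{
  Vec  a  = { ( base[0].x + base[1].x ) / 2, ( base[0].y + base[1].y ) / 2 };
  Vec  b  = { ( base[1].x + base[2].x ) / 2, ( base[1].y + base[2].y ) / 2 };
  Vec  c  = { ( base[2].x + base[3].x ) / 2, ( base[2].y + base[3].y ) / 2 };
  Vec  ab = { ( a.x + b.x ) / 2, ( a.y + b.y ) / 2 };
  Vec  bc = { ( b.x + c.x ) / 2, ( b.y + c.y ) / 2 };
  Vec  m  = { ( ab.x + bc.x ) / 2, ( ab.y + bc.y ) / 2 };


  base[6] = base[3];
  base[5] = c;
  base[4] = bc;
  base[3] = m;
  base[2] = ab;
  base[1] = a;
}
```

The shared point base[3] equals (P0 + 3 P1 + 3 P2 + P3) / 8, the curve point at t = 1/2.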
RE: [ft-devel] latest patch file for spline flattening
Here are some test results with Latin fonts (40 thousand curves from fonts at various point sizes).

Trace results:

                                      CJK    CJK    CJK  LATIN  LATIN
                                      OLD    NEW   HAIN    NEW   HAIN
  average line segs per arc          13.5   11.3    2.1   30.9    6.1
  max line segs per arc                32    133     16    163     18
  average deviation per line seg     0.29   0.44    6.5   0.37    7.4
  max deviation per line seg         22.2   15.8   15.7    7.9   15.7

Performance results: in gray_convert_glyph, the time is distributed as follows:

                      CJK    CJK    CJK  LATIN  LATIN
                      OLD    NEW   HAIN    NEW   HAIN
  render_line         20%    15%    12%    14%    11%
  render_cubic        15%    33%     9%    34%    11%
  render_scanline     14%    10%    10%    10%    11%
  split_cubic          6%     9%     2%    10%     3%

Including children, we have the following actual times per call for handling cubic curves:

                      CJK    CJK    CJK  LATIN  LATIN
                      OLD    NEW   HAIN    NEW   HAIN
  render_cubic      142us  220us   61us  546us  176us

Conclusions:

The performance improvement is as evident with Latin fonts as with CJK ones. However, on average, Bezier curves from Latin fonts require more flattening (6 segments versus 2 with the Hain implementation), so processing them takes longer. As Graham pointed out to me: the curves used in Latin and other Latin-like alphabets are very often used to navigate 90-degree corners; P0 and P1 lie on a grid line, and so do P2 and P3. This is very rarely true in Han characters.

On the other hand, Latin glyphs contain fewer Bezier curves than CJK (6 versus 57 on average with my data). The upshot of both of these together is that the performance change is very similar (the CJK and Latin time-distribution figures are so similar they could be from the same test).

David %^
RE: [ft-devel] latest patch file for spline flattening
Graham,

That's looking much closer to what I would have thought we needed; only splitting the curve when required.

However, your fast heuristic can be very inaccurate. Consider

  P0: 0,0   P1: 5,5   P2: 95,5   P3: 100,0

The max deviation is 3.75 (0.75 * 5, since Hain's v == 1), but your heuristic gives a value of 45 - twelve times too great.

Btw, I think that dx1, dy1, ... need to be of type TPos, not int.

On the issue of avoiding square roots, a little bit of algebra shows that it is possible to estimate r = sqrt(dx^2 + dy^2) with a simple linear combination of dx and dy. For example, if an overestimate is required, and dx >= dy, you can use

  r_upperbound = dx + (sqrt(2) - 1) * dy

which overestimates by no more than 8.5%. For integer arithmetic, sqrt(2) - 1 can be (over)estimated by 107/256. For example, you could use this approximation to do something like this:

  dx1 = abs( control1->x - midx );
  dy1 = abs( control1->y - midy );
  if ( dx1 >= dy1 )
    dr1 = dx1 + ( ( 107 * dy1 + 255 ) >> 8 );
  else
    dr1 = dy1 + ( ( 107 * dx1 + 255 ) >> 8 );

  dx2 = abs( control2->x - midx );
  dy2 = abs( control2->y - midy );
  if ( dx2 >= dy2 )
    dr2 = dx2 + ( ( 107 * dy2 + 255 ) >> 8 );
  else
    dr2 = dy2 + ( ( 107 * dx2 + 255 ) >> 8 );

  /* deviation never exceeds 75% of control point distance */
  if ( dr1 >= dr2 )
    dev_max = ( 3 * dr1 + 3 ) >> 2;
  else
    dev_max = ( 3 * dr2 + 3 ) >> 2;

  if ( dev_max <= FT_MAX_CURVE_DEVIATION )
    ...

Of course, this doesn't resolve the problem I mentioned above; for the example curve, this gives dev_max a value of 36 - still nine times too great.

I hope to have something based a bit more closely on Hain's paper by the end of tomorrow. I may use something like the square-root-avoiding estimate above to approximate his L value.

David %^

-Original Message-
From: freetype-devel-bounces+david.bevan=pb@nongnu.org
[mailto:freetype-devel-bounces+david.bevan=pb@nongnu.org] On Behalf Of Graham Asher
Sent: 6 September 2010 11:10
To: freetype-devel
Subject: [ft-devel] latest patch file for spline flattening

Here's a new version of my spline flattening patch. (I would like to be able to push this to the git repository but am having authentication problems; Werner has been helping me, but no success so far, probably because of my ineptitude in these matters.)

The nub of the latest change is that I found that predicting the number of recursion levels is not reliable when splitting a cubic spline for flattening. A better way is to check the flatness inside the loop, using the fast heuristic of taking the maximum coordinate difference of a control point from the chord midpoint. This also makes the code simpler - and, surprisingly, faster, according to my benchmark; however, my benchmark is based on cartographic, not typographic, use, so will need confirmation.

The patch obviously still solves the bug involving s-shaped curves (control points on both sides of the chord).

Graham
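[Editorial note: David's 107/256 trick can be checked numerically. A sketch packaging the estimate from the message above (the function name is mine):]

```c
#include <stdlib.h>

/* Overestimate of sqrt( dx^2 + dy^2 ) from the message above:         */
/* max + ( ( 107 * min + 255 ) >> 8 ), where 107/256 slightly exceeds  */
/* sqrt(2) - 1.  Since sqrt( 1 + (107/256)^2 ) ~= 1.084, the result    */
/* never undershoots the true length, and never overshoots it by more  */
/* than about 8.5% (plus at most 1 from the rounding term).            */
static long
chord_length_overestimate( long  dx,
                           long  dy )
{
  long  ax = labs( dx );
  long  ay = labs( dy );
  long  hi = ax > ay ? ax : ay;
  long  lo = ax > ay ? ay : ax;


  return hi + ( ( 107 * lo + 255 ) >> 8 );
}
```

For the 3-4-5 triangle (dx, dy) = (300, 400) this gives 526 against a true length of 500, an overestimate of about 5%.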
Re: [ft-devel] latest patch file for spline flattening
David, in fact I coded up and tested a different version using an accurate calculation of the control point deviation, but it was slower than the version I am proposing. I'll try your version; and I would be grateful if you could also do some benchmarking, because obviously we are not trying to minimise the number of straight line segments the curve is dissected into, but the overall speed. My aims, and my benchmarking, are rather biased, because my main use of cubic splines is for approximations to arcs of circles, used as parts of path envelopes when drawing maps. Graham David Bevan wrote: Graham, That's looking much closer to what I would have thought we needed; only splitting the curve when required. However, your fast heuristic can be very inaccurate. Consider P0: 0,0 P1: 5,5 P2: 95,5 P3: 100,0 The max deviation is 3.75 (0.75 * 5 since Hain's v == 1), but your heuristic gives a value of 45 - twelve times too great. Btw, I think that dx1, dy1, ... need to be of type TPos, not int. On the issue of avoiding square roots, a little bit of algebra shows that it is possible to estimate r = sqrt(dx^2 + dy^2) with a simple linear combination of dx and dy. For example, if an overestimate is required, and dx = dy, you can use r_upperbound = dx + (sqrt(2) - 1) * dy which overestimates by no more than 8.5%. For integer arithmetic, sqrt(2) - 1 can be (over)estimated by 107/256. For example, you could use this approximation to do something like this: dx1 = abs(control1-x - midx); dy1 = abs(control1-y - midy); if (dx1 = dy1) dr1 = dx1 + (107 * dy1 + 255) 8; else dr1 = dy1 + (107 * dx1 + 255) 8; dx2 = abs(control2-x - midx); dy2 = abs(control2-y - midy); if (dx2 = dy2) dr2 = dx2 + (107 * dy2 + 255) 8; else dr2 = dy2 + (107 * dx2 + 255) 8; /* deviation never exceeds 75% of control point dist */ if (dr1 = dr2) dev_max = (3 * dr1 + 3) 2; else dev_max = (3 * dr2 + 3) 2; if (dev_max = FT_MAX_CURVE_DEVIATION) ... 
Of course, this doesn't resolve the problem I mentioned above; for the example curve, this gives dev_max a value of 36 - still nine times too great.

I hope to have something based a bit more closely on Hain's paper by the end of tomorrow. I may use something like the square-root-avoiding estimate above to approximate his L value.

David

-----Original Message-----
From: freetype-devel-bounces+david.bevan=pb@nongnu.org [mailto:freetype-devel-bounces+david.bevan=pb@nongnu.org] On Behalf Of Graham Asher
Sent: 6 September 2010 11:10
To: freetype-devel
Subject: [ft-devel] latest patch file for spline flattening

[...]
RE: [ft-devel] latest patch file for spline flattening
I'll do that. I wonder how much of the cost of FT_Load_Glyph is actually spent in gray_render_cubic, and how much impact reducing the number of line segments has on later phases of the rendering process.

I'm also trying to see if I can come up with a heuristic that is both fast (i.e. simple linear combinations of positions) and accurate (i.e. always correct, with any overestimate reasonably bounded). Then we may be able to avoid the complexities of Hain's approach. We do seem to be converging towards something that meets all the requirements.

I probably won't achieve much today, but hope to spend most of tomorrow on this.

David

-----Original Message-----
From: Graham Asher [mailto:graham.as...@btinternet.com]
Sent: 6 September 2010 12:36
To: David Bevan
Cc: freetype-devel
Subject: Re: [ft-devel] latest patch file for spline flattening

[...]

David Bevan wrote:
[...] Btw, I think that dx1, dy1, ... need to be of type TPos, not int.