On 12/11/06, Eero Tamminen <[EMAIL PROTECTED]> wrote:
> Hi,
>
> ext Jorn Baayen wrote:
> > Argh! It would have been too simple to be true. The difference in
> > profiles must be due to general profiling fuzziness.
>
> Is there anymore anything where both what is being calculated and
> the result are in integers but they are casted to and multiplied with
> floating point values for additional accuracy?

Assuming we're only talking about pangocairo for the moment, no. Since
Behdad made glyph_extents pretty much go away entirely, there is now only
one place in pangocairo that is responsible for FP burn (in all the test
cases that I've seen): the loop in pango_cairo_renderer_draw_glyphs. For
each glyph, this statement is executed (for both x and y):

    double cx = crenderer->x_offset +
                (double)(x + x_position + gi->geometry.x_offset) / PANGO_SCALE;

This produces an int->double conversion (slow), a double multiply [1]
(very slow) and a double add (very slow), twice for each glyph on every
expose. The resulting double is later used to populate the x and y
members (also doubles) of the cairo_glyph_t that is sent to
cairo_show_glyphs.

I really think that the above line(s) are responsible for pretty much all
the __muldf3, __adddf3 and __floatsidf we see here (at the top of the
profile):

http://folks.o-hand.com/~jorn/pango-benchmarks/28-pango-1.15.1/pango-cairo.txt

I have an idea of how to get rid of these FP ops, too, but I've been
concentrating on cairo at the moment and haven't gotten around to coding
anything up yet. I'll outline the idea here in case someone wants to beat
me to it (and so I can refer back to it later when I forget :)

First, we can convert the crenderer offsets to fixed point before we
enter the loop. This will allow us to eliminate the __adddf3, as the
result of the (x + x_position + gi->geometry.x_offset) expression is a
fixed point number anyway (right, Behdad?), so we can just change the
expression to
(x + x_position + gi->geometry.x_offset + crenderer_x_offset_fixed).

What's left is the conversion from fixed->double, which can be done
without the __muldf3 or the __floatsidf. Basically, the number of leading
zeros in the fixed point number can be used to determine the exponent
value of the target double, and since the number is in fixed point,
you'll need to use a bias that is adjusted for the size of the fractional
part of the fixed point number.
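
To make the outline concrete, here is a rough sketch of both steps in C. It is only an illustration of the idea, not code from pango or cairo: offset_to_fixed, fixed_to_double and glyph_coord_fixed are hypothetical names, PANGO_SCALE is defined locally as 1024 (10 fractional bits), and the packing assumes IEEE 754 doubles that share byte order with uint64_t. The leading-zero count is a plain loop here; a real version would presumably use the CPU's count-leading-zeros instruction.

```c
#include <assert.h>
#include <stdint.h>

#define PANGO_SCALE 1024   /* assumed: 10 fractional bits */
#define FRAC_BITS   10

/* Portable leading-zero count; v must be nonzero. */
static int count_leading_zeros32(uint32_t v)
{
    int n = 0;
    while ((v & 0x80000000u) == 0) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Done once, before the glyph loop (hypothetical helper). Rounds to the
 * nearest 1/PANGO_SCALE, so an offset that is not a multiple of 1/1024
 * loses a little precision. */
static int32_t offset_to_fixed(double offset)
{
    double scaled;
    assert(offset > -2097152.0 && offset < 2097152.0); /* fits in 32 bits */
    scaled = offset * PANGO_SCALE;
    return (int32_t)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}

/* Build the double by hand: no int->double conversion, multiply or add. */
static double fixed_to_double(int32_t v)
{
    union { uint64_t u; double d; } pack;
    uint64_t sign, exponent, mantissa;
    uint32_t mag;
    int msb;

    if (v == 0)                      /* special case: no leading one bit */
        return 0.0;

    sign = (v < 0) ? 1 : 0;
    mag  = (v < 0) ? (uint32_t)0 - (uint32_t)v : (uint32_t)v;

    /* Position of the highest set bit gives the unbiased exponent,
     * adjusted for the fractional bits of the fixed point format. */
    msb      = 31 - count_leading_zeros32(mag);
    exponent = (uint64_t)(msb - FRAC_BITS + 1023);

    /* Move the leading one to bit 52, then mask it off (it is implicit).
     * msb <= 31, so this is always a left shift and nothing is rounded. */
    mantissa = ((uint64_t)mag << (52 - msb)) & ((UINT64_C(1) << 52) - 1);

    pack.u = (sign << 63) | (exponent << 52) | mantissa;
    return pack.d;
}

/* Per-glyph work: integer adds plus the bit packing above. */
static double glyph_coord_fixed(int32_t offset_fixed, int x, int x_position,
                                int glyph_x_offset)
{
    return fixed_to_double(x + x_position + glyph_x_offset + offset_fixed);
}
```

For example, with an offset of 2.5 and x + x_position + gi->geometry.x_offset adding up to 307, glyph_coord_fixed(offset_to_fixed(2.5), 100, 200, 7) should give the same value as 2.5 + 307 / 1024.0. The special cases handled in this sketch are zero and negative values; anything else (overflow of the integer sum, for instance) is left out.
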
After shifting the fixed point the proper amount (based on the number of
leading zeros again), you'll have your exponent and mantissa all set to
pack into a union. Copy the double from the union into the cairo_glyph
coordinate, and you're done. Need to watch out for some special cases,
but I think that the approach is sound.

Once that is done, pangocairo should be pretty much FP free for the
typical code paths that I would expect to see on the 770. On timetext.c
or the torturer's GtkTextView, I don't think you'll see _that_ much
improvement (percentage-wise) from this change until you get Xan's
XRender glyph optimization into cairo, as that is a bigger bottleneck
ATM, I think.

[1] Because the denominator of the divide is a constant, the compiler
converts it into a multiply, which is faster.

Dan
_______________________________________________
Performance-list mailing list
Performance-list@gnome.org
http://mail.gnome.org/mailman/listinfo/performance-list