Hi,

I decided to review Billy's glyph extents patch, which was the last remaining bit of the major optimization work from three weeks ago. In the process I came up with my own, completely different patch [2], which I'm posting. I think we can pick the good stuff from the two and close this issue.
We are talking about pangocairo-fcfont.c. The current implementation creates a hash table per PangoCairoFcFont and drops whatever glyph extents it computes into that table. The hash table operations show up in the profiles at about 2.5%. So note that we are talking about as little as 2.5%; if you feel like it's a waste of time, feel free to stop reading now :).

Billy did a minimal patch [1] that removes the g_hash_table and instead allocates a fixed-size, 1024-bucket custom hash table, with each bucket a linked list. That definitely has its merits, and may work pretty fast as well. But I didn't like allocating 4kb of hash table up front and then mallocing an item (40 bytes) per glyph later.

What I went for instead is a copy of what Federico did for the gunichar->glyph lookup: a fixed-size, last-only, 256-bucket cache. In fact I even shared the cache-reporting facilities with his cache. I also reordered the code so that we make fewer cairo calls. I made the per-glyph item stored in the cache smaller, bringing a 256-item cache down to 6kb, which is comparable in size to Billy's array of NULLs.

The hit ratio in all my tests has been >99%. In fact, in any realistic scenario, the hit ratio of this cache should be exactly the same as that of Federico's cache, since if you convert a gunichar to a glyph, you will ask for that glyph's extents sooner or later.

One problem with this approach is that, unlike the original code and vektor's, mine does not cache all glyph extents ever queried. I would like to see that as a plus: the cache does not grow unbounded. Besides, cairo and FreeType have their own caches, so we are just adding a small L1 cache on top of them. Very reasonable IMHO. What do people think?

As for speed, I did one measurement and it performed almost like vektor's, although I expect it to be a bit slower since the cache size is 256, not 1024. It would be nice if somebody else benchmarked it too.
[1] http://cvs.gnome.org/viewcvs/pango-profile/patches/vektor-glyph-extent-hash.diff?rev=1.1&view=markup
[2] http://cvs.gnome.org/viewcvs/pango-profile/patches/behdad-glyph-extent-hash.patch?rev=1.1&view=markup

Thanks,
--behdad

_______________________________________________
Performance-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/performance-list
