The largest chunk of data needed for font matching is the Unicode coverage
for each font; this coverage is represented internally by a sparse bit
array stored in a btree -- each level of the btree represents one byte
from the unicode coverage, so trees are at most four levels deep.
At the bottom, a bit array represents the font coverage within a single
page; 256 bits for each of the members of that page.
A BMP font with glyphs in 8 pages takes:
1 branch node (256 pointers + overhead) 1280
8 leaves (256/8 bytes) 8 * 32 256
1 header 16
----
1552
Multiply that by the number of fonts in the system to find that 400 fonts
takes over 600K of memory. Seems a bit excessive to me.
I've taken steps to reduce this, using the obvious fact that many fonts
cover the same glyphs in each page. I now share branches and leaves
wherever possible. The results are encouraging. In my configuration with
383 fonts:
total total mem unique unique mem
leaves: 4962 158784 793 25376
branches: 360 460800 75 96000
total 619584 121376
This reduces memory consumption by a factor of 5.
Savings improve when adding more fonts; increase the configuration to
include several thousand of the ugliest fonts you've ever seen and
with 4803 fonts:
total total mem unique unique mem
leaves: 46572 1490304 1705 54560
branches: 4779 6117120 519 664320
total 7607424 718880
This reduces memory consumption by a factor of 10.
(for the curious, there are 20 some fonts in this list which have no
Unicode mapping, and hence have no branch allocated).
This analysis highlights one other direction for savings -- the actual
bitmaps are taking only a small fraction of the total space; I can
represent the coverage of nearly 5000 fonts in only 50KB, but holding the
various branches above that takes more than 10 times as much space.
I like the current data structure's performance in locating a leaf; it
consists of walking down the btree a byte at a time. However, wasting
this much memory for inactive fonts doesn't justify the cost. In the 519
branches above, only 6071 of their 132864 entries had non-zero values.
I think I'll convert the branches to a packed representation and take the
performance hit when searching. That should reduce memory usage for
the branches from 664320 to 32403 bytes. I'll take the performance hit
for the 630K savings; the reduction in cache misses should more than
compensate.
Keith Packard XFree86 Core Team HP Cambridge Research Lab
_______________________________________________
Fonts mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/fonts