2013-01-09 11:57, Leif Halvard Silli wrote:
> Not sure which fallacy you have identified - see below.
I was referring to a comparison between an ad hoc 8-bit encoding and a Unicode encoding in which you count the sizes of font files in the first case only. I’m a bit confused by your comparison, which seems to deal with a page that uses a downloadable font in both cases, but with some rather obscure fonts (from a site that has no main page etc.). In any case, my point might not apply to your specific comparison, but it applies to the general scenario:
When you use a “fontistic trick”, based on a font that arbitrarily maps bytes 0–255 to whatever glyphs are wanted this time, the font is a necessary ingredient of the soup. When using a Unicode encoding for character data, you do not depend on any specific font, but the data still has to be rendered in *some* font. And the rarer the characters you use, in terms of coverage in the fonts normally available on computers worldwide, the more you will in practice be forced to deal with the font issue, not just for typography, but for having the characters displayed at all. Quite often this means that you need to embed a font (in a Word or PDF document), write an elaborate font-family list in CSS, or use @font-face. Besides, on web pages, you normally need to provide a downloadable font in three or four formats to be safe.
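
To illustrate what I mean (the font name and file URLs here are made up, and the exact list of formats and fallback fonts varies), such a rule typically looks something like this:

@font-face {
  font-family: "RareGlyphsWeb";                 /* hypothetical downloadable font */
  src: url("rareglyphs.eot");                   /* old IE */
  src: url("rareglyphs.eot?#iefix") format("embedded-opentype"),
       url("rareglyphs.woff") format("woff"),   /* most current browsers */
       url("rareglyphs.ttf") format("truetype"),
       url("rareglyphs.svg#RareGlyphsWeb") format("svg");  /* legacy iOS */
}
.rare-text {
  /* An elaborate font-family list: fonts that may cover the rare characters,
     with the downloadable font and a generic family as fallbacks. */
  font-family: "Code2000", "Segoe UI Symbol", "RareGlyphsWeb", sans-serif;
}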
So, quite often, the size of the data is increased – actually more due to the size of fonts than due to character encoding issues. But in the vast majority of cases, this price is worth paying. After all, if saving bits were our only concern, we would be using a 5-bit code. ☺
Yucca

