Doug Ewell wrote:

> There have been lots of attempts to define short mnemonic names or
> "entities" for Unicode. SGML names are one. The "i18nrep
> repertoiremap," originally defined in RFC 1345 and more recently used in
> ISO/IEC TR 14652, is another. These schemes work well for a relatively
> small number of characters, say a thousand, but become unwieldy and
> anti-mnemonic when applied to a larger set of characters. There simply
> aren't enough short mnemonic names to go around.

Yes, but my suggestion was to read character names on a per-font basis. So when an HTML file contains something like <FONT face="Symbol">, only the Symbol font has to be read and scanned for names. This completely removes the need to keep the full list of entity names in memory at all times.

> (..) the scenario Pim describes might work (although
> asking a browser to interpret the internal structure of a font file
> seems excessive to me). But the same mechanism is less likely to work

Is it really so excessive? Browsers already have to retrieve a lot of font info, such as the line height. Why not character names? I could easily envision a function call like "GetCharPSName()"...

and John Hudson wrote:

> You are presuming that all fonts contain name strings for glyphs.

As a matter of fact, I know they don't, and even when they do, the names are often things like "uniF10C", which aren't really very helpful.

Naturally my scheme would break down if the given fonts do not contain the character names we expect. But in the scheme currently in use, things go wrong if the fonts don't contain the numerical indexes we expect! As a real-life example, if the browser assumes a character like "INTEGRAL EXTENSION" to be in the Symbol font at codepoint U+23AE, but its index in the font really is U+F0F4, the character is not found. If the browser were to look for a character named "integralex" in the font instead, the character would be found, no matter what codepoint it was on.

Anyway, I'm aware this is getting a bit out of scope for the Unicode Mailing List, so I propose we leave it at this. Unless someone is willing to rewrite an Internet browser, in which case I'll be happy to continue this discussion by private e-mail...

Pim Blokland
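
For illustration only, the name-based lookup argued for above might be sketched roughly as follows. This is a toy model, not a real font API: the ToyFont class, its methods, and find_glyph() are all hypothetical names, and the font is reduced to a small table mapping codepoints to glyph names. The only data taken from the message is the "integralex" glyph sitting at U+F0F4 while the browser expects U+23AE.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyFont:
    """Hypothetical stand-in for a parsed font file (not a real font API)."""
    family: str
    # Maps the codepoint the font actually uses to the glyph's name.
    glyphs: Dict[int, str] = field(default_factory=dict)

    def has_codepoint(self, codepoint: int) -> bool:
        """Lookup by numerical index, as in the current scheme."""
        return codepoint in self.glyphs

    def codepoint_for_name(self, name: str) -> Optional[int]:
        """Scan this one font for a glyph name; None if it is absent."""
        for codepoint, glyph_name in self.glyphs.items():
            if glyph_name == name:
                return codepoint
        return None


def find_glyph(font: ToyFont, expected_codepoint: int, name: str) -> Optional[int]:
    """Try the expected codepoint first, then fall back to the glyph name."""
    if font.has_codepoint(expected_codepoint):
        return expected_codepoint
    return font.codepoint_for_name(name)


# Assumption for this sketch: the Symbol font exposes "integralex" at U+F0F4.
symbol = ToyFont("Symbol", {0xF0F4: "integralex"})

# The browser expects INTEGRAL EXTENSION at U+23AE, which this font lacks.
by_index = symbol.has_codepoint(0x23AE)
by_name = find_glyph(symbol, 0x23AE, "integralex")
```

Here the lookup by index fails, while the lookup by name still locates the glyph at whatever codepoint the font happens to use. As noted above, the fallback returns nothing for fonts that carry no glyph names, and is of little use when the names are of the "uniF10C" kind.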

