Use a font containing only one character, U+FFFE, with a glyph of known width, when displaying the measuring divs. The font can be specified for these divs using @font-face.
Cibu

On Tue, Jun 15, 2010 at 11:13 AM, Ed Trager <[email protected]> wrote:
> Hi Unicoders,
>
> Suppose that we write Unicode text in a web page that we create. We
> are worried that our viewers' computers lack a font for proper display
> of the script in which our text is written. Obviously it will not be
> good if our users only see square boxes or question marks instead of
> the text that we want them to be able to see and read:
>
>     □□□□□□□□□ ... <= Bad! :-(
>
> We want a solution to this problem.
>
> Until very recently, apparently the best we could do was to warn the
> user of the possibility of unrenderable text. For example Wikipedia,
> on pages related to Indic languages, says:
>
> “This article contains Indic text. Without proper rendering support,
> you may see question marks or boxes, misplaced vowels or missing
> conjuncts instead of Indic text.”
>
> But now that “good” browsers support @font-face, we can envision a
> better solution: If the browser does not have a font for rendering a
> specific script, we can dynamically supply one.
>
> I have written some simple Javascript to detect whether a user's web
> browser can display Unicode text in a specific ISO 15924 script.
> Here's how it works, using Javascript:
>
> * Create two divs on the page but set the CSS opacity to zero so
>   the user doesn't see them.
> * In one div, place a relatively narrow letter from the target
>   script. For example, for Latin one might choose "i".
> * In the other div, place a relatively wider letter from the target
>   script. For Latin, "w" is an obvious choice.
> * If the width of the two divs is identical, then the letters were
>   rendered as square boxes or question marks.
> * Otherwise, if the widths differ, then the browser has found a
>   system font capable of rendering the text.
>
> In the case of a negative result where the widths are the same, we can
> then dynamically add an @font-face rule to the page to download an
> appropriate font.
> I have an experimental web application that already
> does exactly this to support Tai Tham (Lanna) script. As Lanna is a
> fairly recent addition to Unicode, only a very few people will have a
> Lanna font available on their machines.
>
> Astute unicoders on this list will probably already have recognized
> one or more shortcomings of this method. This method works perfectly
> for most scripts, but of course it fails for monospaced scripts like
> Chinese, Japanese, Korean, Yi, and possibly some others like Phags Pa.
>
> For monospaced scripts, I tried doing this:
>
> * In the first div put U+FFFE. Every browser I tested rendered
>   U+FFFE as a square box.
> * In the second div put a representative character from the
>   script, such as "中" or "文" for Chinese.
>
> In theory, the U+FFFE will always be rendered as a box with a fixed
> width, and one would expect that there is a fairly good probability
> that the fixed width of any Chinese font on the machine will not be
> exactly the same as the width of the fallback square box.
>
> But in practice, based on my tests, this does not work. One problem
> is that Firefox's fallback square boxes contain the Unicode code point
> hex digits -- and these fallback square boxes can actually be of
> different widths depending on the hex codes contained therein. Also
> it might just happen that the fixed width of the Chinese glyph is
> exactly the same width as that of the fallback box used to render the
> U+FFFE.
>
> It would be very nice to come up with a reliable solution for scripts
> that are traditionally monospaced. Does anyone have any brilliant
> ideas?
>
> - Ed Trager
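
For readers who want to try the narrow/wide comparison described in the quoted message, here is a minimal sketch in plain JavaScript. It is not Ed's actual code. The function names, the font family "LannaFallback" and the font URL used below are placeholders, and the divs are absolutely positioned so that they shrink to the width of their text, a detail the message does not spell out.

```javascript
// Equal widths suggest both letters fell back to the same placeholder box.
function looksRenderable(narrowWidth, wideWidth) {
  return narrowWidth !== wideWidth;
}

// Measure the rendered width of `text` in an invisible div.
// `doc` is a DOM Document (pass `document` in a browser).
function measureWidth(doc, text) {
  const div = doc.createElement("div");
  div.style.opacity = "0";          // invisible to the user, still laid out
  div.style.position = "absolute";  // shrink-to-fit width, out of page flow
  div.style.whiteSpace = "nowrap";
  div.textContent = text;
  doc.body.appendChild(div);
  const width = div.offsetWidth;    // integer pixel width of the box
  doc.body.removeChild(div);
  return width;
}

// Hypothetical helper: append an @font-face rule for a downloadable font.
// The page's own CSS still has to name `family` for the text in question.
function addFallbackFont(doc, family, url) {
  const style = doc.createElement("style");
  style.textContent =
    `@font-face { font-family: "${family}"; src: url("${url}"); }`;
  doc.head.appendChild(style);
}

// Returns true if a fallback font rule was added, false if the script
// already appeared to render with a system font.
function ensureScriptFont(doc, narrow, wide, family, url) {
  const narrowWidth = measureWidth(doc, narrow);
  const wideWidth = measureWidth(doc, wide);
  if (looksRenderable(narrowWidth, wideWidth)) {
    return false;
  }
  addFallbackFont(doc, family, url);
  return true;
}
```

In a browser this might be called as ensureScriptFont(document, "i", "w", "LannaFallback", "fonts/lanna.woff"), with a narrow and a wide letter from the target script substituted for "i" and "w". As the message points out, equal widths are also what a monospaced script produces, so this check reports a false negative there.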

