There simply isn't a single font that covers all languages and their orthographic subtleties. Arial Unicode MS and Code 2000 are as close as you get.
What you often need is some way, over and above the Unicode characters themselves, to specify the language a text is in so that software can choose an appropriate font. In HTML, the 'lang' attribute tells the browser the language of the text within a specific element. Linguistically sophisticated software can then select an appropriate font and handle things like ligatures and combined forms correctly.

If your software isn't that sophisticated, you may have to specify a font and literally code ligatures in your text, for instance using U+FB01 "LATIN SMALL LIGATURE FI" instead of the two characters 'f' and 'i' when you want the ligature to appear. But this may throw off things like spell-checking and alphabetization, because the ligature is a different character from the two letters it stands for.

In either case, you will need a set of different fonts to cover the range of languages you're using. If your software won't let you change fonts as languages change, then you're stuck with the compromise of a 'Unicode' font like Arial Unicode MS.

Chris Gray
Library Systems
University of Waterloo

On Thu, 12 Feb 2004, Jacobs, Jane W wrote:

> We've recently been doing some beta testing of Unicode-compliant software,
> which must, for the time being, remain nameless. The problem is that once
> you REALLY start to do a broad range of Unicode languages weird things crop
> up pretty fast. One we've discovered is the limitations of Arial Unicode
> MS. I ran across this as I was working on the ligatures. According to the
> Unicode standard they are "combining." However, if you look in many library
> catalogs, including ours, you will see that they do not combine. At first I
> thought this was a system problem, but it looks like the trail leads all the
> way back to the font. Bengali is even worse. Combining consonants often
> refuse to combine.
>
> The data I have is that the Arial Unicode MS font has provision for
> approximately 44,000 Unicode characters, not the full 100,000 characters
> available in Unicode. The shareware Code 2000 font resolves some problems,
> but has provision for approximately 34,000 characters: 12,000 fewer
> characters than Arial Unicode.
>
> Anybody got any better font ideas???
> JJ
>
> **Views expressed by the author do not necessarily represent those of the
> Queens Library.**
>
> Jane Jacobs
> Asst. Coord., Catalog Division
> Queens Borough Public Library
> 89-11 Merrick Blvd.
> Jamaica, NY 11432
>
> tel.: (718) 990-0804
> e-mail: [EMAIL PROTECTED]
> FAX. (718) 990-8566
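
P.S. To illustrate the point in the reply about hard-coded ligatures throwing off spelling and alphabetization, here is a minimal Python sketch. It assumes that folding text with Unicode NFKC normalization is acceptable for comparison purposes; the helper name fold_ligatures and the sample words are made up for the example, and this is not what any particular catalog system does.

```python
import unicodedata

LIGATURE_FI = "\ufb01"  # U+FB01 LATIN SMALL LIGATURE FI


def fold_ligatures(text: str) -> str:
    """Hypothetical helper: map compatibility characters such as U+FB01
    back to their plain equivalents using NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


# "fig" spelled with the hard-coded ligature, plus two ordinary words.
words = ["fix", LIGATURE_FI + "g", "fact"]

# Plain code point order: U+FB01 sorts after every ASCII letter,
# so the ligature word lands at the end of the list.
naive = sorted(words)

# Sorting on the folded form gives the expected order while leaving
# the stored text (ligature included) untouched.
folded = sorted(words, key=fold_ligatures)

print(unicodedata.name(LIGATURE_FI))
print(naive)
print(folded)

# A spell checker or search comparing against "fig" fails on the raw
# text and succeeds on the folded text.
print(LIGATURE_FI + "g" == "fig", fold_ligatures(LIGATURE_FI + "g") == "fig")
```

NFKC also folds other compatibility characters (full-width forms, superscripts and so on), so it is better suited to building sort or search keys than to rewriting the stored text. Real alphabetization across languages would still need locale-aware collation on top of this.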