There simply isn't a single font that covers all languages and their
orthographic subtleties.  Arial Unicode MS and Code 2000 are as close as
you get.

What you often need is some way (over and above the Unicode characters
themselves) to specify the language that a text is in so that software can
choose an appropriate font.  In HTML you have the 'lang' attribute to
indicate to the browser the language used by the text within a specific
tag.  Then linguistically sophisticated software can select an appropriate
font and handle things like ligatures and combined forms appropriately.
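
As a rough sketch of that selection step (not how any particular browser
does it), the Python below walks an HTML fragment, tracks the 'lang'
attribute inherited from enclosing tags, and looks up a font for each run
of text.  The font table, the font names in it, and the pick_fonts helper
are all made up for illustration; only html.parser from the standard
library is real.

```python
from html.parser import HTMLParser

# Hypothetical language-to-font table; the font names are placeholders.
FONT_FOR_LANG = {
    "bn": "Example Bengali Font",
    "ar": "Example Arabic Font",
}
DEFAULT_FONT = "Arial Unicode MS"  # the compromise fallback


class LangFontPicker(HTMLParser):
    """Collect (font, text) runs, inheriting 'lang' from enclosing tags.

    Assumes every start tag has a matching end tag; unclosed tags such
    as a bare <br> would unbalance the stack in this simple sketch.
    """

    def __init__(self):
        super().__init__()
        self.lang_stack = [None]  # None means no language declared
        self.runs = []

    def handle_starttag(self, tag, attrs):
        # A tag without its own 'lang' inherits the enclosing one.
        lang = dict(attrs).get("lang") or self.lang_stack[-1]
        self.lang_stack.append(lang)

    def handle_endtag(self, tag):
        if len(self.lang_stack) > 1:
            self.lang_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        lang = self.lang_stack[-1]
        # Match on the primary subtag only, so "bn-IN" is treated as "bn".
        primary = lang.split("-")[0].lower() if lang else None
        self.runs.append((FONT_FOR_LANG.get(primary, DEFAULT_FONT), text))


def pick_fonts(html):
    """Return a list of (font name, text) pairs for an HTML fragment."""
    picker = LangFontPicker()
    picker.feed(html)
    picker.close()
    return picker.runs


runs = pick_fonts('<p>Title <span lang="bn">\u09ac\u09be\u0982\u09b2\u09be</span> end</p>')
for font, text in runs:
    print(font, "|", text)
```

Text with no declared language falls through to the catch-all font, which
is the same compromise you are left with when the software cannot switch
fonts at all.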

If your software isn't that sophisticated, you may have to specify a font
and literally code the ligatures in your text, for instance by using U+FB01
LATIN SMALL LIGATURE FI instead of the two characters 'f' and 'i' when
you want the ligature to appear.  But this may throw off things like
spelling and alphabetization, because the ligature is a different code
point from the letters it replaces.
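
A small Python illustration of that side effect, using only the standard
unicodedata module.  The word list and the fold helper are made up for the
example; NFKC compatibility normalization is one possible way to map the
ligature back to plain letters for comparison, not the only one.

```python
import unicodedata

LIGATURE_FI = "\uFB01"  # LATIN SMALL LIGATURE FI

# "file" typed with the hard-coded ligature, next to two ordinary words.
words = ["fit", LIGATURE_FI + "le", "fox"]


def fold(text):
    """Replace compatibility characters such as U+FB01 with plain letters."""
    return unicodedata.normalize("NFKC", text)


# A plain code-point sort puts the ligature word after every ASCII word.
naive_order = sorted(words)

# Sorting on the folded form restores the expected alphabetical order
# while leaving the stored (ligature) spelling untouched.
folded_order = sorted(words, key=fold)

# A literal search for "file" misses the ligature spelling unless the
# text is folded first.
found_raw = "file" in words
found_folded = "file" in [fold(w) for w in words]

print(unicodedata.name(LIGATURE_FI))
print(naive_order)
print(folded_order)
print(found_raw, found_folded)
```

The same folding would be needed before handing the text to a spelling
checker that only knows the two-letter form.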

In either of these cases, you will need a set of different fonts to cover
the range of languages you're using.  If your software won't let you
change fonts as languages change, then you're stuck with the compromise of
a 'unicode' font like Arial Unicode MS.

Chris Gray
Library Systems
University of Waterloo

On Thu, 12 Feb 2004, Jacobs, Jane W wrote:

> We've recently been doing some beta testing of a Unicode compliant software,
> which must for the time being, remain nameless.  The problem is that once
> you REALLY start to do a broad range of Unicode languages weird things crop
> up pretty fast.  One we've discovered is the limitations of Arial Unicode
> MS.  I ran across this as I was working on the ligatures. According to the
> Unicode standard they are "combining" However, if you look in many library
> catalogs, including ours,  you will see that they do not combine. At first I
> thought this was a system problem, but it looks like the trail leads all the
> way back to the font. Bengali is even worse. Combining consonants often
> refuse to combine
>
> The data I have is that The Arial Unicode MS font has provision for
> approximately 44,000 Unicode characters, not the full 100,000 characters
> available in Unicode. The shareware Code 2000 font resolves some problems,
> but has provision for approximately 34,000 characters: 12,000 less
> characters than Arial Unicode.
>
> Anybody got any better font ideas???
> JJ
>
>
> **Views expressed by the author do not necessarily represent those of the
> Queens Library.**
>
> Jane Jacobs
> Asst. Coord., Catalog Division
> Queens Borough Public Library
> 89-11 Merrick Blvd.
> Jamaica, NY 11432
>
> tel.: (718) 990-0804
> e-mail: [EMAIL PROTECTED]
> FAX. (718) 990-8566
>
