At 12:12 AM -0400 10/3/01, [EMAIL PROTECTED] wrote:
>In a message dated 2001-10-02 9:39:31 Pacific Daylight Time,
>[EMAIL PROTECTED] writes:
>
>> BTW, I'm not aware that anybody is revising their fonts to handle ZWJ this
>> way.
>
>Well, according to Unicode 3.1 (UAX #27) they should, right?

True. (*sigh*) Implicitly, however, UAX #27 allows that other mechanisms
for specifying

>> Anyway, there is a long-standing argument on this subject, and
>> unless I misremember the official position of the UTC, this approach
>> -- specifying ligation control in plain text -- is not considered the
>> best mechanism in Latin typography.
>
>I refer back to Michael Everson's two persuasive papers in which he proposed
>a zero-width ligator. It didn't matter to me whether a new ZWL character was
>introduced or whether ZWJ was overloaded, so long as the functionality became
>available. To paraphrase one of Michael's papers, ligation should not be
>considered a fancy-text function in Latin script if it is considered a
>plain-text function in Indic and other scripts, and Unicode does provide
>support for plain-text specification of ligation for Indic scripts.

Er, not entirely.

First of all, the UTC motion on using ZWJ for ligation actually specified
that this approach was not the best for Latin typography. For some reason,
that language didn't make it into UAX #27, but more detail on ZWJ and Latin
ligature control will probably be in Unicode 3.2.

Secondly, as with Latin type, the *full* set of potential ligatures in Indic
typography is pretty much open-ended, and the specific set available in any
given font is really up to the font designer. Unicode provides enough
information for the bare minimum set of ligation rules used in Indic
scripts, but unless I'm much mistaken, it does not provide full control for
an arbitrary font.

The use of ZWJ in Latin typography is really intended to provide the same
level of support. Where a difference of meaning is potentially present
depending on whether a ligature is formed or not -- which is the case in
some Latin-based languages -- ZWJ can be used to indicate the fact.
Similarly, in situations where a ligature must not be formed, ZWNJ can be
used to indicate the fact.

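To make the two cases concrete, here is a minimal Python sketch of what such
plain text looks like. The helper names request_ligature and forbid_ligature
are made up for this illustration; the only facts relied on are the code
points U+200D (ZERO WIDTH JOINER) and U+200C (ZERO WIDTH NON-JOINER).
Whether a ligature actually shows up still depends on the font and the
rendering engine.

```python
import unicodedata

ZWJ = "\u200D"   # ZERO WIDTH JOINER: asks for a ligature if the font has one
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: asks that no ligature be formed


def request_ligature(chars: str) -> str:
    """Hypothetical helper: put ZWJ between each pair of characters."""
    return ZWJ.join(chars)


def forbid_ligature(chars: str) -> str:
    """Hypothetical helper: put ZWNJ between each pair of characters."""
    return ZWNJ.join(chars)


if __name__ == "__main__":
    ct = request_ligature("ct")   # 'c', ZWJ, 't'
    fl = forbid_ligature("fl")    # 'f', ZWNJ, 'l'
    for label, s in (("ct", ct), ("fl", fl)):
        print(label, [unicodedata.name(c) for c in s])
```

Note that a three-letter sequence such as "ffi" needs two joiners, one
between each adjacent pair.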
It is certainly *not* the intention of the UTC that ZWJ be used everywhere
to turn ligatures on and off in Latin typography. There are a couple of
reasons that I can come up with straight off the top of my head.

#1. The ZWJ mechanism doesn't handle default ligatures well. For example,
in typesetting English, fi and fl ligatures are fairly standard, and most
fonts will use them by default. At the same time, it seems fairly ludicrous
to ask people typing English to insert ZWJ themselves (or have the system
insert it) between every "fi" and "fl" pair that occurs in the text.

#2. The ZWJ mechanism works well in the case where there is a particular
standard ligature which may or may not be present in a font, such as "ct".
In this case, the rendering engine will produce the ligature if it is
present in the font, and fall back to the separate letters if it is not.
That's fine for the standard ligatures whose presence may reasonably be
anticipated.

In real life, however, a font may have a large number of unusual ligatures.
In the paper I wrote on the subject for the UTC, I used Tekton Pro from
Adobe because it's such a font.

One of the standard fonts in Mac OS X is Zapfino from Linotype. It's based
on Hermann Zapf's handwriting and has thousands of glyphs even though it's
basically a Roman-only font. Among the glyphs are a large number of
unexpected ligatures: "pp", "th", even "Mrs." and "Co." Apple's
implementation of the font turns many (but not all) of these ligatures on by
default. The overall effect is a significant improvement in the appearance
of Zapfino text.

In any event, having the user insert ZWJ in plain text for these unexpected
ligatures, on the off-chance that someone is going to display it with
Zapfino and have the appropriate ligature set turned on, seems unreasonable.

#3. OK, so there are more than a couple reasons off the top of my head.
The way that virtually all programs that support ligatures handle turning
them on and off right now is for the user to select a range of text and,
through a menu item or other action, turn a particular set of ligatures on
or off. The ZWJ mechanism doesn't allow the type designer to group
ligatures in sets, and it would unreasonably increase the burden on
software that keeps the current UI. That is, if I want to turn on ffi
ligatures in my text by hand, I'd have to remember to put ZWJ in twice, and
if the software did it, it would have to scan the text, compare the contents
of the scan with the set of potential ligatures in the given ligature set in
the font, and either insert or remove ZWJ between every character pair as
appropriate.

>Of course ligation control is font-specific. That is why the ZWJ solution is
>elegant -- it falls back gracefully to the two (or three...) unligated glyphs
>in the event the ligature is unavailable in the font. This is still better
>than displaying a black box, which is how William Overington's private-use
>characters would appear in most fonts,

True. Encoding ligatures as characters is a bad thing.

>or forcing the user to incur the
>overhead of fancy text.

In what way is fancy text an unreasonable burden on the user? If anything,
plain text is becoming an increasingly rare beast except in source code.

--
John H. Jenkins
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://homepage.mac.com/jenkins/
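
For what it's worth, the scan-and-insert work described under #3 can be
sketched roughly as follows in Python. This is only an illustration under
stated assumptions: apply_ligature_set is a made-up name, the ligature set
is passed in as a plain list of strings (a real program would have to get it
from the font somehow), and all existing ZWJs are stripped first, which
would also destroy any ZWJ the author put in for reasons of meaning.

```python
from typing import Iterable

ZWJ = "\u200D"  # ZERO WIDTH JOINER


def apply_ligature_set(text: str, ligature_set: Iterable[str]) -> str:
    """Hypothetical sketch: rewrite text so that exactly the sequences in
    ligature_set have ZWJ between every adjacent pair of characters."""
    # Simplifying assumption: discard every ZWJ already in the text.
    plain = text.replace(ZWJ, "")
    # Try longer sequences first so "ffi" wins over "fi".
    # Sequences shorter than two characters have no pair to join.
    candidates = sorted(
        (lig for lig in ligature_set if len(lig) > 1),
        key=len,
        reverse=True,
    )
    out = []
    i = 0
    while i < len(plain):
        for lig in candidates:
            if plain.startswith(lig, i):
                out.append(ZWJ.join(lig))  # "ffi" gets two ZWJs
                i += len(lig)
                break
        else:
            # No ligature candidate starts here; copy the character as-is.
            out.append(plain[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    on = apply_ligature_set("office", ["fi", "ffi"])
    print(on.count(ZWJ))                       # number of joiners inserted
    print(apply_ligature_set(on, []) == "office")  # turning the set off again
```

Even in this toy form, every change to the selected ligature set means
rescanning and rewriting the text itself, which is the burden on software
that #3 is about.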

