>Indeed they would appear as a black box or something similar in most fonts.
>However, I feel that the availability of ligatured characters in a font at a
>specific official Unicode code point would be useful for the specific use of
>a person to be able to encode the ligature information directly, so that he
>or she may transcribe the typography of an eighteenth century printed book
>directly "metal type sort to unicode character" and print out the text.
Yes, this kind of thing should be possible, but in rich/styled text, not in plain text. Similarly, there are people out there who would like to encode manuscripts etc. in electronic form, but that is also more than should be expected of plain text.
>but I do feel
>that the ligatured character facilities should be available for use in
>appropriate circumstances.
Sure, but do those circumstances really require plain text?
> I feel that as their usefulness was such that
>ligatured characters could be cast in some fonts in metal type right up
>until the end of the mainstream use of metal type, then it is reasonable
>that the use of such ligatured characters could be continued indefinitely
>into the future using unicode. There may well be uses in desktop publishing
>for the typesetting of various decorative items.
And that can be done using technologies that are starting to become mainstream without requiring that those decorative items be directly encoded.
>running on Windows 95, that will continue to be in use for many years. As
>far as I know, Word 97 can only use ligatured characters such as ct if they
>are (1) encoded in a font and (2) the character is inserted into the
>document whenever required using Insert Symbol or using a short cut key set
>up from within Insert Symbol.
Other methods for insertion are possible (e.g. you could use Keyman to create an input method), but as far as encoding and fonts go for working with Word 97, that is correct. It should not be a requirement on any new technology, though, that advanced functionality be automatically supported in older products. That is simply too costly, and often just not possible. Do you expect the PC you bought in 1996 (probably a 200 MHz machine with 64 MB of RAM and a 2 GB drive) to do digital video editing? Probably not. Similarly, we shouldn't necessarily expect our 1996 software to support functionality being developed today.
You will likely respond that the comparison isn't valid because it would be technically very simple for Word 97 to handle a ct ligature if it were just encoded in Unicode. That's true. But the ct ligature is just a drop in a very big bucket that involves a number of complicating factors that aren't being considered.

For example, one user creates a document that contains "Wellington's victory over Napoleon" using a ct ligature, and another user creates a different document on the same topic but doesn't use a ligature. A third user trying to retrieve documents on the topic knows that there are two documents out there, but has no idea that one or the other might use a ligature and so be encoded differently. They just search for "victory", and they get only half the results they were expecting. Multiply this problem by the untold hundreds or thousands of different ligatures that might possibly be included.

In addition to this data retrieval scenario, consider various kinds of text support functionality, like case mapping or spell checking: how can someone write algorithms to deal with that decently when next month there may be new ligatures in the standard, creating geometrically-increasing options for how things can be represented? They can't. So, then, the logical question is whether a normalisation should be defined that folds the distinction between the two forms of "ct" (and likewise for all the other ligatures that have been added). But then we still have the problem that software can't be designed with any hope of stability, since we have no way of knowing what new ligatures (and hence new normalisations) might need to be supported tomorrow.
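The retrieval problem is not hypothetical: it can already be demonstrated with the Latin ligature presentation forms Unicode did include (U+FB00..U+FB06). A minimal Python sketch, using the fi ligature since a ct ligature is not encoded:

```python
import unicodedata

# U+FB01 is LATIN SMALL LIGATURE FI, one of the compatibility
# presentation forms already in Unicode (U+FB00..U+FB06).
doc_a = "the \ufb01nal victory"   # this author used the fi ligature
doc_b = "the final victory"       # this author typed plain f + i

# A naive substring search misses the ligatured document entirely.
print("final" in doc_a)  # False
print("final" in doc_b)  # True

# NFKC normalisation folds the distinction, because the ligature
# carries a compatibility decomposition to "f" + "i" -- but this
# only works for ligatures the standard already knows about.
print("final" in unicodedata.normalize("NFKC", doc_a))  # True
```

Every newly encoded ligature would force every search engine and text-processing library to pick up a new normalisation mapping before documents using it became findable again.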
In the big scheme of things, there is a simpler solution: allow ligatures to be handled using advanced typography technologies, which are being deployed *anyway* to support scripts like Arabic, etc. The cost is that these ligatures are not supported in Word 97, but in 10 years virtually nobody will still be using Word 97 anyway. On the other hand, the normalisation problems the other approach would create would still be with us 65 years from now. Asking for something to work in Word 97 is being somewhat short-sighted.
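For illustration, this is the sort of thing those typography technologies do. In OpenType feature syntax, a ligature is just a glyph substitution applied at rendering time; the stored plain text remains c + t, so searching, sorting and spell checking are unaffected. (The glyph names below are hypothetical; a font designer chooses their own.)

```
feature liga {
    # Draw the single glyph "c_t" wherever "c" is followed by "t".
    # Only the rendered glyphs change; the underlying text does not.
    sub c t by c_t;
} liga;
```

The same mechanism scales to the hundreds of historical ligatures found in eighteenth-century printing without adding a single code point to the standard.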
>Perhaps many people will have seen open access
>rooms in colleges where there are a number of newer machines and then
>gradually as one moves to the end of the room there are all sorts of older
>machines with older software being fully utilized by students preparing a
>paper.
I appreciate the concerns regarding older systems (e.g. Word 97). I have to deal with that as I try to support people in our organisation around the globe, since they generally don't have budgets that allow them to update systems very often, and they also have to work with local colleagues who are on even tighter budgets. As a result, I have assumed that I need to find solutions that let people work with non-Roman scripts on down-level systems, and if we can accomplish that, there will be people who benefit from it.

I've been surprised, though, to learn that a lot more people will update their systems than I expected -- they'll do it if there is a reason to. Once we have apps that work with Unicode and advanced typography technologies that will allow them to work with the non-Roman scripts they use, they'll make the change to Windows XP and Office.Net if that's what it takes to obtain that functionality. So, the fact that there are still machines out there with Word 97 installed doesn't keep me from concluding that it's OK if support for things like ligatures and other aspects of advanced typography works only on newer systems.
>I would mention in passing, for completeness, the possibility of having a ct
>character as a bitmap
Yes, that would be more awkward than it's worth. You can always create a symbol-encoded font that contains the ligatures you want to use. (On a Windows system, that will in effect encode them in the PUA range F020-F0FF; that range is shared with other symbol sets, which nobody here will object to. Note, however, that you'll lose some functionality -- you won't be able to spell check, for example.)
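To see why the symbol-font/PUA route loses text functionality: Private Use code points carry no character properties at all, so there is nothing for a spell checker, case mapper, or search engine to work with. A small Python sketch (the particular code point is arbitrary):

```python
import unicodedata

# Suppose a symbol font maps its ct-ligature glyph to U+F020
# in the Private Use Area, as Windows symbol fonts in effect do.
ct = "\uf020"

# Its general category is "Co" (Private Use): no name, no case
# mapping, no decomposition -- nothing software can reason about.
print(unicodedata.category(ct))             # Co
print(unicodedata.name(ct, "<unnamed>"))    # <unnamed>
print(unicodedata.normalize("NFKC", ct))    # unchanged: nothing folds
```

The glyph displays fine on the one machine with the font installed, but to every other process the text is opaque.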
>A designating of certain characters as being quaint
>characters might perhaps be a way out of the problem and that thus ct and
>various long s ligatures could be defined as quaint characters such that
>they have unique official unicode positions yet are outside of regular usage
>where database sorting might be needed. Does that solve the problem of
>including them as presentation forms?
IMHO, no, and I'm inclined to respond that I haven't been at all convinced there is a problem that encoding them as presentation forms is expected to solve. The fact that Word 97 can't support a ct ligature without it being directly encoded is not, IMHO, a serious problem, whereas the potential implications of including a ct ligature (and others) in Unicode are.
- Peter
---------------------------------------------------------------------------
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>