Patrick said:

> > In this case, I think it's important to be picky because there are
> > no current Unicoding practices for Phoenician.
>
> You may mean that the Unicode book does not document how Phoenician (or
> Paleo-Hebrew) may be encoded. This is not to say that no one is using
> Unicode to encode Paleo-Hebrew texts.
             ^^^^^^ represent

I like to distinguish this, because the whole notion of what it means to "encode a text" tends to derail the discussion immediately.
The Unicode Standard *encodes* abstract characters. There are many potential abstract characters, but one of the general principles used is that each significant "letter" (grapheme) from a *script* will be encoded once as a character in the standard. That, of course, begs the question of identifying the "script" and its exact repertoire of "letters". The identification of the "script" is what the Phoenician argument has been about, since there is no serious question about the repertoire of "letters" for it.

Once a repertoire of abstract characters has been *encoded* in the Unicode Standard, those encoded characters can then be used to *represent* the plain text content of documents. This is deliberately different from talking about "encoding the text", because people don't have common understandings about what that means, and often expect various aspects of format and appearance to also be "encoded" -- hence the way these discussions tend to veer off into ditches.

Now returning to Patrick's statement and substituting for a different unencoded script:

> the Unicode standard does not document how *Avestan*
> may be encoded. This is not to say that no one is using
> Unicode to represent *Avestan* texts.

Also true, right? Or...

> the Unicode standard does not document how *Tifinagh*
> may be encoded. This is not to say that no one is using
> Unicode to represent *Tifinagh* texts.

O.k., I guess you can see that this particular argument is not going to go anywhere. Any script which is not currently encoded in the standard can be (and probably is) represented *somehow* by Unicode characters, either via PUA or transliteration or some other arbitrary intermediate encoding of entities. That it is (or could be) so represented has little or no bearing on the question of whether the script in question is or is not distinct enough from some already encoded but historically related script to warrant a distinct encoding as a "script" in the Unicode sense.

John Hudson asked, again:

> My question, again, is whether there is a need for the plain
> text distinction in the first place?

And I claim that there is no final answer for this question. We simply have irresolvable differences of opinion, with some asserting that it is self-evident that there is such a need, and others asserting that it is ridiculous to even consider encoding Phoenician as a distinct script, and that there is no such need.

My own take on this seemingly irreconcilable clash of opinion is that if *some* people assert a need (and if they seem to be reasonable people instead of crackpots with no demonstrable knowledge of the standard and of plain text) then there *is* a need. And that people who assert that there is *no* need are really asserting that *they* have no need and are making the reasonable (but fallacious) assumption that since they are rational and knowledgeable, the fact that *they* have no need demonstrates that there *is* no need.

If such is the case, then there *is* a need -- the question then just devolves to whether the need is significant enough for the UTC and WG2 to bother with it, and whether, even if the need is met by encoding of characters, anyone will actually implement any relevant behavior in software or design fonts for it. In my opinion, Phoenician as a script has passed a reasonable need test, and has also passed a significant-enough-to-bother test.

Note that these considerations need to be matters of reasonableness and appropriateness. There is no absolutely correct answer to be sought here.
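As a concrete aside on the "PUA or transliteration" point above, here is a minimal Python sketch of those two ad-hoc representations. The letter names, the PUA assignments, and the choice of Hebrew as the masquerade target are illustrative assumptions only, not any established convention:

    # Two ad-hoc ways to carry a not-yet-encoded script in Unicode plain text.
    # All assignments here are hypothetical, for illustration only.

    # 1. Private Use Area: assign arbitrary code points to each letter.
    PUA_BASE = 0xE000  # start of the BMP Private Use Area
    LETTERS = ["alef", "bet", "gimel", "dalet", "he"]
    pua_map = {name: chr(PUA_BASE + i) for i, name in enumerate(LETTERS)}

    # 2. Transliteration / masquerading: reuse an already-encoded,
    #    historically related script (Hebrew, U+05D0 onward) and rely on
    #    a special font to show the glyph shapes actually wanted.
    hebrew_map = {name: chr(0x05D0 + i) for i, name in enumerate(LETTERS)}

    word = ["bet", "alef"]  # some word, spelled letter by letter

    pua_text = "".join(pua_map[l] for l in word)
    masq_text = "".join(hebrew_map[l] for l in word)

    # Both strings are perfectly valid Unicode and can be stored and
    # exchanged, but neither tells a receiving application that the
    # content is anything other than "private use" or Hebrew text.
    print([hex(ord(c)) for c in pua_text])   # ['0xe001', '0xe000']
    print([hex(ord(c)) for c in masq_text])  # ['0x5d1', '0x5d0']

Either approach produces data that round-trips through Unicode-aware software, but the script identity lives entirely in out-of-band conventions (a private agreement or a particular font), which is exactly why the existence of such representations says nothing about whether a distinct encoding is warranted.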
A character encoding standard is an engineering construct, not a revelation of truth, and we are seeking solutions that will enable software handling text content and display to do reasonable things with it at reasonable costs.

If you start looking for absolutes here, it is relatively easy to apply reductio ad absurdum. In an absolute sense, there is no "need" to encode *any* other script, because they can *all* be represented by one or another transliteration scheme or masquerading scheme and be rendered with some variety or other of symbol font encoding. After all, that's exactly what people have been doing to date already for them -- or they are making use of encodings outside the context of Unicode, which they could go on using, or they are making use of graphics and facsimiles, and so on. The world wouldn't end if all such methods and "hacks" continued in use.

The question is rather, given the fundamental nature of the Unicode Standard as enabling text processing for modern software, whether it is cost-effective and *reasonable* to provide a Unicode encoding for one particular script or another, unencoded to date, so as to maximize the chances that it will be handled more easily by modern software in the global infrastructure and to minimize the costs associated with doing so.

*That* is the test which should be applied when trying to make decisions about which of the remaining varieties of unencoded writing systems rise to the level of distinctness, utility, and cost-effectiveness to be encoded as another script in the standard.

--Ken