At 14:18 -0800 2003-12-21, Peter Kirk wrote:

So, "KA is KA is KA is KA and BHA is BHA is BHA is BHA", and ALEF is ALEF is ALEF is ALEF, except when it comes to comparing them and collating them?

In the context in which I was speaking, yes. The Indic KAs have a one-to-one relationship, historically. We know this. Likewise the Semitic ALEFs. That doesn't mean that we should unify the Indic scripts all into one (which we haven't) or that we should unify all the Semitic scripts into one.


If you have a multiscript database for Pali and you need to search all the KAs across scripts, you will have to have a local engine to do so. The scripts are distinct as encoded in the Unicode standard.
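
As a rough illustration of what such a local engine might do, here is a minimal Python sketch. The table KA_FOLD and the functions fold and contains_ka are hypothetical names made up for this example, not part of Unicode or of any library, and the table covers only a handful of KA letters:

```python
# Hypothetical folding table: a few KA letters from different scripts,
# all mapped to one script-neutral key. A real engine would need a
# complete table for every letter and every script in the database.
KA_FOLD = {
    "\u0915": "KA",  # DEVANAGARI LETTER KA
    "\u0995": "KA",  # BENGALI LETTER KA
    "\u0D9A": "KA",  # SINHALA LETTER ALPAPRAANA KAYANNA
    "\u0E01": "KA",  # THAI CHARACTER KO KAI
    "\u1000": "KA",  # MYANMAR LETTER KA
}

def fold(text):
    """Replace each known KA letter with the neutral key; keep other characters."""
    return [KA_FOLD.get(ch, ch) for ch in text]

def contains_ka(text):
    """True if the text contains a KA from any script in the table."""
    return "KA" in fold(text)

records = ["\u0915\u092E", "\u0D9A", "abc"]
matches = [r for r in records if contains_ka(r)]
```

The encoded characters stay distinct; the equivalence lives entirely in the local table.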

If you want to sort such a database, illegible as the result would be, you can do it, with a local tailoring for your specific purpose. The default table in the UCA will not interfile them, however, because it orders the scripts sequentially (apart from digits, which are treated differently because of their particular properties). I'm not saying you can't tailor. You can. I'm saying we're not going to change what we are doing in the UCA and ISO/IEC 14651 because it distinguishes scripts on purpose.
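
A local tailoring of that kind could be sketched roughly as follows in Python. This is not the UCA algorithm: plain code point order stands in for the default script-by-script ordering, and the table FOLD and the function tailored_key are hypothetical names invented for this example, covering only four letters:

```python
# Hypothetical tailoring table: letter -> (shared primary key, script rank).
FOLD = {
    "\u0915": ("k", 0),  # DEVANAGARI LETTER KA
    "\u0995": ("k", 1),  # BENGALI LETTER KA
    "\u092E": ("m", 0),  # DEVANAGARI LETTER MA
    "\u09AE": ("m", 1),  # BENGALI LETTER MA
}

def tailored_key(word):
    """Compare by shared letter identity first, by script only as a tie-breaker."""
    pairs = [FOLD.get(ch, (ch, 9)) for ch in word]
    primary = tuple(p[0] for p in pairs)
    secondary = tuple(p[1] for p in pairs)
    return (primary, secondary)

words = ["\u09AE", "\u0915", "\u092E", "\u0995"]

# Stand-in for the default behaviour: each script's letters stay together.
by_script = sorted(words)

# Tailored: the two KAs interfile, then the two MAs.
interfiled = sorted(words, key=tailored_key)
```

Here by_script keeps Devanagari ahead of Bengali throughout, while interfiled puts both KAs before both MAs.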

Of course if one collates together a mixture of Latin script texts in very different fonts and styles one can get an outrageously messy list which is illegible to those who don't know all the different fonts.

I do not consider the Semitic nodes we are considering for eventual encoding to be font variants of each other.


But that is hardly the point. Anyway, I don't see the main purpose of collation as producing lists of legible words, but rather as matching in text and database searches.

Which you as an expert can do with special tools.


Michael, do you realise that I am trying to offer you an olive branch, and all I get is it thrown back in my face, nicely by you but rudely by someone else offlist.

No, I didn't. In the first place I didn't know that we were at war. In the second place, all I'm telling you is that we have practices which are generic to certain levels of our work, and we are not likely to deviate from those practices. That's not throwing something in your face. That's telling you what's what. We had a similar discussion about generic practice when we were putting Runic into the UCA. Swedish specialists wanted a Latin-based order. That's specific. Everyone else, though, would want the native Futhark order. The Japanese NB, which doesn't really worry about Runes much, thought that the generic order should be the basic historical one.


I think that it just might be acceptable to encode the various ancient Semitic scripts separately if they are unified for collation.

You can tailor a unified collation for them or indeed for anything you like.


But if you are saying that it must be all or nothing, I will continue to fight on behalf of the users of these scripts for all of what they want, rather than what you have apparently unilaterally (on the basis of a book which describes glyph shape differences rather than the systematic differences which really distinguish scripts) decided that they ought to want and have written into your Roadmap.

*I* have not decided on the basis of *one* book, thanks very much. Nor have I done anything unilaterally. Nor have we made decisions which aren't based on our normal working practice.


I'm not interested in worrying about these bits of the Roadmap right now. If I work on anything over the Christmas, it should be N'Ko. Then there is more work on Cuneiform. Then work on Manichaean and Avestan. Then I've got to prepare for the PDAM comments. This sniping, even when nice, isn't doing you any good, nor me. Can we drop this for a while, please?

Michael

(I am sorry you had rude private mail from someone. I also had private mail from someone which suggested that I didn't know anything about Indic scripts, while saying a whole lot of other rather incomprehensible things about ISCII and Unicode. Better forgotten.)
