[EMAIL PROTECTED] <[EMAIL PROTECTED]>:

> The problem is, every character in Unicode, all 70,000 of them, has a
> distinct set of properties. UnicodeData.txt is basically a listing of
> those properties. If it is a copyrightable work, I see no way for a text
> processing program to conform to Unicode without using a derivative of
> that copyrighted work. Likewise, I'd bet that file or some derivative of
> it is embedded in both Perl and Python - you can't reasonably handle
> Unicode characters without it.

This is like the word list (spelling dictionary) discussion from a few
weeks ago. Intuitively, I would guess you could make a program conform
to Unicode without using a derivative of UnicodeData.txt. Copyright
applies to the expression of the facts, not the facts themselves, so
you can still write your own, original description of how Unicode
characters are handled.

> We could always pony up the $12,000 (or $1200 for an associate membership)
> and become a member of Unicode and complain about this from the inside.

An alternative approach would be to set up an alternative organisation
defining an alternative universal character set that is virtually
identical to Unicode, but with the documents describing it being new,
original works under free licences. Don't worry too much if there are a
few accidental differences. You could then approach the Unicode
Consortium and say that you are keen to cooperate and keep the two
standards aligned. That might be a good negotiating position because
the Unicode Consortium people really do want there to be a single,
universal character set, I think.

(At one point Unicode and UCS were two separate standards that were
deliberately maintained so as to be identical. I don't know whether
that is still the case, but there might already be an alternative to
UnicodeData.txt.)

Edmund

