RE: Greek letter LAMDA?

2010-06-02 Thread Peter Constable
Pollution on this list has waxed and ebbed constantly over the years. When it has taken eerie proportions, things have usually gotten sorted out before long one way or another. Innocuous questions about name stability are nowhere near as bad as acute flame wars over the merits of encoding 22-letter

Re: Greek letter LAMDA?

2010-06-02 Thread Asmus Freytag
On 6/1/2010 6:04 PM, Mark Crispin wrote: I don't think that the unicode list should be used for the type of questions that have polluted it recently. That list unicode@unicode.org is open for general questions. It has no formal standing as far as the business of the Consortium is concerned, and

Re: Least used parts of BMP.

2010-06-02 Thread Asmus Freytag
On 6/1/2010 8:04 PM, Kannan Goundan wrote: I'm trying to come up with a compact encoding for Unicode strings for data serialization purposes. The goals are fast read/write and small size. Why not use SCSU? You get the small size and the encoder/decoder aren't that complicated. You get the

Re: Least used parts of BMP.

2010-06-02 Thread Kannan Goundan
On Tue, Jun 1, 2010 at 23:30, Asmus Freytag asm...@ix.netcom.com wrote: Why not use SCSU? You get the small size and the encoder/decoder aren't that complicated. Hmm... I had skimmed the SCSU document a few days ago. At the time it seemed a bit more complicated than I wanted. What's nice

Re: Greek letter LAMDA?

2010-06-02 Thread Michael Everson
On 1 Jun 2010, at 22:50, Kenneth Whistler wrote: Note that: 1038D;UGARITIC LETTER LAMDA;Lo;0;L;N; is a distinct issue, and would no doubt still have been spelled LAMDA, even if all the Greek characters in the standard had been spelled LAMBDA. *waves hand and takes blame*

Re: Greek letter LAMDA?

2010-06-02 Thread Michael Everson
On 2 Jun 2010, at 00:14, Mark Crispin wrote: Is it really necessary to have this sort of pedagogical discussions on the Unicode list? Even I'm not so curmudgeonly, Mark. Live with it and use the delete key. Cheerily, Michael Everson * http://www.evertype.com/

A question about user areas

2010-06-02 Thread jander...@talentex.co.uk
I am brewing on some plans for making a font with glyphs for ancient Chinese characters and even for some of the more dubious glyphs; I assume that there is no standard area in the Unicode standard for these; so where can I put them so they are least likely to clash with others?

Re: A question about user areas

2010-06-02 Thread Vinodh Rajan
Put them in the PUA - Private Use Area. http://unicode.org/charts/PDF/UE000.pdf If there are similar projects that encode Ancient Characters in the PUA, maybe you can coordinate with them. Similar to the ConScript Unicode Registry. V On Wed, Jun 2, 2010 at 2:30 PM, jander...@talentex.co.uk
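
For readers following the Private Use suggestion above, here is a minimal sketch of how a font maker might check that planned code point assignments really fall inside a Private Use range. The three ranges are the ones defined by the Unicode Standard; the helper names (is_private_use, outside_private_use) are made up for this illustration and do not come from the thread.

```python
# The three Private Use ranges defined by the Unicode Standard.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(code_point: int) -> bool:
    """Return True if code_point lies in one of the Private Use ranges."""
    return any(low <= code_point <= high for low, high in PRIVATE_USE_RANGES)


def outside_private_use(code_points) -> list:
    """Return the planned assignments that are NOT private use (hypothetical helper)."""
    return [cp for cp in code_points if not is_private_use(cp)]


if __name__ == "__main__":
    planned = [0xE000, 0xE001, 0x4E00, 0xF0000]
    for cp in outside_private_use(planned):
        print(f"U+{cp:04X} is not a private use code point")
```

Planes 15 and 16 offer far more room than the BMP range, which may matter for a large set of ancient glyphs, but whether a given rendering stack handles supplementary-plane private use characters well is something to test rather than assume.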

Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread William_J_G Overington
Thank you for replying. On Tuesday 1 June 2010, John H. Jenkins jenk...@apple.com wrote: First of all, as Michael says, this isn't character encoding. Well, it is a collection of portable interpretable object code items encoded within a character encoding as if the items were characters.

Re: A question about user areas

2010-06-02 Thread vanisaac
From: jander...@talentex.co.uk I am brewing on some plans for making a font with glyphs for ancient Chinese characters and even for some of the more dubious glyphs; I assume that there is no standard area in the Unicode standard for these; so where can I put them so they are least likely to

Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread Andrew West
On 2 June 2010 10:51, William_J_G Overington wjgo_10...@btinternet.com wrote: I know of no reason to think that a person skilled in the art would be unable to write an iPad app to receive a program written in the portable interpretable object code arriving within a Unicode text message and

Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread Michael Everson
On 2 Jun 2010, at 10:51, William_J_G Overington wrote: Well, that might well be the case historically, yet then the emoji were invented and they were encoded. The emoji existed at the time that they were encoded, yet they did not exist at the time that the standards were started. The

RE: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread Erkki I. Kolehmainen
I cannot but agree. Sincerely, Erkki I. Kolehmainen Tilkankatu 12 A 3, FI-00300 Helsinki, Finland Puh. (09) 4368 2643, 0400 825 943; Tel. +358 9 4368 2643, +358 400 825 943 -Original Message- From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of Michael

Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread vanisaac
From: William_J_G Overington (wjgo_10...@btinternet.com) On Tuesday 1 June 2010, John H. Jenkins jenk...@apple.com wrote: First of all, as Michael says, this isn't character encoding. Well, it is a collection of portable interpretable object code items encoded within a character

Emoji (was: Re: Preparing a proposal for encoding a portable interpretable object code into Unicode)

2010-06-02 Thread Doug Ewell
Van Anderson vanis...@boil.afraid.org wrote: Emoticons (as emoji) are exchanged as plain text. The only consideration that changed was whether they should be considered as markup or not. Eventually, it became clear that they no longer do classify as markup, but as plain text. This was not a

Re: A question about user areas

2010-06-02 Thread Doug Ewell
Van Anderson vanisaac at boil dot afraid dot org wrote: Look up the Conscript Unicode Registry if you want to examine a pseudo-standardized Private Use agreement. A simple mapping table will enable you to equate your private use standard to the officially encoded forms of these scripts, when

Re: Least used parts of BMP.

2010-06-02 Thread Doug Ewell
Kannan Goundan kannan at cakoose dot com wrote: Hmm... I had skimmed the SCSU document a few days ago. At the time it seemed a bit more complicated than I wanted. SCSU decoders are not complicated, and with encoders, you get to make the decision between simplicity and high performance.

Re: Least used parts of BMP.

2010-06-02 Thread David Starner
On Tue, Jun 1, 2010 at 11:04 PM, Kannan Goundan kan...@cakoose.com wrote: I'm trying to come up with a compact encoding for Unicode strings for data serialization purposes.  The goals are fast read/write and small size. The plan: 1. BMP code points are encoded as two bytes (0x-0x,

re: Least used parts of BMP.

2010-06-02 Thread Philippe Verdy
Resending (from Gmail), because the Unicode list rejected the SMTP server of my mail provider (Spamcop is defective). Nothing forbids you to create new serializations of Unicode; you may even create it so that it will be a conforming process (meaning that it will preserve *all* valid Unicode

RE: Greek letter LAMDA?

2010-06-02 Thread John Dlugosz
Robert Abel noted: Note that as of 1993, the only LAMDA or LAMBDA characters in the standard were: 039B;GREEK CAPITAL LETTER LAMDA;Lu;0;L;N;GREEK CAPITAL LETTER LAMBDA;;;03BB; 03BB;GREEK SMALL LETTER LAMDA;Ll;0;L;N;GREEK SMALL LETTER LAMBDA;;039B;;039B 019B;LATIN SMALL LETTER

RE: Greek letter LAMDA?

2010-06-02 Thread John Dlugosz
Perhaps a better approach would be to establish a Frequently Asked Questions list on the Unicode Web site. Oh, wait. -- Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s FWIW, I checked the FAQ first.

Re: A question about user areas

2010-06-02 Thread Philippe Verdy
vanis...@boil.afraid.org wrote: From: Doug Ewell (d...@ewellic.org) I'm not sure how much longer we should continue to wait for Tengwar and Cirth. I hear Michael talking about meeting with the Tolkienists every once in a while, so I can only assume that it is proceeding in some way. I

Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread John H. Jenkins
On Jun 2, 2010, at 3:51 AM, William_J_G Overington wrote: I know of no reason to think that a person skilled in the art would be unable to write an iPad app to receive a program written in the portable interpretable object code arriving within a Unicode text message and then for the

Re: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread John H. Jenkins
On Jun 2, 2010, at 3:51 AM, William_J_G Overington wrote: Unicode and ISO/IEC 10646 are attempts to solve a basic, simply-described problem: provide for a standardized computer representation of plain text written using existing writing systems. Well, that might well be the case

Re: IS UNICODE a STANDRAD ?

2010-06-02 Thread Tulasi
The trademarked name does not use ALL CAPS. Is Unicode a registered trademark then? If yes where does it say so? Both refer to the same organization. Usually, you would use The Unicode Consortium. Are you suggesting Incorporate is equal to Consortium in this case? The usage is grammatical.

RE: Preparing a proposal for encoding a portable interpretable object code into Unicode (from Re: IUC 34 - call for participation open until May 26)

2010-06-02 Thread Peter Constable
This is a bad idea. The best way to make it go away is to just stop discussing it. Peter -Original Message- From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of William_J_G Overington Sent: Wednesday, June 02, 2010 2:51 AM To: Unicode Discussion; John H.

Re: A question about user areas

2010-06-02 Thread John H. Jenkins
On Jun 2, 2010, at 3:49 AM, Vinodh Rajan wrote: If there are similar projects that encode Ancient Characters in PUA, may be you can co-ordinate with them. Similar to the ConScript Unicode Registry. There is a proposal for Old Hanzi being worked on by the IRG. You can peruse the IRGs

RE: A question about user areas

2010-06-02 Thread Shawn Steele
Anyway, most existing supporters of Tengwar and Cirth (also Klingonists) still use some transliteration Transliterability shouldn't be a factor; many of the scripts in Unicode have been transliterated (like Latin). Perhaps if it was only transliterated and the script was never used (but

RE: IS UNICODE a STANDRAD ?

2010-06-02 Thread Erkki I. Kolehmainen
Sarasvati? I'd personally wish to see you act... Regards, Erkki Erkki I. Kolehmainen Tilkankatu 12 A 3, FI-00300 Helsinki, Finland Puh. (09) 4368 2643, 0400 825 943; Tel. +358 9 4368 2643, +358 400 825 943 -Original Message- From: unicode-bou...@unicode.org

Re: IS UNICODE a STANDRAD ?

2010-06-02 Thread Sarasvati
Dear list members, This is your official notification that this thread is now terminated. The discussions of 3rd party font IP and trademark status are out of scope and unlikely to result in enlightening discussion here. Regards, -- Sarasvati On 6/2/2010 10:00 AM, Erkki I. Kolehmainen

RE: Greek letter LAMDA?

2010-06-02 Thread Jonathan Rosenne
Although this mail was not addressed to me, I did read it. Sue me. Jony -Original Message- From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf Of John Dlugosz Sent: Wednesday, June 02, 2010 5:03 PM Cc: unicode@unicode.org Subject: RE: Greek letter LAMDA?

Re: Greek letter LAMDA?

2010-06-02 Thread Asmus Freytag
On 6/2/2010 11:46 AM, Jonathan Rosenne wrote: Although this mail was not addressed to me, I did read it. Sue me. The terms of use for the Unicode mail list essentially state that these types of boilerplate are null and void as far as Unicode is concerned. You will find the following in

RE: Greek letter LAMDA?

2010-06-02 Thread Kenneth Whistler
Note that as of 1993, the only LAMDA or LAMBDA characters in the standard were: 039B;GREEK CAPITAL LETTER LAMDA;Lu;0;L;N;GREEK CAPITAL LETTER LAMBDA;;;03BB; 03BB;GREEK SMALL LETTER LAMDA;Ll;0;L;N;GREEK SMALL LETTER LAMBDA;;039B;;039B 019B;LATIN SMALL LETTER LAMBDA WITH

RE: Greek letter LAMDA?

2010-06-02 Thread vanisaac
From: Kenneth Whistler (k...@sybase.com) [snip] I expect that even this explanation will not satisfy those who think that oddities like this should not exist in character names. But that is just the nature of the historical development of big standards like the Unicode Standard when you

RE: Greek letter LAMDA?

2010-06-02 Thread John Dlugosz
If anyone can null and void it, I wonder why companies bother to put such things in people's outgoing mail. I would have thought they could come up with a proper net-etiquette version, but they just don't care. From: Asmus Freytag [mailto:asm...@ix.netcom.com] Sent: Wednesday, June 02, 2010

Tengwar and Cirth (was: Re: A question about user areas)

2010-06-02 Thread Kenneth Whistler
I'm not sure how much longer we should continue to wait for Tengwar and Cirth. Three words: Squeaky wheel -- grease. Don't expect this to just happen. The corporate members of the Unicode Consortium are mostly concerned about economically significant sets of characters that impact their

Re: Greek letter LAMDA?

2010-06-02 Thread Asmus Freytag
On 6/2/2010 3:28 PM, John Dlugosz wrote: If anyone can “null and void” it, I wonder why companies bother to put such things in people’s outgoing mail. I would have thought they could come up with a proper net-etiquette version, but they just don’t care. These things are bogus, because they

Re: Least used parts of BMP.

2010-06-02 Thread Kannan Goundan
Thanks to everyone for the detailed responses. I definitely appreciate the feedback on the broader issue (even though my question was very narrow). I should clarify my use case a little. I'm creating a generic data serialization format similar to Google Protocol Buffers and Apache Thrift.

Re: Least used parts of BMP.

2010-06-02 Thread Asmus Freytag
SCSU is a pass-through for ASCII, plus it handles the common mix of ASCII plus 96 local characters (Latin-1, Greek, Cyrillic, Thai, etc) really fast. Go look at the sample code. If you take that as starting point for optimization, I think you'll be fine.

Re: Least used parts of BMP.

2010-06-02 Thread Mark Davis ☕
An alternative that I've used is: - Serialize every unsigned integer as a sequence of 7 bits, with the top bit off for all but the last one. - For signed integers, shift left by 1 bit, then invert if the original was negative, then serialize as unsigned. - Serialize a string as an
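
The integer part of the scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the poster's actual code: the snippet does not say in which order the 7-bit groups are written, so most-significant-first is assumed here, and all function names are invented for this example. The string step is cut off in the archive and is not reconstructed.

```python
from typing import Tuple


def encode_unsigned(value: int) -> bytes:
    """Write value as 7-bit groups; the top bit is set only on the last byte."""
    if value < 0:
        raise ValueError("value must be non-negative")
    groups = []
    while True:
        groups.append(value & 0x7F)
        value >>= 7
        if value == 0:
            break
    groups.reverse()      # assumption: most significant group first
    groups[-1] |= 0x80    # top bit off for all but the last byte
    return bytes(groups)


def decode_unsigned(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Read one integer starting at pos; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:   # top bit marks the final byte
            return value, pos


def signed_to_unsigned(value: int) -> int:
    """Shift left by one bit, then invert if the original was negative."""
    shifted = value << 1
    return ~shifted if value < 0 else shifted


def unsigned_to_signed(value: int) -> int:
    """Inverse of signed_to_unsigned: the low bit records the original sign."""
    return ~(value >> 1) if value & 1 else value >> 1


if __name__ == "__main__":
    for n in (0, 1, -1, 127, 128, -300, 0x10FFFF):
        packed = encode_unsigned(signed_to_unsigned(n))
        unpacked, _ = decode_unsigned(packed)
        print(n, packed.hex(), unsigned_to_signed(unpacked))
```

The sign mapping keeps small negative numbers small (-1 becomes 1, -2 becomes 3), so they still fit in a single byte.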

Re: Least used parts of BMP.

2010-06-02 Thread Michael D'Errico
If you want a really fast alternate encoding, you could encode all of Unicode in at most 3 bytes. Use the high bit as a continuation bit and the lower 7 bits as the data. ASCII gets passed through unchanged. For code points between U+0080 and U+3FFF, split the value into the high 7 bits and
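
A rough sketch of the three-byte idea described above, for illustration only. The archived snippet is cut off before it says how the groups are ordered, so this assumes the high group comes first and that the continuation bit is set on every byte except the last, which is what lets ASCII pass through unchanged. The function names are invented, and surrogate code points are not rejected.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one code point in 1 to 3 bytes, 7 data bits per byte."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if cp < 0x80:                      # ASCII: one byte, unchanged
        return bytes([cp])
    if cp < 0x4000:                    # U+0080..U+3FFF: two bytes
        return bytes([0x80 | (cp >> 7), cp & 0x7F])
    return bytes([                     # U+4000..U+10FFFF: three bytes
        0x80 | (cp >> 14),
        0x80 | ((cp >> 7) & 0x7F),
        cp & 0x7F,
    ])


def encode_string(text: str) -> bytes:
    """Encode every code point of text in sequence."""
    return b"".join(encode_code_point(ord(ch)) for ch in text)


def decode_string(data: bytes) -> str:
    """Decode bytes produced by encode_string."""
    chars = []
    value = 0
    for byte in data:
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:            # high bit clear: last byte of this code point
            chars.append(chr(value))
            value = 0
    return "".join(chars)


if __name__ == "__main__":
    sample = "A\u03bb\u4e2d\U0001038D"
    packed = encode_string(sample)
    print(packed.hex(), decode_string(packed) == sample)
```

Three bytes of 7 bits give 21 bits, which is just enough for U+10FFFF. Unlike UTF-8, a trailing byte here can look like an ASCII byte, so the format is not safe for byte-wise searching.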

Re: Least used parts of BMP.

2010-06-02 Thread Doug Ewell
Michael D'Errico mike dash list at pobox dot com wrote: If you want a really fast alternate encoding, you could encode all of Unicode in at most 3 bytes. Use the high bit as a continuation bit and the lower 7 bits as the data. ASCII gets passed through unchanged. This is essentially what

Re: Least used parts of BMP.

2010-06-02 Thread Kannan Goundan
On Wed, Jun 2, 2010 at 21:43, Doug Ewell d...@ewellic.org wrote: If you want a really fast alternate encoding, you could encode all of Unicode in at most 3 bytes.  Use the high bit as a continuation bit and the lower 7 bits as the data. ASCII gets passed through unchanged. This is

Non-Vedic Sarasvati

2010-06-02 Thread Tulasi
Probably closing the thread with answers to the questions would have been a better approach, instead of terminating so. Fyi, Sarasvati is the Vedic goddess of knowledge and ingenuity, and the goddess made such knowledge and ingenuity available for others but never implemented on Her own. pracoditaa yena