Pollution on this list has waxed and waned constantly over the years. When it
has taken on eerie proportions, things have usually gotten sorted out before long
one way or another. Innocuous questions about name stability are nowhere near as bad
as acute flame wars over the merits of encoding 22-letter
On 6/1/2010 6:04 PM, Mark Crispin wrote:
I don't think that the unicode list should be used for the type of
questions that have polluted it recently.
That list unicode@unicode.org is open for general questions.
It has no formal standing as far as the business of the Consortium
is concerned, and
On 6/1/2010 8:04 PM, Kannan Goundan wrote:
I'm trying to come up with a compact encoding for Unicode strings for
data serialization purposes. The goals are fast read/write and small
size.
Why not use SCSU?
You get the small size and the encoder/decoder aren't that complicated.
You get the
On Tue, Jun 1, 2010 at 23:30, Asmus Freytag asm...@ix.netcom.com wrote:
Why not use SCSU?
You get the small size and the encoder/decoder aren't that
complicated.
Hmm... I had skimmed the SCSU document a few days ago. At the time it
seemed a bit more complicated than I wanted. What's nice
On 1 Jun 2010, at 22:50, Kenneth Whistler wrote:
Note that:
1038D;UGARITIC LETTER LAMDA;Lo;0;L;N;
is a distinct issue, and would no doubt still have been spelled
LAMDA, even if all the Greek characters in the standard had been spelled
LAMBDA.
*waves hand and takes blame*
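For anyone who wants to check these spellings locally, here is a minimal sketch using Python's standard unicodedata module. The helper name formal_names is made up for this example; the code points are the ones quoted in this thread, and the names returned depend on the Unicode version bundled with your Python.

```python
import unicodedata


def formal_names(code_points):
    """Map each code point to its formal Unicode character name."""
    return {cp: unicodedata.name(chr(cp)) for cp in code_points}


if __name__ == "__main__":
    # The Greek and Ugaritic letters discussed in this thread.
    for cp, name in formal_names([0x039B, 0x03BB, 0x1038D]).items():
        print(f"U+{cp:04X} {name}")
```

Since character names are stable once published, the LAMDA spelling shows up no matter which recent Python you run this on.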
On 2 Jun 2010, at 00:14, Mark Crispin wrote:
Is it really necessary to have this sort of pedagogical discussion on the
Unicode list?
Even I'm not so curmudgeonly, Mark. Live with it and use the delete key.
Cheerily,
Michael Everson * http://www.evertype.com/
I am brewing on some plans for making a font with glyphs for ancient
Chinese characters and even for some of the more dubious glyphs; I
assume that there is no standard area in the Unicode standard for
these; so where can I put them so they are least likely to clash with
others?
Put them in PUA - Private Use Area.
http://unicode.org/charts/PDF/UE000.pdf
If there are similar projects that encode ancient characters in the PUA, maybe
you can coordinate with them, much as the ConScript Unicode Registry does.
V
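As a rough illustration of the PUA advice, the sketch below checks that every code point a font project plans to use really is private use. It relies on the General_Category value Co reported by Python's unicodedata module; the function names and the sample assignments are hypothetical and not taken from any existing registry.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """True if the character's General_Category is Co (private use)."""
    return unicodedata.category(ch) == "Co"


def non_pua_assignments(code_points):
    """Return the code points from a planned mapping that are NOT private use."""
    return [cp for cp in code_points if not is_private_use(chr(cp))]


if __name__ == "__main__":
    # Hypothetical glyph assignments: two in the BMP PUA (U+E000..U+F8FF),
    # one mistakenly placed on an ordinary CJK ideograph.
    planned = [0xE000, 0xE001, 0x4E00]
    print([f"U+{cp:04X}" for cp in non_pua_assignments(planned)])
```

This only guards against landing on assigned characters. Avoiding clashes with other people's PUA usage still takes coordination of the kind suggested above.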
On Wed, Jun 2, 2010 at 2:30 PM, jander...@talentex.co.uk
Thank you for replying.
On Tuesday 1 June 2010, John H. Jenkins jenk...@apple.com wrote:
First of all, as Michael says, this
isn't character encoding.
Well, it is a collection of portable interpretable object code items encoded
within a character encoding as if the items were characters.
From: jander...@talentex.co.uk
I am brewing on some plans for making a font with glyphs for ancient
Chinese characters and even for some of the more dubious glyphs; I
assume that there is no standard area in the Unicode standard for
these; so where can I put them so they are least likely to
On 2 June 2010 10:51, William_J_G Overington wjgo_10...@btinternet.com wrote:
I know of no reason to think that a person skilled in the art would be
unable to write an iPad app to receive a program written in the portable
interpretable object code arriving within a Unicode text message and
On 2 Jun 2010, at 10:51, William_J_G Overington wrote:
Well, that might well be the case historically, yet then the emoji were
invented and they were encoded. The emoji existed at the time that they were
encoded, yet they did not exist at the time that the standards were started.
The
I cannot but agree.
Sincerely,
Erkki I. Kolehmainen
Tilkankatu 12 A 3, FI-00300 Helsinki, Finland
Puh. (09) 4368 2643, 0400 825 943; Tel. +358 9 4368 2643, +358 400 825 943
-----Original Message-----
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On
Behalf Of Michael
From: William_J_G Overington (wjgo_10...@btinternet.com)
On Tuesday 1 June 2010, John H. Jenkins jenk...@apple.com wrote:
First of all, as Michael says, this
isn't character encoding.
Well, it is a collection of portable interpretable object code items encoded
within a character
Van Anderson vanis...@boil.afraid.org wrote:
Emoticons (as emoji) are exchanged as plain text. The only
consideration that changed was whether they should be considered as
markup or not. Eventually, it became clear that they no longer do
classify as markup, but as plain text. This was not a
Van Anderson vanisaac at boil dot afraid dot org wrote:
Look up the Conscript Unicode Registry if you want to examine a
pseudo-standardized Private Use agreement. A simple mapping table will
enable you to equate your private use standard to the officially
encoded forms of these scripts, when
Kannan Goundan kannan at cakoose dot com wrote:
Hmm... I had skimmed the SCSU document a few days ago. At the time it
seemed a bit more complicated than I wanted.
SCSU decoders are not complicated, and with encoders, you get to make
the decision between simplicity and high performance.
On Tue, Jun 1, 2010 at 11:04 PM, Kannan Goundan kan...@cakoose.com wrote:
I'm trying to come up with a compact encoding for Unicode strings for
data serialization purposes. The goals are fast read/write and small
size.
The plan:
1. BMP code points are encoded as two bytes (0x-0x,
Resending (from Gmail), because the Unicode list rejected the SMTP server of
my mail provider (Spamcop is defective).
Nothing forbids you from creating new serializations of Unicode; you may
even create it so that it will be a conforming
process (meaning that it will preserve *all* valid Unicode
Robert Abel noted:
Note that as of 1993, the only LAMDA or LAMBDA characters
in the standard were:
039B;GREEK CAPITAL LETTER LAMDA;Lu;0;L;N;GREEK CAPITAL LETTER
LAMBDA;;;03BB;
03BB;GREEK SMALL LETTER LAMDA;Ll;0;L;N;GREEK SMALL LETTER
LAMBDA;;039B;;039B
019B;LATIN SMALL LETTER
Perhaps a better approach would be to establish a Frequently Asked
Questions list on the Unicode Web site. Oh, wait.
--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s
FWIW, I checked the FAQ first.
vanis...@boil.afraid.org wrote:
From: Doug Ewell (d...@ewellic.org)
I'm not sure how much longer we should continue to wait for Tengwar and
Cirth.
I hear Michael talking about meeting with the Tolkienists every once in a
while, so I can only assume that it is proceeding in some way.
I
On Jun 2, 2010, at 3:51 AM, William_J_G Overington wrote:
I know of no reason to think that a person skilled in the art would be
unable to write an iPad app to receive a program written in the portable
interpretable object code arriving within a Unicode text message and then for
the
On Jun 2, 2010, at 3:51 AM, William_J_G Overington wrote:
Unicode and ISO/IEC 10646 are attempts to solve a basic,
simply-described problem: provide for a standardized
computer representation of plain text written using existing
writing systems.
Well, that might well be the case
The trademarked name does not use ALL CAPS.
Is Unicode a registered trademark then? If yes where does it say so?
Both refer to the same organization.
Usually, you would use The Unicode Consortium.
Are you suggesting Incorporate is equal to Consortium in this case?
The usage is grammatical.
This is a bad idea.
The best way to make it go away is to just stop discussing it.
Peter
-----Original Message-----
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf
Of William_J_G Overington
Sent: Wednesday, June 02, 2010 2:51 AM
To: Unicode Discussion; John H.
On Jun 2, 2010, at 3:49 AM, Vinodh Rajan wrote:
If there are similar projects that encode ancient characters in the PUA, maybe
you can coordinate with them, much as the ConScript Unicode Registry does.
There is a proposal for Old Hanzi being worked on by the IRG. You can peruse
the IRGs
Anyway, most existing supporters of Tengwar and Cirth (also
Klingonists) still use some transliteration
Transliterability shouldn't be a factor; many of the scripts in Unicode have
been transliterated (like Latin). Perhaps if it was only transliterated and
the script was never used (but
Sarasvati?
I'd personally wish to see you act...
Regards, Erkki
Erkki I. Kolehmainen
Tilkankatu 12 A 3, FI-00300 Helsinki, Finland
Puh. (09) 4368 2643, 0400 825 943; Tel. +358 9 4368 2643, +358 400 825 943
-----Original Message-----
From: unicode-bou...@unicode.org
Dear list members,
This is your official notification that this thread is now terminated.
The discussions of 3rd party font IP and trademark status are out of scope
and unlikely to result in enlightening discussion here.
Regards,
-- Sarasvati
On 6/2/2010 10:00 AM, Erkki I. Kolehmainen
Although this mail was not addressed to me, I did read it. Sue me.
Jony
-----Original Message-----
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On
Behalf Of John Dlugosz
Sent: Wednesday, June 02, 2010 5:03 PM
Cc: unicode@unicode.org
Subject: RE: Greek letter LAMDA?
On 6/2/2010 11:46 AM, Jonathan Rosenne wrote:
Although this mail was not addressed to me, I did read it. Sue me.
The terms of use for the Unicode mail list essentially state that these
types of boilerplate are null and void as far as Unicode is concerned.
You will find the following in
Note that as of 1993, the only LAMDA or LAMBDA characters
in the standard were:
039B;GREEK CAPITAL LETTER LAMDA;Lu;0;L;N;GREEK CAPITAL LETTER
LAMBDA;;;03BB;
03BB;GREEK SMALL LETTER LAMDA;Ll;0;L;N;GREEK SMALL LETTER
LAMBDA;;039B;;039B
019B;LATIN SMALL LETTER LAMBDA WITH
From: Kenneth Whistler (k...@sybase.com)
[snip]
I expect that even this explanation will not satisfy those
who think that oddities like this should not exist in
character names. But that is just the nature of the
historical development of big standards like the Unicode
Standard when you
If anyone can null and void it, I wonder why companies bother to put such
things in people's outgoing mail. I would have thought they could come up with
a proper netiquette version, but they just don't care.
From: Asmus Freytag [mailto:asm...@ix.netcom.com]
Sent: Wednesday, June 02, 2010
I'm not sure how much longer we should continue to wait for Tengwar and
Cirth.
Three words: Squeaky wheel -- grease.
Don't expect this to just happen. The corporate members of
the Unicode Consortium are mostly concerned about economically
significant sets of characters that impact their
On 6/2/2010 3:28 PM, John Dlugosz wrote:
If anyone can “null and void” it, I wonder why companies bother to put
such things in people’s outgoing mail. I would have thought they could
come up with a proper netiquette version, but they just don’t care.
These things are bogus, because they
Thanks to everyone for the detailed responses. I definitely
appreciate the feedback on the broader issue (even though my question
was very narrow).
I should clarify my use case a little. I'm creating a generic data
serialization format similar to Google Protocol Buffers and Apache
Thrift.
SCSU is a pass-through for ASCII, plus it handles the common mix of
ASCII plus 96 local characters (Latin-1, Greek, Cyrillic, Thai, etc.)
really fast. Go look at the sample code. If you take that as a starting
point for optimization, I think you'll be fine.
An alternative that I've used is:
- Serialize every unsigned integer as a sequence of 7-bit groups, one per
byte, with the top bit off for all but the last one.
- For signed integers, shift left by 1 bit, then invert if the original
was negative, then serialize as unsigned.
- Serialize a string as an
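
The two integer rules above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the message does not say in which order the 7-bit groups are written, so least-significant-group-first is an assumption here, and all function names are invented for the example.

```python
def encode_unsigned(n: int) -> bytes:
    """7 bits per byte; top bit off on every byte except the last."""
    if n < 0:
        raise ValueError("expected a non-negative integer")
    out = bytearray()
    while n >= 0x80:
        out.append(n & 0x7F)  # top bit off: more groups follow
        n >>= 7
    out.append(n | 0x80)      # top bit on: this is the last group
    return bytes(out)


def decode_unsigned(data: bytes, pos: int = 0):
    """Return (value, next_position). Groups are assumed low-order first."""
    value = 0
    shift = 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if b & 0x80:          # top bit on marks the final byte
            return value, pos


def encode_signed(n: int) -> bytes:
    """Shift left by one bit, invert if negative, then encode as unsigned."""
    z = n << 1
    if n < 0:
        z = ~z
    return encode_unsigned(z)


def decode_signed(data: bytes, pos: int = 0):
    """Undo encode_signed: the low bit records whether to invert."""
    z, pos = decode_unsigned(data, pos)
    n = z >> 1
    if z & 1:
        n = ~n
    return n, pos
```

With this mapping, small magnitudes of either sign stay short: 0, -1, 1, -2 become 0, 1, 2, 3 before the unsigned step.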
If you want a really fast alternate encoding, you could encode all of
Unicode in at most 3 bytes. Use the high bit as a continuation bit
and the lower 7 bits as the data.
ASCII gets passed through unchanged.
For code points between U+0080 and U+3FFF, split the value into the
high 7 bits and
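
A minimal sketch of the continuation-bit idea, for illustration only. The description above is cut off, so the byte order (most significant 7-bit group first) and the function names are assumptions; what is taken from the message is the high bit as continuation flag, ASCII passing through unchanged, and the 3-byte maximum.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one code point in 1 to 3 bytes, 7 data bits per byte."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if cp < 0x80:
        return bytes([cp])                        # ASCII passes through
    if cp < 0x4000:
        return bytes([0x80 | (cp >> 7), cp & 0x7F])
    return bytes([0x80 | (cp >> 14),
                  0x80 | ((cp >> 7) & 0x7F),
                  cp & 0x7F])


def encode_string(s: str) -> bytes:
    """Concatenate the encodings of each code point in the string."""
    return b"".join(encode_code_point(ord(ch)) for ch in s)


def decode_string(data: bytes) -> str:
    """Accumulate 7 bits per byte until a byte with the high bit clear."""
    chars = []
    value = 0
    for b in data:
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:                          # high bit clear: last byte
            chars.append(chr(value))
            value = 0
    return "".join(chars)
```

Three bytes carry 21 data bits, which is enough for everything up to U+10FFFF. Note that this decoder does no validation: truncated input is silently dropped, and overlong forms are accepted.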
Michael D'Errico mike dash list at pobox dot com wrote:
If you want a really fast alternate encoding, you could encode all of
Unicode in at most 3 bytes. Use the high bit as a continuation bit
and the lower 7 bits as the data.
ASCII gets passed through unchanged.
This is essentially what
On Wed, Jun 2, 2010 at 21:43, Doug Ewell d...@ewellic.org wrote:
If you want a really fast alternate encoding, you could encode all of
Unicode in at most 3 bytes. Use the high bit as a continuation bit and
the lower 7 bits as the data.
ASCII gets passed through unchanged.
This is
Closing the thread with answers to the questions would probably have
been a better approach than simply terminating it.
FYI, Sarasvati is the Vedic goddess of knowledge and ingenuity, and the
goddess made such knowledge and ingenuity available to others but never
implemented it on Her own.
pracoditaa yena