Re: Kaktovik Inupiaq numerals

2012-04-28 Thread Richard Wordingham
On Fri, 27 Apr 2012 13:50:15 -0700 Ken Whistler k...@sybase.com wrote: On 4/27/2012 10:45 AM, Richard Wordingham wrote: If they are to be adopted by the CLDR, the digits need to be coded consecutively. I doubt this matters in any case, because this proposed use is for a vigesimal system,
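One practical reason consecutive coding matters: when a script's digits occupy consecutive code points, software can get a digit's value by subtracting the code point of that script's zero. The sketch below is only an illustration. The function name `digit_value` and its `base` parameter are hypothetical and not part of CLDR, and the Devanagari digits (U+0966 to U+096F) serve as a stand-in digit set.

```python
def digit_value(ch: str, zero: str, base: int = 10) -> int:
    """Value of digit `ch` in a digit set coded consecutively from `zero`.

    Hypothetical helper: it assumes the digits occupy `base` consecutive
    code points starting at `zero`.
    """
    value = ord(ch) - ord(zero)
    if not 0 <= value < base:
        raise ValueError(f"{ch!r} is not a digit of this set")
    return value


DEVANAGARI_ZERO = "\u0966"
# U+096B is DEVANAGARI DIGIT FIVE, five code points after the zero.
print(digit_value("\u096B", DEVANAGARI_ZERO))
```

A vigesimal digit set could use the same helper with `base=20`, provided its twenty digits were coded consecutively.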

Encoding of Numbers Composed of Decimal Digits (General Category of Nd)

2012-04-28 Thread Richard Wordingham
Is it anywhere stated as policy that numbers written by a string of decimal digits will be encoded with the most significant digit first in storage order? I couldn't find it stated anywhere. As positional notation only seems to have been invented and propagated once or twice (Babylonian and
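The expectation in question, most significant digit first in storage order, is what a simple left-to-right parser relies on. As a minimal sketch (the function name `parse_decimal` is hypothetical; `unicodedata.decimal` is the Python standard-library lookup for the decimal value of an Nd character):

```python
import unicodedata


def parse_decimal(text: str) -> int:
    """Interpret a run of Nd characters, most significant digit first."""
    value = 0
    for ch in text:  # iterate in storage (logical) order
        # unicodedata.decimal raises ValueError for non-decimal characters.
        value = value * 10 + unicodedata.decimal(ch)
    return value


# ARABIC-INDIC DIGITS ONE, TWO, THREE in storage order.
print(parse_decimal("\u0661\u0662\u0663"))
```

If a script stored its least significant digit first, this loop would produce the reversed number, which is why the ordering matters to generic number parsing.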

Re: Unicode, SMS and year 2012

2012-04-28 Thread anbu
How data is transformed to this string is undefined, which is a problem. As mentioned in the mail, just as UTF-8 is pre-installed in most systems, this design would also be pre-installed in the systems intending to use it. The example given above does not exist anywhere. One needs to come

Fwd: Re: Unicode, SMS and year 2012

2012-04-28 Thread anbu
Please note some corrections and additions in the comparison of the values. My design provides the following number of values for the specified number of bits: 8 bits - 128 values (cumulative: 128 values); 10 bits - 192 values (cumulative: 320 values); 12 bits - 512 values (cumulative: 832 values); 14

Writing Babylonian Numbers in Unicode

2012-04-28 Thread Richard Wordingham
Is there any recommendation on how to write Babylonian numbers in Unicode? I use the usual scheme of using the DISH series for the units and the U series for the tens. One problem with the Cuneiform Numbers and Punctuation block is that there is no cross reference for the low numbers. However,

ece

2012-04-28 Thread anbu
EBNF for any given code of a character encoding that I am designing: (10|11)(({(00|01)}(|0001|0010|0011|0100|0101|0110|0111|1010|1011|1110|){(10|11)}(1000|1001|1000|1101))|({(10|11)}(|0001|0100|0101|1000|1001|1010|1011|1100|1101|1110|){(00|01)}(0010|0011|0110|0111))) possible

Re: Encoding of Numbers Composed of Decimal Digits (General Category of Nd)

2012-04-28 Thread Mark Davis ☕
We don't have that as a policy ( http://www.unicode.org/policies/property_value_stability_table.html). It's worth proposing via the feedback form, because that is the expectation. -- Mark https://plus.google.com/114199149796022210033 * * *— Il meglio è l’inimico del

Re: Unicode, SMS and year 2012

2012-04-28 Thread Richard Wordingham
On Fri, 27 Apr 2012 11:21:05 -0700 Doug Ewell d...@ewellic.org wrote: SCSU works equally well, or almost so, with any text sample where the non-ASCII characters fit into a single block of 128 code points. For anything other than Latin-1 you need one byte of overhead, to switch to another
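The "single block of 128 code points" condition can be checked roughly as follows. This is a sketch under a simplifying assumption, namely that a window is aligned on a 128-code-point boundary. It is not an SCSU encoder, and the function name `fits_single_window` is hypothetical.

```python
def fits_single_window(text: str) -> bool:
    """True if all non-ASCII characters share one aligned 128-code-point block.

    Simplification: real SCSU window positions are more varied than plain
    128-aligned blocks; this only illustrates the idea.
    """
    blocks = {ord(ch) // 128 for ch in text if ord(ch) >= 0x80}
    return len(blocks) <= 1


# Cyrillic letters from U+0400..U+047F plus ASCII: one window suffices.
print(fits_single_window("\u041f\u0440\u0438\u0432\u0435\u0442, world"))
# Cyrillic mixed with CJK needs more than one window.
print(fits_single_window("\u041f\u0440\u0438\u0432\u0435\u0442 \u4f60\u597d"))
```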

Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
anbu at peoplestring dot com wrote: Document encoded in SCSU or BOCU-1, given that the document contains only ASCII characters, may appear corrupt on a system that doesn't recognise SCSU or BOCU-1. This is the curious point of view that ASCII compatibility (or transparency) is a bad thing.

Re: Unicode, SMS and year 2012 - SQU, not UQU

2012-04-28 Thread Richard Wordingham
On Sat, 28 Apr 2012 18:55:00 +0100 Richard Wordingham richard.wording...@ntlworld.com wrote: I wrote: With SCSU that avoids Unicode mode and UQU whenever possible, most alphabetic languages work fairly well. I meant: With SCSU that avoids Unicode mode and SQU whenever possible, most

Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
Mark Davis wrote: I suspect the punycode goal is to take a wide character set into a restricted character set, without caring much on resulting string length; if the original string happens to be in other character set than the target restricted character set, then the string length

Re: Fwd: Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
anbu at peoplestring dot com wrote: This clearly shows that my design yields number of values more than double that of UTF8. I didn't know we were competing against UTF-8 on efficiency. That's easy. UTF-8 is not at all guaranteed to be the most efficient encoding possible, or even reasonably

Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
I wrote: 0xxxxxxx - encodes U+0000 through U+007F; 1xxxxxxx 0xxxxxxx - encodes U+0080 through U+3FFF; 1xxxxxxx 1xxxxxxx - encodes U+4000 through U+10FFFF (and onward to 0x1FFFFF). Last code sequence should be 1xxxxxxx 1xxxxxxx 0xxxxxxx. -- Doug Ewell | Thornton, Colorado, USA
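The three-tier scheme in this message (one byte up to U+007F, two bytes up to U+3FFF, three bytes beyond that) can be sketched as a continuation-bit encoding. The code below rests on assumptions the message does not spell out: 7 payload bits per byte, the high bit set on every byte except the last, and the most significant group first. The names `encode` and `decode` are hypothetical.

```python
def encode(cp: int) -> bytes:
    """Encode a code point with 7 payload bits per byte.

    Assumption: the high bit marks "more bytes follow" and groups are
    emitted most significant first.
    """
    if not 0 <= cp <= 0x1FFFFF:
        raise ValueError("value does not fit in three bytes")
    groups = [cp & 0x7F]
    cp >>= 7
    while cp:
        groups.append(cp & 0x7F)
        cp >>= 7
    groups.reverse()
    # Set the continuation bit on every byte except the final one.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode(data: bytes) -> int:
    """Decode one complete sequence produced by encode()."""
    cp = 0
    for b in data:
        cp = (cp << 7) | (b & 0x7F)
    return cp


for cp in (0x41, 0x7F, 0x80, 0x3FFF, 0x4000, 0x10FFFF):
    data = encode(cp)
    print(f"U+{cp:04X} -> {data.hex()} ({len(data)} byte(s))")
```

Because any byte below 0x80 ends a sequence, a decoder can find character boundaries by scanning for bytes with the high bit clear.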

Re: Fwd: Re: Unicode, SMS and year 2012

2012-04-28 Thread anbu
There are many reasons why a new encoding that is merely more efficient than UTF-8, especially one that sacrifices byte-based processing or other design features, will face a severe uphill battle in trying to displace UTF-8. What are some of the reasons a new encoding will face? On Sat,

Fwd: Re: Fwd: Re: Unicode, SMS and year 2012

2012-04-28 Thread anbu
The question shall read as: What are some of the reasons a new encoding will face challenges? Original Message Subject: Re: Fwd: Re: Unicode, SMS and year 2012 Date: Sat, 28 Apr 2012 15:32:47 -0400 From: a...@peoplestring.com To: d...@ewellic.org There are many reasons why a

Re: Unicode, SMS and year 2012

2012-04-28 Thread Cristian Secară
On Sat, 28 Apr 2012 12:53:17 -0600, Doug Ewell wrote: Not to say this isn’t so, but can you point to a tool or site where a user can type a string and see the output with different parameterizations? Pretty much all of the “Convert to Punycode” pages I see are only able to convert

Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
anbu at peoplestring dot com wrote: What are some of the reasons a new encoding will face challenges? The main challenge to a new encoding is that UTF-8 is already present in numerous applications and operating systems, and that any encoding intended to serve as an alternative, let alone a

Re: Unicode, SMS and year 2012

2012-04-28 Thread Cristian Secară
On Sat, 28 Apr 2012 12:41:51 -0600, Doug Ewell wrote: If I'm going to use a variable-length, non-byte-aligned encoding, where there is no chance of realigning in case of a flipped or dropped bit (which seems to be of great concern to many people), I might as well go ahead and use a

Re: Unicode, SMS and year 2012

2012-04-28 Thread Naena Guru
Hi Cristian, This is a bit of a deviation from the issues you raise, but it relates to the subject in a different way. The SMS char set does not seem to follow Unicode. How I see Unicode is as a set of character groups, 7-bit, 8-bit (extends and replaces 7-bit), 16-bit, and CJKV that use some

Re: Fwd: Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
On Friday, April 27, anbu at peoplestring dot com wrote: In addition I had a few more questions, of which the one below is the most significant: What if one had to send a text in multiple scripts, like in the case of a text and its translation in the same message? I thought maybe a new

Re: Unicode, SMS and year 2012

2012-04-28 Thread Doug Ewell
Richard Wordingham wrote: With SCSU that avoids Unicode mode and UQU whenever possible, most alphabetic languages work fairly well. However, extra windows are needed to cover the half-blocks from A480 to ABFF, 15 new codes. If I were being miserly, I wouldn't cover A500-A5FF. In November