On Fri, 27 Apr 2012 13:50:15 -0700
Ken Whistler k...@sybase.com wrote:
On 4/27/2012 10:45 AM, Richard Wordingham wrote:
If they are to be adopted by the CLDR, the digits need to be coded
consecutively.
I doubt this matters in any case, because this proposed use is for
a vigesimal system,
Is it stated anywhere as policy that numbers written as a string of
decimal digits will be encoded with the most significant digit first in
storage order? I couldn't find it stated anywhere.
As positional notation only seems to have been invented and propagated
once or twice (Babylonian and
How data is transformed to this string is
undefined, which is a problem.
As mentioned in the mail, just like UTF-8 is pre-installed in most
systems, this design would also be pre-installed in the systems intending
to use it. The example given above does not exist anywhere. One needs to
come
Please note some corrections and additions in the comparison of the values
My design provides the following number values for the specified number of
bits:
8 bits - 128 values (Cumulative: 128 values)
10 bits - 192 values (Cumulative: 320 values)
12 bits - 512 values (Cumulative: 832 values)
14
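For comparison (my own tally, not part of the thread), UTF-8's value counts per sequence length can be computed from its payload widths; a minimal Python sketch:

```python
# Values newly reachable at each UTF-8 sequence length, and the cumulative
# total, computed from UTF-8's payload widths (7, 11, 16, 21 bits).
# This is an independent tally for comparison with the figures above.
payload_bits = {1: 7, 2: 11, 3: 16, 4: 21}

cumulative = 0
for length, bits in sorted(payload_bits.items()):
    new = 2**bits - cumulative  # code points first reachable at this length
    cumulative = 2**bits
    print(f"{length} byte(s): {new} new values (cumulative {cumulative})")
# 1 byte(s): 128 new values (cumulative 128)
# 2 byte(s): 1920 new values (cumulative 2048)
# 3 byte(s): 63488 new values (cumulative 65536)
# 4 byte(s): 2031616 new values (cumulative 2097152)
```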
Is there any recommendation on how to write Babylonian numbers in
Unicode? I use the usual scheme of using the DISH series
for the units and the U series for the tens.
One problem with the Cuneiform Numbers and Punctuation block is that
there is no cross reference for the low numbers. However,
EBNF for any given code of a character encoding that I am designing:
(10|11)(({(00|01)}(|0001|0010|0011|0100|0101|0110|0111|1010|1011|1110|){(10|11)}(1000|1001|1000|1101))|({(10|11)}(|0001|0100|0101|1000|1001|1010|1011|1100|1101|1110|){(00|01)}(0010|0011|0110|0111)))
possible
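The EBNF above can be sanity-checked mechanically. A sketch (my own transcription into a Python regular expression, keeping the alternatives verbatim, including the repeated 1000 in the final group; `{X}` becomes `X*` and the alternation lists with empty branches become optional groups):

```python
import re

# Direct transcription of the EBNF above: {X} -> X*, alternation lists
# containing an empty branch -> optional (?) groups. The duplicated
# "1000" in the final group is kept exactly as written.
CODE = re.compile(
    "(10|11)"
    "("
    "(00|01)*"
    "(0001|0010|0011|0100|0101|0110|0111|1010|1011|1110)?"
    "(10|11)*"
    "(1000|1001|1000|1101)"
    "|"
    "(10|11)*"
    "(0001|0100|0101|1000|1001|1010|1011|1100|1101|1110)?"
    "(00|01)*"
    "(0010|0011|0110|0111)"
    ")"
)

def is_valid(bits: str) -> bool:
    return CODE.fullmatch(bits) is not None

print(is_valid("101000"))  # True: lead pair "10" plus final group "1000"
print(is_valid("10"))      # False: no final 4-bit group
```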
We don't have that as a policy (
http://www.unicode.org/policies/property_value_stability_table.html). It's
worth proposing via the feedback form, because that is the expectation.
--
Mark https://plus.google.com/114199149796022210033
On Fri, 27 Apr 2012 11:21:05 -0700
Doug Ewell d...@ewellic.org wrote:
SCSU works equally well, or almost so, with any text sample where the
non-ASCII characters fit into a single block of 128 code points. For
anything other than Latin-1 you need one byte of overhead, to switch
to another
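To make the window mechanics concrete, here is a sketch of a tiny SCSU decoder handling only single-byte mode with the default dynamic windows and the SC0-SC7 window-change tags (a small subset of UTS #6, just enough to show the one-byte switching overhead):

```python
# Minimal SCSU decoder sketch: single-byte mode only, default dynamic
# windows, SC0-SC7 tags. Real SCSU also has SQn, SDn, SCU and Unicode
# mode; this subset only illustrates the one-byte window-switch cost.
DEFAULT_WINDOWS = [0x0080, 0x00C0, 0x0400, 0x0600,
                   0x0900, 0x3040, 0x30A0, 0xFF00]

def decode_scsu_subset(data: bytes) -> str:
    window = 0          # dynamic window 0 (Latin-1 Supplement) is active
    out = []
    for b in data:
        if 0x10 <= b <= 0x17:            # SCn: select dynamic window n
            window = b - 0x10
        elif b < 0x80:                   # ASCII passes through
            out.append(chr(b))
        else:                            # high byte: offset into window
            out.append(chr(DEFAULT_WINDOWS[window] + b - 0x80))
    return "".join(out)

# One tag byte (SC2) switches to the Cyrillic window; each letter then
# costs a single byte.
print(decode_scsu_subset(bytes([0x12, 0x9C, 0xBE, 0xC1, 0xBA, 0xB2, 0xB0])))
# -> Москва
```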
anbu at peoplestring dot com wrote:
A document encoded in SCSU or BOCU-1, given that the document contains
only ASCII characters, may appear corrupt on a system that doesn't
recognise SCSU or BOCU-1.
This is the curious point of view that ASCII compatibility (or
transparency) is a bad thing.
On Sat, 28 Apr 2012 18:55:00 +0100
Richard Wordingham richard.wording...@ntlworld.com wrote:
I wrote:
With SCSU that avoids Unicode mode and UQU whenever possible, most
alphabetic languages work fairly well.
I meant:
With SCSU that avoids Unicode mode and SQU whenever possible, most
Mark Davis wrote:
I suspect the punycode goal is to take a wide character set into a
restricted character set, without caring much about the resulting string
length; if the original string happens to be in a character set other
than the target restricted character set, then the string length
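Python's built-in punycode codec (RFC 3492, without the IDNA "xn--" prefix) illustrates the point: the ASCII characters are copied through literally and only the non-ASCII ones are re-expressed after the delimiter, so the length grows modestly:

```python
# Punycode keeps the ASCII subsequence literally and appends the encoded
# positions/values of the non-ASCII characters after a '-' delimiter.
# Python's "punycode" codec implements RFC 3492 without the "xn--" prefix.
encoded = "bücher".encode("punycode")
print(encoded)                      # b'bcher-kva': ASCII part, '-', encoded 'ü'
print(encoded.decode("punycode"))   # round-trips to 'bücher'
```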
anbu at peoplestring dot com wrote:
This clearly shows that my design yields more than double the number of
values of UTF-8
I didn't know we were competing against UTF-8 on efficiency. That's
easy. UTF-8 is not at all guaranteed to be the most efficient encoding
possible, or even reasonably
I wrote:
0xxxxxxx - encodes U+0000 through U+007F
1xxxxxxx 0xxxxxxx - encodes U+0080 through U+3FFF
1xxxxxxx 1xxxxxxx - encodes U+4000 through U+10FFFF
(and onward to 0x1FFFFF)
Last code sequence should be 1xxxxxxx 1xxxxxxx 0xxxxxxx.
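Read as 7-bit payload units with a lead/final flag bit (my reading of the sketch above; the exact unit boundaries are an assumption), the scheme is straightforward to implement:

```python
# Sketch of the encoding outlined above, as I read it: each byte carries
# 7 payload bits; the high bit is 1 on leading bytes and 0 on the final
# byte, most significant payload bits first.
def encode_cp(cp: int) -> bytes:
    if cp < 0x80:                        # 1 byte, 7 bits
        return bytes([cp])
    if cp < 0x4000:                      # 2 bytes, 14 bits
        return bytes([0x80 | (cp >> 7), cp & 0x7F])
    if cp < 0x200000:                    # 3 bytes, 21 bits (to 0x1FFFFF)
        return bytes([0x80 | (cp >> 14),
                      0x80 | ((cp >> 7) & 0x7F),
                      cp & 0x7F])
    raise ValueError("code point out of range")

print(encode_cp(0x41).hex())      # 41 - ASCII unchanged
print(encode_cp(0x20AC).hex())    # c12c - 2 bytes where UTF-8 needs 3
print(encode_cp(0x10FFFF).hex())  # c3ff7f
```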
--
Doug Ewell | Thornton, Colorado, USA
There are many reasons why a new encoding that is merely more efficient
than UTF-8, especially one that sacrifices byte-based processing or
other design features, will face a severe uphill battle in trying to
displace UTF-8.
What are some of the reasons a new encoding will face?
On Sat,
The question should read as:
What are some of the reasons a new encoding will face challenges?
Original Message
Subject: Re: Fwd: Re: Unicode, SMS and year 2012
Date: Sat, 28 Apr 2012 15:32:47 -0400
From: a...@peoplestring.com
To: d...@ewellic.org
There are many reasons why a
On Sat, 28 Apr 2012 12:53:17 -0600, Doug Ewell wrote:
Not to say this isn’t so, but can you point to a tool or site where a
user can type a string and see the output with different
parameterizations? Pretty much all of the “Convert to Punycode” pages
I see are only able to convert
anbu at peoplestring dot com wrote:
What are some of the reasons a new encoding will face challenges?
The main challenge to a new encoding is that UTF-8 is already present in
numerous applications and operating systems, and that any encoding
intended to serve as an alternative, let alone a
On Sat, 28 Apr 2012 12:41:51 -0600, Doug Ewell wrote:
If I'm going to use a variable-length, non-byte-aligned encoding,
where there is no chance of realigning in case of a flipped or
dropped bit (which seems to be of great concern to many people), I
might as well go ahead and use a
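UTF-8's trailing bytes all have the form 10xxxxxx, so a decoder can resynchronise after a dropped byte by skipping to the next non-continuation byte; a sketch:

```python
# After losing sync (e.g. a dropped byte), a UTF-8 decoder can skip
# continuation bytes (10xxxxxx) and resume at the next lead byte. A
# bit-aligned encoding with no reserved lead pattern cannot do this.
def resync(data: bytes, pos: int = 0) -> int:
    """Return the index of the next byte that can start a character."""
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos

# U+20AC (the euro sign) is e2 82 ac in UTF-8; drop the lead byte e2:
damaged = bytes([0x82, 0xAC]) + "A".encode("utf-8")
i = resync(damaged)
print(damaged[i:].decode("utf-8"))  # -> A  (only one character lost)
```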
Hi Cristian,
This is a bit of a deviation from the issues you raise, but it relates to
the subject in a different way.
The SMS character set does not seem to follow Unicode. The way I see
Unicode is as a set of character groups: 7-bit, 8-bit (extending and
replacing 7-bit), 16-bit, and CJKV, which use some
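For context on the SMS side (these are the standard GSM 03.38 figures, not from the thread): a single SMS payload is 140 octets, which holds 160 characters in the packed 7-bit default alphabet but only 70 characters once the message falls back to UCS-2:

```python
# A single SMS carries 140 octets of user data. The GSM 03.38 default
# alphabet packs 7-bit characters, so 140 * 8 // 7 = 160 characters fit;
# UCS-2 (used for text outside the GSM alphabet) needs 2 octets each.
PAYLOAD_OCTETS = 140

gsm7_chars = PAYLOAD_OCTETS * 8 // 7
ucs2_chars = PAYLOAD_OCTETS // 2
print(gsm7_chars, ucs2_chars)  # 160 70
```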
On Friday, April 27, anbu at peoplestring dot com wrote:
In addition I had a few more questions, of which the one below is the
most significant:
What if one had to send a text in multiple scripts, like in the case
of a text and its translation in the same message?
I thought maybe a new
Richard Wordingham wrote:
With SCSU that avoids Unicode mode and UQU whenever possible, most
alphabetic languages work fairly well. However, extra windows are
needed to cover the half-blocks from A480 to ABFF, 15 new codes. If I
were being miserly, I wouldn't cover A500-A5FF.
In November
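The figure of 15 extra windows checks out: the span U+A480..U+ABFF (exclusive upper bound U+AC00) covers fifteen 128-code-point half-blocks, one SCSU window each:

```python
# SCSU dynamic windows span 128 code points, so covering U+A480..U+ABFF
# (exclusive upper bound U+AC00) needs (0xAC00 - 0xA480) / 0x80 windows.
half_blocks = (0xAC00 - 0xA480) // 0x80
print(half_blocks)  # 15
```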