On 8/24/2011 7:45 PM, Richard Wordingham wrote:
Which earlier coding system supported Welsh? (I'm thinking of 'W WITH
CIRCUMFLEX', U+0174 and U+0175.) How was the use of the canonical
decompositions incompatible with the character encodings of legacy
systems? Latin-1 has the same codes as
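As an aside on the decompositions mentioned above, here is a minimal sketch of how one might inspect the canonical decomposition of U+0174 and U+0175 with Python's standard unicodedata module. The helper name decompose is made up for illustration; it is not from the thread.

```python
import unicodedata

def decompose(ch: str) -> list[str]:
    """Return the code points of the canonical decomposition (NFD) of ch.

    'decompose' is a hypothetical helper name used only for this sketch.
    """
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]

# U+0174 and U+0175 are the Welsh W/w with circumflex discussed above.
for ch in ("\u0174", "\u0175"):
    print(unicodedata.name(ch), "->", decompose(ch))
```

Each letter decomposes to its base letter followed by U+0302 COMBINING CIRCUMFLEX ACCENT.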
On behalf of Asmus Freytag
Sent: 25 August 2011 9:00
To: Richard Wordingham
Cc: Ken Whistler; unicode@unicode.org
Subject: Re: Code pages and Unicode
On 8/24/2011 7:45 PM, Richard Wordingham wrote:
Which earlier coding system supported Welsh? (I'm thinking of 'W WITH
CIRCUMFLEX
On Tuesday 23 August 2011, Doug Ewell d...@ewellic.org wrote:
Asmus Freytag asmusf at netcom dot com wrote:
Until then, I find further speculation rather pointless and would love if
it moved off this list (until such time).
+1
-0.7
It is harmless fun, indeed it is fun that assists
On 23 August 2011 21:44 Richard Wordingham
richard.wording...@ntlworld.com
wrote:
On Tue, 23 Aug 2011 07:18:21 +0200
Jean-François Colson j...@colson.eu wrote:
And what do you think about (H1,H2,VS1,L3,L4)?
The L4 is unnecessary. The trick
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
(1) a plain-text file
(2) using only plain-text conventions (i.e. not adding rich text)
(3) which contains the same PUA code point with two meanings
(4) using different fonts or other mechanisms
(5) in a platform-independent,
Luke-Jr luke at dashjr dot org wrote:
Too bad the Conscript registry is censoring assignments the maintainer
doesn't like for unspecified personal reasons, increasing the chances
of an overlap.
This isn't censorship, which would imply some sort of political,
ethical, or moral agenda. This is
On August 23, 2011, at 2:00 PM, Asmus Freytag wrote:
Until then, I find further speculation rather pointless and would love if it
moved off this list (until such time).
That would be wonderful, because we could then turn our attention to more
urgent subjects, such as what to do when the sun reaches
William_J_G Overington wjgo underscore 10009 at btinternet dot com
wrote:
Until then, I find further speculation rather pointless and would
love if it moved off this list (until such time).
It is harmless fun, indeed it is fun that assists learning and
understanding, and so as long as it
On Wed, 24 Aug 2011 08:02:42 -0700
Doug Ewell d...@ewellic.org wrote:
But some people seem to be dead serious about the need to go beyond
1.1 million code points, and are making dead-serious arguments that
we need to plan for it.
Those are two different claims. 'Never say never' is a useful
On 8/24/2011 10:48 AM, Richard Wordingham wrote:
Those are two different claims. 'Never say never' is a useful maxim.
So is Leave well enough alone.
The problem would be in using maxims instead
of an analysis of engineering requirements to drive architectural decisions.
The extension of
On Wed, 24 Aug 2011 12:40:54 -0700
Ken Whistler k...@sybase.com wrote:
On 8/24/2011 10:48 AM, Richard Wordingham wrote:
if, say,
code points are squandered.
Oh.
Well, in that case, the correct action is to work to ensure that code
points are not squandered.
Have there not already
It has ceased to be. It's expired and gone to meet its maker. It's a stiff.
Bereft of life, it rests in peace.…Its metabolic processes are now history.
It's off the twig. It's kicked the bucket, it's shuffled off its mortal coil,
run down the curtain and joined the bleedin' choir invisible.
On 8/24/2011 3:51 PM, Richard Wordingham wrote:
Well, in that case, the correct action is to work to ensure that code
points are not squandered.
Have there not already been several failures on that front? The BMP is
littered with concessions to the limitations of rendering systems -
2011/8/24 Doug Ewell d...@ewellic.org:
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
(1) a plain-text file
(2) using only plain-text conventions (i.e. not adding rich text)
(3) which contains the same PUA code point with two meanings
(4) using different fonts or other mechanisms
2011/8/25 Richard Wordingham richard.wording...@ntlworld.com:
It will only happen when the need becomes obvious, which may be never,
or may be 30 years hence. It's even conceivable that UTF-16 will
drop out of use.
Conceivable but extremely unlikely because it will remain used in
extremely
-Original Message-
From: Philippe Verdy verd...@wanadoo.fr
Sender: unicode-bou...@unicode.org
Date: Thu, 25 Aug 2011 02:10:27
To: Doug Ewell d...@ewellic.org
Reply-To: verd...@wanadoo.fr
Cc: unicode@unicode.org
Subject: Re: Multiple private agreements (was: RE: Code pages and Unicode)
Philippe wrote:
But my initial suggestion implied that condition 3 was not part of it.
It was not I but srivas who modified the problem. The problem
was changed later by adding new conditions that I have never intended
2011/8/24 Doug Ewell d...@ewellic.org:
As Richard said, and you probably already know, there is no chance that
UTC will ever do anything with the PUA, especially anything that gives
the appearance of endorsing its use. I'm just thankful they haven't
deprecated it.
The appearance of endorsing
so you will end up with the CSUR AND the registry Philippe is
suggesting AND all the existing uses of PUA that will not end up in
CSUR or the other registry.
sounds like it will be a mess.
it's bad enough dealing with Unicode and pseudo-Unicode in the Myanmar
script, adding PUA potentially into
2011/8/25 Andrew Cunningham lang.supp...@gmail.com:
so you will end up with the CSUR AND the registry Philippe is
suggesting AND all the existing uses of PUA that will not end up in
CSUR or the other registry.
sounds like it will be a mess.
it's bad enough dealing with Unicode and
On Mon, 22 Aug 2011 16:18:56 -0700
Ken Whistler k...@sybase.com wrote:
How about Clause 12.5 of ISO/IEC 10646:
001B, 0025, 0040
You escape out of UTF-16 to ISO 2022, and then you can do whatever
the heck you want, including exchange and processing of complete
4-byte forms, with all the
On 8/23/2011 12:00 PM, Richard Wordingham wrote:
On Mon, 22 Aug 2011 16:18:56 -0700
Ken Whistler k...@sybase.com wrote:
How about Clause 12.5 of ISO/IEC 10646:
001B, 0025, 0040
You escape out of UTF-16 to ISO 2022, and then you can do whatever
the heck you want, including exchange and
Asmus Freytag asmusf at netcom dot com wrote:
Until then, I find further speculation rather pointless and would
love if it moved off this list (until such time).
+1
--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
If same codes within PUA becomes standard for different purposes,
They aren't standard. Two different private agreements could assign
different characters to the same PUA code points.
how to get both working using same font?
You
2011/8/23 Doug Ewell d...@ewellic.org:
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
If same codes within PUA becomes standard for different purposes,
They aren't standard. Two different private agreements could assign
different characters to the same PUA code points.
how
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
There's no standard way to specify even one font or private agreement in
plain text, let alone how to switch between them within the same
document. This is not an intended use of the PUA.
There exists such standard in the context
2011/8/24 Doug Ewell d...@ewellic.org:
Coordinating private agreements so they don't conflict is clearly the
ideal situation. But many different people and organizations have
already claimed the same chunk of PUA space, as Richard exemplified
yesterday with his Taiwan/Hong Kong example.
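To make the conflict concrete, here is a rough sketch of how two private agreements that claim the same PUA code points could be compared. The two agreement tables and their glyph labels are invented for illustration and do not correspond to any real registry; only the three private-use ranges are taken from the Unicode Standard.

```python
# The three private-use ranges defined by the Unicode Standard.
PUA_RANGES = [
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Plane 15
    (0x100000, 0x10FFFD),  # Plane 16
]

def is_private_use(cp: int) -> bool:
    """True if cp lies in one of the private-use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

def conflicts(a: dict[int, str], b: dict[int, str]) -> dict[int, tuple[str, str]]:
    """Code points that both agreements assign, but with different meanings."""
    return {cp: (a[cp], b[cp]) for cp in a.keys() & b.keys() if a[cp] != b[cp]}

# Hypothetical agreements, invented for this sketch.
agreement_a = {0xE000: "glyph A1", 0xE001: "glyph A2"}
agreement_b = {0xE001: "glyph B1", 0xE002: "glyph B2"}

for cp, (left, right) in sorted(conflicts(agreement_a, agreement_b).items()):
    print(f"U+{cp:04X}: {left!r} vs {right!r}")
```

Nothing in plain text records which agreement a given code point was written under, so a check like this only works when both tables are known out of band.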
On Tuesday, August 23, 2011 10:29:58 PM Philippe Verdy wrote:
2011/8/24 Doug Ewell d...@ewellic.org:
(3) which contains the same PUA code point with two meanings
The only numbered item to sacrifice is number (3) here. That's the case
where separate PUA agreements are still coordinated so that
2011/8/24 Luke-Jr l...@dashjr.org:
On Tuesday, August 23, 2011 10:29:58 PM Philippe Verdy wrote:
2011/8/24 Doug Ewell d...@ewellic.org:
(3) which contains the same PUA code point with two meanings
The only numbered item to sacrifice is number (3) here. That's the case
where separate PUA
On 21 August 2011 02:14, Richard Wordingham
richard.wording...@ntlworld.com wrote:
On Fri, 19 Aug 2011 17:03:41 -0700
Ken Whistler k...@sybase.com wrote:
O.k., so apparently we have awhile to go before we have to start
worrying about the Y2K or IPv4 problem for Unicode. Call me again in
the
On 08/22/2011 03:05 PM, Andrew West wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Why would anyone *need* to do so? UTF-16 can represent all code points
up to Plane 16, right?
--
Shriramana Sharma
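For reference, the reach of UTF-16 mentioned above follows from the surrogate-pair arithmetic. The sketch below uses an illustrative helper name, to_surrogates, and shows that the last code point of Plane 16 maps to the highest possible surrogate pair.

```python
def to_surrogates(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into a (high, low) surrogate pair.

    'to_surrogates' is an illustrative name, not a standard library function.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000                      # 20-bit value
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

# First and last supplementary code points.
print([hex(u) for u in to_surrogates(0x10000)])
print([hex(u) for u in to_surrogates(0x10FFFF)])

# 1024 high x 1024 low surrogates, plus the BMP, gives exactly 17 planes.
print(1024 * 1024 + 0x10000 == 17 * 0x10000)
```

Because U+10FFFF already uses the pair (DBFF, DFFF), there are no spare pairs left, which is why any extension would need a new mechanism.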
On 22 August 2011 12:51, Shriramana Sharma samj...@gmail.com wrote:
On 08/22/2011 03:05 PM, Andrew West wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Why would anyone *need* to do so? UTF-16 can represent all codepoints
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
The true lifting of UTF-16 would be to UTF-32.
Leave the UTF-16 untouched and make the new half as versatile as possible.
I think any other solution is just a patch-up for the time being.
There is no evidence whatsoever that this
On August 20, 2011, at 2:31 AM, Christoph Päper wrote:
Mark Davis ☕:
Under the original design principles of Unicode, the goal was a bit more
limited; we envisioned […] a generative mechanism for infrequent CJK
ideographs,
I'd still like having that as an option.
Et voilà! We have Ideographic
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote:
Can anyone think of a way to extend UTF-16 without adding new surrogates or
inventing a new general category?
Andrew
How about a triple sequence of two high surrogates followed by one low
surrogate?
I suggest this as a
On 22/08/11 16:55, Doug Ewell wrote:
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
The true lifting of UTF-16 would be to UTF-32.
Leave the UTF-16 untouched and make the new half as versatile as possible.
I think any other solution is just a patch-up for the time being.
There
On 20/08/11 02:03, Ken Whistler wrote:
O.k., so apparently we have awhile to go before we have to start worrying
about the Y2K or IPv4 problem for Unicode. Call me again in the
year 2851, and we'll still have 5 years left to design a new scheme
and plan for the transition. ;-)
--Ken
I
On 8/22/2011 9:58 AM, Jean-François Colson wrote:
I wonder whether you aren’t a little too optimistic.
No. If anything I'm assuming that the folks working on proposals will
be amazingly assiduous during the next decade.
Have you considered the unencoded ideographic scripts?
Why, yes I
On Mon, 22 Aug 2011 14:06:00 +0100 (BST)
William_J_G Overington wjgo_10...@btinternet.com wrote:
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Andrew
On 8/22/2011 3:15 PM, Richard Wordingham wrote:
On Monday 22 August 2011, Andrew West andrewcw...@gmail.com wrote:
Can anyone think of a way to extend UTF-16 without adding new
surrogates or inventing a new general category?
Andrew
How about a triple sequence of two
On 23/08/11 00:15, Richard Wordingham wrote:
The problem is that a search for the character represented by the code
unit sequence (H2,L3) would also pick up the sequence (H1,H2,L3).
While there is no ambiguity, it does make searching more complicated
to code. The same issue applies to the
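The searching complication described above can be sketched as follows. The (H1,H2,L3) triple is the hypothetical extension floated in this thread, not part of UTF-16, and the function names and sample code units are made up for illustration. The sketch assumes well-formed input under that hypothetical scheme.

```python
def is_high(u: int) -> bool:
    """True if u is a high (leading) surrogate code unit."""
    return 0xD800 <= u <= 0xDBFF

def naive_find(units: list[int], pattern: list[int]) -> list[int]:
    """Plain code-unit search: every index where pattern occurs."""
    n = len(pattern)
    return [i for i in range(len(units) - n + 1) if units[i:i + n] == pattern]

def boundary_aware_find(units: list[int], pattern: list[int]) -> list[int]:
    """Drop matches that start in the middle of a hypothetical (H1,H2,L3) triple.

    A match that begins with a high surrogate and is itself preceded by a
    high surrogate is treated as the tail of a triple, not as a character.
    """
    return [
        i for i in naive_find(units, pattern)
        if not (is_high(pattern[0]) and i > 0 and is_high(units[i - 1]))
    ]

H1, H2, L3 = 0xD800, 0xD801, 0xDC00
# 'A', the triple (H1,H2,L3), 'B', then the ordinary pair (H2,L3).
text = [0x41, H1, H2, L3, 0x42, H2, L3]

print(naive_find(text, [H2, L3]))           # [2, 5]
print(boundary_aware_find(text, [H2, L3]))  # [5]
```

The extra look-behind is small, but every search routine that works on code units would have to add it, which is the cost being pointed out.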
About the research works.
I alone (with my colleagues) am researching the fact that
Sumerian is Tamil / Tamil is Sumerian
This requires quite a lot of space.
Additionally I do research on Tamil alphabet as based on scientific
definitions and it only represents the mechanical parts, i.e. only
for a character
encoding.
--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell
From: srivas sinnathurai
Sent: Saturday, August 20, 2011 3:35
To: Christoph Päper
Cc: unicode@unicode.org
Subject: Re: Code pages and Unicode
On Fri, 19 Aug 2011 17:03:41 -0700
Ken Whistler k...@sybase.com wrote:
O.k., so apparently we have awhile to go before we have to start
worrying about the Y2K or IPv4 problem for Unicode. Call me again in
the year 2851, and we'll still have 5 years left to design a new
scheme and plan for the
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
PUA is not structured
It's not supposed to be. It's a private-use area. You use it the way
you see fit.
and not officially programmable to accommodate
numerous code pages.
None of Unicode is designed around code-page switching
sisrivas at blueyonder dot co dot uk wrote:
PUA is not structured
It's not supposed to be. It's a private-use area. You use it the way
you see fit.
and not officially programmable to accommodate
numerous code pages.
None of Unicode is designed around code-page switching. It's a flat
code
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote:
Why this suggestion?
With current flat space, one code point is only allocated to one and
only one purpose.
We can run out of code space soon.
Argument over. There are not 800,000 more characters that need to be
encoded for
On August 19, 2011, at 9:40 AM, srivas sinnathurai wrote:
Why this suggestion?
With current flat space, one code point is only allocated to one and only one
purpose.
We can run out of code space soon.
There are a couple of problems here.
We currently have over 860,000 unassigned code points. Surveys
John H. Jenkins:
there would have to be a *lot* of writing systems out there we don't know
about to fill up planes 4 through 14
That’s quite possible, though, the universe is huge. The question rather is
whether we will ever know about them. It’s quite possible we won’t.
Maybe we should step back a bit:
I'm not calling for any change to existing major allocations. However,
it is about time we allocate (not in the PUA) a large number of codes to
code-page-based sub-codes so that not only can all 7000+ languages
freely use it without INTERFERENCE from Unicode and
On 19 Aug 2011, at 18:24, John H. Jenkins wrote:
We currently have over 860,000 unassigned code points. Surveys of all known
writing systems indicate that only a small fraction of these will be needed.
Indeed, although it looks likely that Han will spill out of the SIP into
plane 3, all
On 08/19/2011 01:24 PM, John H. Jenkins wrote:
In order to get the UTC and WG2 to agree to a major architectural
change such as you're suggesting, you'd have to have some very solid
evidence that it's needed—not an interesting idea, not potentially
useful, but seriously *needed*. That's how
Mark E. Shoulson mark at kli dot org wrote:
And indeed, it went the other way too, back when ISO-10646 had not 17,
but 65536 *planes* and someone provided some reasonable evidence (or
just plain reasoned arguments) that 4.3 *billion* characters was
probably overkill.
Technically, I think
20.8.2011 0:07, Doug Ewell wrote:
Of course, 2.1 billion characters is also overkill, but the advent of
UTF-16 was how we ended up with 17 planes.
And now we think that a little over a million is enough for everyone,
just as they thought in the late 1980s that 16 bits is enough for everyone.
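The sizes being compared in this exchange are easy to check. The sketch below simply multiplies the number of planes by 65,536 code points per plane; the helper name code_space is made up for illustration.

```python
PLANE_SIZE = 0x10000  # 65,536 code points per plane

def code_space(planes: int) -> int:
    """Total number of code points in the given number of planes."""
    return planes * PLANE_SIZE

for planes in (1, 17, 32_768, 65_536):
    print(f"{planes:>6,} planes -> {code_space(planes):>13,} code points")

# 17 planes is what UTF-16 can reach; 32,768 planes fits a signed 32-bit
# integer (2**31 values); 65,536 planes would need all 32 bits (2**32).
print(code_space(32_768) == 2**31, code_space(65_536) == 2**32)
```

So "a little over a million" is 1,114,112, "2.1 billion" is 2,147,483,648, and "4.3 billion" is 4,294,967,296.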
On 08/19/2011 05:07 PM, Doug Ewell wrote:
Mark E. Shoulson mark at kli dot org wrote:
And indeed, it went the other way too, back when ISO-10646 had not 17,
but 65536 *planes* and someone provided some reasonable evidence (or
just plain reasoned arguments) that 4.3 *billion* characters was
On 20 Aug 2011, at 00:35, Jukka K. Korpela wrote:
And now we think that a little over a million is enough for everyone,
just as they thought in the late 1980s that 16 bits is enough for everyone.
Whenever somebody talks about needing 31 bits for Unicode, I always think of
the hypothetical
Jukka K. Korpela jkorpela at cs dot tut dot fi wrote:
And now we think that a little over a million is enough for everyone,
just as they thought in the late 1980s that 16 bits is enough for
everyone.
I know this is an enjoyable exercise — people love to ridicule Bill
Gates for his comment in
On 8/19/2011 2:07 PM, Doug Ewell wrote:
Technically, I think 10646 was always limited to 32,768 planes so that
one could always address a code point with a 32-bit signed integer (a
nod to the Java fans).
Well, yes, but it didn't really have anything to do with Java. Remember
that Java
wasn't
On August 19, 2011, at 3:53 PM, Benjamin M Scarborough wrote:
Whenever somebody talks about needing 31 bits for Unicode, I always think of
the hypothetical situation of discovering some extraterrestrial civilization
and trying to add all of their writing systems to Unicode. I imagine there
would be
On 8/19/2011 2:53 PM, Benjamin M Scarborough wrote:
Whenever somebody talks about needing 31 bits for Unicode, I always think of
the hypothetical situation of discovering some extraterrestrial civilization
and trying to add all of their writing systems to Unicode. I imagine there
would be
On 8/19/2011 2:35 PM, Jukka K. Korpela wrote:
20.8.2011 0:07, Doug Ewell wrote:
Of course, 2.1 billion characters is also overkill, but the advent of
UTF-16 was how we ended up with 17 planes.
And now we think that a little over a million is enough for everyone,
just as they thought in the
On 8/19/2011 3:24 PM, Ken Whistler wrote:
On 8/19/2011 2:07 PM, Doug Ewell wrote:
Technically, I think 10646 was always limited to 32,768 planes so that
one could always address a code point with a 32-bit signed integer (a
nod to the Java fans).
Well, yes, but it didn't really have anything