Fwd: Endangered Alphabets

2011-08-19 Thread Michael Everson
I'd like to invite everyone to support this worthwhile project: http://www.kickstarter.com/projects/1496420787/the-endangered-alphabets-project/ Michael Everson * http://www.evertype.com/

Re: Endangered Alphabets

2011-08-19 Thread srivas sinnathurai
It is about time we allocate a significant space within the Unicode code space to work in the old-fashioned code page provisioning mode. I'm not calling for any change to existing major allocations. However, it is about time we allocate (not PUA) a large number of codes to a code page based sub

Re: Endangered Alphabets

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 04:43 PM, srivas sinnathurai wrote: All those in favour of creating code pages, please say yes, and others please say why not. Sinnathurai, 7000 code pages are not enough. To replace Unicode, you should create at least 65536 code pages, because Unicode is represented in UTF-16

Re: Endangered Alphabets

2011-08-19 Thread Doug Ewell
In what way is this not what the PUA is all about? -- Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14 www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell From: srivas sinnathurai Sent: Friday, August 19, 2011 5:13 To: Michael Everson Cc: unicode Unicode Discussion ;

Re: Endangered Alphabets

2011-08-19 Thread srivas sinnathurai
PUA is not structured and not officially programmable to accommodate numerous code pages. Take ISO 8859-1, 2, 3, and so on. These now allocate the same code points to many languages and for other purposes. Similarly, a structured and official allocations to any many requirements

RTL PUA?

2011-08-19 Thread Petr Tomasek
Hello, I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). Thanks! P.T. -- Petr Tomasek http://www.etf.cuni.cz/~tomasek Jabber: but...@jabbim.cz

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 14:29, Petr Tomasek wrote: I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). Thanks! P.T. This is a very good question. Michael Everson * http://www.evertype.com/

Re: RTL PUA?

2011-08-19 Thread Petr Tomasek
On Fri, Aug 19, 2011 at 02:43:56PM +0100, Michael Everson wrote: On 19 Aug 2011, at 14:29, Petr Tomasek wrote: I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). Thanks! P.T. This is a

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Petr Tomasek tomasek at etf dot cuni dot cz wrote: I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). The PUA is supposed to be a free and open sandbox, without reserved or allocated zones. There was

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 07:13 PM, Michael Everson wrote: This is a very good question. It seems Michael speaks tongue-in-cheek. I personally don't see the point in allocating RTL areas in the PUA. It is after all the *P*UA. Do you expect rendering engines to support the PUA? Yeah OK maybe simply

Re: RTL PUA?

2011-08-19 Thread Werner LEMBERG
I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). This is a very good question. Probably no one had such an idea until now. Werner

Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote: PUA is not structured It's not supposed to be. It's a private-use area. You use it the way you see fit. and not officially programmable to accommodate numerous code pages. None of Unicode is designed around code-page

Re: RTL PUA?

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 09:29 AM, Petr Tomasek wrote: Hello, I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). I have long wondered about this, and I'm pretty sure the discussion has surfaced here once or twice

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 15:13, Doug Ewell wrote: The PUA is supposed to be a free and open sandbox, without reserved or allocated zones. Nevertheless, inherent directionality is something that computers take notice of. There would be no harm in having a RTL PUA area. My question would be why

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 07:43 PM, Doug Ewell wrote: My question would be why the PUA is designated as 'L' by default at all, instead of, say, 'ON'. ... do present the impression that these code points are somehow reserved for strong-LTR characters, and also for non-reordrant characters (i.e. no

Re: RTL PUA?

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 10:13 AM, Doug Ewell wrote: So your private agreement, in addition to specifying the meaning of your PUA characters and probably some sample glyphs, can also specify their properties, overriding the default properties. I don't know if you can even do this. My understanding of

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 15:24, Shriramana Sharma wrote: On 08/19/2011 07:13 PM, Michael Everson wrote: This is a very good question. It seems Michael speaks tongue-in-cheek. Not at all. I think there should be a RTL PUA. I personally don't see the point in allocating RTL areas in the PUA. It

Re: RTL PUA?

2011-08-19 Thread vanisaac
From: Michael Everson everson_at_evertype.com On 19 Aug 2011, at 14:29, Petr Tomasek wrote: I would like to ask why there are no PUA parts which would be reserved for RTL scripts (i.e. would have the directionality set to strong RTL). Thanks! P.T. This is a very good

Re: Endangered Alphabets

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 08:14 PM, William_J_G Overington wrote: I am wondering if the following idea would be of any usefulness towards solving the problem without needing any code point allocations in Unicode. Pardon me for not understanding if I entirely missed your point, but why can't these

Re: RTL PUA?

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 10:24 AM, Shriramana Sharma wrote: I also wonder what the following below http://unicode.org/reports/tr9/#Bidirectional_Character_Types means: Private-use characters can be assigned different values by a conformant implementation. Best I can guess is You can write your own

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 15:34, Shriramana Sharma wrote: On 08/19/2011 07:43 PM, Doug Ewell wrote: My question would be why the PUA is designated as 'L' by default at all, instead of, say, 'ON'. ... do present the impression that these code points are somehow reserved for strong-LTR characters,

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 08:11 PM, vanis...@boil.afraid.org wrote: why there weren't private use Variation Selectors. Because you are already free to use PUA codepoints as VSs? -- Shriramana Sharma

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote: So your private agreement, in addition to specifying the meaning of your PUA characters and probably some sample glyphs, can also specify their properties, overriding the default properties. Gods know I wouldn't have any idea how to get

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Mark E. Shoulson mark at kli dot org wrote: So your private agreement, in addition to specifying the meaning of your PUA characters and probably some sample glyphs, can also specify their properties, overriding the default properties. I don't know if you can even do this. My understanding

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 15:51, Shriramana Sharma wrote: On 08/19/2011 08:11 PM, vanis...@boil.afraid.org wrote: why there weren't private use Variation Selectors. Because you are already free to use PUA codepoints as VSs? Because the existing VSs are sufficient? Michael Everson *

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 08:34 PM, Mark E. Shoulson wrote: But they work Just Great for LTR scripts in the PUA, but not for RTL scripts. Isn't that kind of bias counter to the whole point of the PUA and Unicode in general? And it isn't only due to implementors, either: Unicode specifies LTR

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 08:36 PM, Michael Everson wrote: On 19 Aug 2011, at 15:51, Shriramana Sharma wrote: On 08/19/2011 08:11 PM, vanis...@boil.afraid.org wrote: why there weren't private use Variation Selectors. Because you are already free to use PUA codepoints as VSs? Because the existing VSs

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 16:03, Shriramana Sharma wrote: There is plenty of space. There would be no difficulty in assigning some rows to a RTL PUA. It is not a question of availability of space. It is a question of principles. Sure. People who are happy with LTR directionality have PUA code

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 15:57, Doug Ewell wrote: Most applications don't care about the PUA, or assume it's only for that particular vendor's custom Latin-script ligatures and dictionary symbols. And guess what: since my ligatures and dictionary symbols are LTR, I have no problem because the

Re: Endangered Alphabets

2011-08-19 Thread William_J_G Overington
I am wondering if the following idea would be of any usefulness towards solving the problem without needing any code point allocations in Unicode. Suppose that a concept of an Endangered Language Code Page is invented. Suppose that the letter sequence ELCP is used to designate an endangered

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 16:04, Mark E. Shoulson wrote: I didn't say that applications or rendering engines are able to accept your overridden properties and apply them, right out of the box, at least not today. But they work Just Great for LTR scripts in the PUA, but not for RTL scripts.

Re: Endangered Alphabets

2011-08-19 Thread John H. Jenkins
I think you want ISO 2022. In any event, this will never happen in Unicode, because this is the exact opposite of what Unicode is all about, unless I misunderstand you. Unicode's goal is for every code unit to have a fixed interpretation. So far as many people involved in the original
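To make the contrast concrete, here is a minimal sketch (assuming Python 3 and its standard codecs; the byte value 0xE4 and the three ISO 8859 parts are arbitrary picks for illustration, not taken from the thread). Under code pages, one byte value means a different character depending on which page is active, whereas each resulting Unicode code point has a single fixed interpretation.

```python
# Illustration only: the byte value and the charsets below are arbitrary choices.
RAW = bytes([0xE4])
CHARSETS = ["iso8859_1", "iso8859_5", "iso8859_7"]


def interpretations(raw, charsets):
    """Decode the same raw bytes under several code pages."""
    return {name: raw.decode(name) for name in charsets}


by_code_page = interpretations(RAW, CHARSETS)

for name, char in by_code_page.items():
    # The same byte yields a different Unicode code point under each code page.
    print(f"0xE4 in {name}: U+{ord(char):04X}")
```

Going the other way, each of the three decoded characters round-trips only through its own code page, which is the ambiguity a flat code space is meant to remove.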

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 16:05, Shriramana Sharma wrote: On 08/19/2011 08:03 PM, Michael Everson wrote: On 19 Aug 2011, at 15:13, Doug Ewell wrote: The PUA is supposed to be a free and open sandbox, without reserved or allocated zones. Nevertheless, inherent directionality is something that

Re: RTL PUA?

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 11:03 AM, Shriramana Sharma wrote: In effect, changing the existing BC=L to ON is no worse than changing it to R. I think making the directionality of the PUA L instead of ON was a mistake in the first place, yes, but does even the PUA fall under the commandment Thou shalt

Re: Private Use Variation Selectors (was: RTL PUA?)

2011-08-19 Thread vanisaac
From: Michael Everson everson_at_evertype.com On 19 Aug 2011, at 15:51, Shriramana Sharma wrote: On 08/19/2011 08:11 PM, vanisaac_at_boil.afraid.org wrote: why there weren't private use Variation Selectors. Because you are already free to use PUA codepoints as VSs? Because the

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread srivas sinnathurai
Doug, First of all flat code space is the primary functionality of Unicode and not calling for any changes to existing encodings. What I propose is assign about 16,000 codes to code-page switching model. Why this suggestion? With current flat space, one code point is only allocated to one and

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 16:16, Shriramana Sharma wrote: Which then again brings us back to Doug's previous point that these should be (have been) assigned some more neutral BC such as ON. That train has left the station, though. Michael Everson * http://www.evertype.com/

Re: RTL PUA?

2011-08-19 Thread Petr Tomasek
On Fri, Aug 19, 2011 at 04:22:19PM +0100, Michael Everson wrote: On 19 Aug 2011, at 16:04, Mark E. Shoulson wrote: I didn't say that applications or rendering engines are able to accept your overridden properties and apply them, right out of the box, at least not today. But they work

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 16:31, Mark E. Shoulson wrote: On 08/19/2011 11:03 AM, Shriramana Sharma wrote: In effect, changing the existing BC=L to ON is no worse than changing it to R. I think making the directionality of the PUA L instead of ON was a mistake in the first place, yes, but

Re: RTL PUA?

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 11:21 AM, Michael Everson wrote: Directionality is a very deep property. A CSUR LTR script works fine out of the box on all platforms at least as far as directionality goes. A CSUR RTL script simply can't, and do you really think that defining the properties will effectively

Re: Endangered Alphabets

2011-08-19 Thread srivas sinnathurai
We are keeping Unicode as it is and asking it to support code pages within, say, 25% of the allocations. That is entirely different from making all of Unicode code page switchable. We will have plenty of time in our hands to avoid any disasters, as we are not touching the primary purpose while

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote: It's part of the private agreement. I can't personally tell an OS or application (unless I write it) how to interpret those properties, but they are out there, and it would be theoretically possible for an OS or app to accept those

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote: The PUA *already* defines its characters as LTR. That's been done. It is *part* of the definition and functionality of the PUA. It's irrelevant whether it should be or not. It *is*. It isn't. It's just a default, though admittedly one

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote: Which then again brings us back to Doug's previous point that these should be (have been) assigned some more neutral BC such as ON. That train has left the station, though. I thought this property was mutable, and indeed that even some

Re: Private Use Variation Selectors (was: RTL PUA?)

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 09:00 PM, vanis...@boil.afraid.org wrote: Quote from 16.4: Standardized variation sequences are defined in the file StandardizedVariants.txt in the Unicode Character Database. Ideographic variation sequences are defined by the registration process defined in Unicode Technical

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 16:38, Mark E. Shoulson wrote: It's pretty disingenuous to say Well, if you want a private-use RTL script, you should be prepared to write an engine that can render it, ignoring the fact that LTR people can get by with just a font. Why should it be so much harder to

RE: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
srivas sinnathurai sisrivas at blueyonder dot co dot uk wrote: Why this suggestion? With current flat space, one code point is only allocated to one and only one purpose. We can run out of code space soon. Argument over. There are not 800,000 more characters that need to be encoded for

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 09:51 PM, Doug Ewell wrote: I thought this property was mutable, and indeed that even some assigned characters had had their Bidi_Class changed over the years. I could be wrong. Perhaps you refer to this: http://www.unicode.org/versions/corrigendum8.html ? -- Shriramana

Re: RTL PUA?

2011-08-19 Thread Mark Davis ☕
All of the property assignments to PUA characters (except the GC) are purely informative. The property assignments that are there are simply based on the likelihood of property assignment, and can be freely overridden by implementations. It is just more likely that PUA characters are bc:L than
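A minimal sketch of what "overridden by implementations" could look like, assuming Python 3's standard `unicodedata` module for the default values. The override table, the range U+E000..U+E07F, and the function name `bidi_class` are hypothetical stand-ins for a private agreement; nothing here is an API that an OS or rendering engine actually exposes.

```python
import unicodedata

# Hypothetical private agreement: treat U+E000..U+E07F as a right-to-left script.
# The range and the value "R" are assumptions made for this example only.
PRIVATE_BIDI_OVERRIDES = [
    (0xE000, 0xE07F, "R"),
]


def bidi_class(ch):
    """Return the Bidi_Class for ch, letting private overrides win over the default."""
    cp = ord(ch)
    for first, last, value in PRIVATE_BIDI_OVERRIDES:
        if first <= cp <= last:
            return value
    # Fall back to the default value published in the Unicode Character Database.
    return unicodedata.bidirectional(ch)


print(unicodedata.category("\ue000"))       # General_Category of a BMP PUA code point
print(unicodedata.bidirectional("\ue000"))  # default Bidi_Class before any override
print(bidi_class("\ue000"))                 # value under the hypothetical agreement
```

The sketch only covers the lookup. Whether a given layout engine consults such a table at all is the open question in this thread.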

RE: What are the present criteria...

2011-08-19 Thread Doug Ewell
Asmus Freytag asmusf at ix dot netcom dot com wrote: Nevertheless, N4085 is a German NB document, the criteria in question are those suggested by the German NB and not WG2 (and the document makes note of this distinction), and it is an error to portray this passage as representing either a

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 08:59 PM, Michael Everson wrote: Please write me a rendering engine that will correctly process http://www.evertype.com/standards/csur/engsvanyali.html on the Mac OS, Linux, and Windows. Thanks. Heh, it seems from a superficial look through that you could stick the required

Re: Fwd: Endangered Alphabets

2011-08-19 Thread Mark Davis ☕
+1 Mark *— Il meglio è l’inimico del bene —* On Fri, Aug 19, 2011 at 08:41, John Cowan co...@mercury.ccil.org wrote: Michael Everson scripsit: I'd like to invite everyone to support this worthwhile project: Worthwhile it may be, but surely misinformed as well. Does Mr. Brooks actually

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 17:35, Mark Davis ☕ wrote: All of the property assignments to PUA characters (except the GC) are purely informative. The property assignments that are there are simply based on the likelihood of property assignment, and can be freely overridden by implementations. How?

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 08:47 PM, Michael Everson wrote: Indic scripts have LTR directionality. They can use PUA and do whatever is needed for the *other* challenges inherent in Indic fonts. A private RTL script cannot use the PUA and have the same level of support. In OT, without help from the

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 09:54 PM, Michael Everson wrote: On 19 Aug 2011, at 16:38, Mark E. Shoulson wrote: It's pretty disingenuous to say Well, if you want a private-use RTL script, you should be prepared to write an engine that can render it, ignoring the fact that LTR people can get by with just a

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
Shriramana Sharma samjnaa at gmail dot com wrote: I thought this property was mutable, and indeed that even some assigned characters had had their Bidi_Class changed over the years. I could be wrong. Perhaps you refer to this: http://www.unicode.org/versions/corrigendum8.html ? No, I

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 10:05 PM, Mark Davis ☕ wrote: All of the property assignments to PUA characters (except the GC) are purely informative. The property assignments that are there are simply based on the likelihood of property assignment, and can be freely overridden by implementations. Glad to hear

Endangered Code Pages

2011-08-19 Thread Steven R. Loomis
I'd rather see code pages become endangered, and code-page switching an obscure footnote on the pages of history. Please, don't invent any new code page systems. Steven On 08/19/2011 08:40 AM, srivas sinnathurai wrote: Doug, First of all flat code space is the primary functionality of

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 09:08 PM, Mark E. Shoulson wrote: It's pretty disingenuous to say Well, if you want a private-use RTL script, you should be prepared to write an engine that can render it, ignoring the fact that LTR people can get by with just a font. Why should it be so much harder to write

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 09:01 PM, Mark E. Shoulson wrote: On 08/19/2011 11:03 AM, Shriramana Sharma wrote: In effect, changing the existing BC=L to ON is no worse than changing it to R. I think making the directionality of the PUA L instead of ON was a mistake in the first place, yes, but does even

RE: Endangered Alphabets

2011-08-19 Thread Doug Ewell
William_J_G Overington wjgo underscore 10009 at btinternet dot com wrote: Suppose that a concept of an Endangered Language Code Page is invented. The original Endangered Alphabets subject line was hijacked, almost immediately, into a thread about defining code pages within the Unicode

Re: RTL PUA?

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 18:01, Shriramana Sharma wrote: Even though it isn't encoded? That is, my understanding is that we *can't* change the PUA to ON now, but that there is a suggestion that some *new* hunk of PUA be created that is R, in order to balance the existing L. Is that right?

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread John H. Jenkins
srivas sinnathurai 於 2011年8月19日 上午9:40 寫道: Why this suggestion? With current flat space, one code point is only allocated to one and only one purpose. We can run out of code space soon. There are a couple of problems here. We currently have over 860,000 unassigned code points. Surveys
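The size of the code space can be checked with a little arithmetic. This is a sketch assuming Python 3; the variable names are mine, and the ranges used are the standard surrogate, noncharacter, and private-use ranges. It computes how many code points can ever hold standardized characters, not how many were unassigned on the date of the message.

```python
PLANES = 17
PER_PLANE = 0x10000  # 65,536 code points per plane

total = PLANES * PER_PLANE

# Code points that can never be assigned to standardized characters.
surrogates = 0xDFFF - 0xD800 + 1                          # U+D800..U+DFFF
noncharacters = (0xFDEF - 0xFDD0 + 1) + 2 * PLANES        # U+FDD0..U+FDEF plus U+xxFFFE/U+xxFFFF per plane
private_use = (0xF8FF - 0xE000 + 1) + 2 * (PER_PLANE - 2)  # BMP PUA plus planes 15 and 16

available = total - surrogates - noncharacters - private_use

print(f"total code points:        {total:,}")
print(f"available for characters: {available:,}")
```

Taking the message's figure of over 860,000 unassigned code points at face value, the difference from the available total is what had been assigned so far.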

Re: Code pages and Unicode

2011-08-19 Thread Christoph Päper
John H. Jenkins: there would have to be a *lot* of writing systems out there we don't know about to fill up planes 4 through 14 That’s quite possible, though, the universe is huge. The question rather is whether we will ever know about them. It’s quite possible we won’t.

RE: What are the present criteria...

2011-08-19 Thread William_J_G Overington
On Friday 19 August 2011, Doug Ewell d...@ewellic.org wrote: Sorry, in my attempt to avoid naming names I made it look as though Karl made that claim.  He did not.  William's message was the one that attempted to connect the dots between official WG2 policy and the German NB proposal.  

Re: RTL PUA?

2011-08-19 Thread John H. Jenkins
Michael Everson 於 2011年8月19日 上午11:15 寫道: On 19 Aug 2011, at 18:01, Shriramana Sharma wrote: Even though it isn't encoded? That is, my understanding is that we *can't* change the PUA to ON now, but that there is a suggestion that some *new* hunk of PUA be created that is R, in order to

RE: Code pages and Unicode

2011-08-19 Thread Doug Ewell
Maybe we should step back a bit: I'm not calling for any change to existing major allocations. However, it is about time we allocate (not PUA) a large number of codes to a code page based sub codes so that not only all 7000+ languages can Freely use it without INTERFERENCE from Unicode and

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Michael Everson
On 19 Aug 2011, at 18:24, John H. Jenkins wrote: We currently have over 860,000 unassigned code points. Surveys of all known writing systems indicate that only a small fraction of these will be needed. Indeed, although it looks likely that Han will spill out of the SIP into plane 3, all

RE: Endangered Alphabets

2011-08-19 Thread William_J_G Overington
On Friday 19 August 2011, Doug Ewell d...@ewellic.org wrote: William_J_G Overington wjgo underscore 10009 at btinternet dot com wrote: Suppose that a concept of an Endangered Language Code Page is invented. The original Endangered Alphabets subject line was hijacked, almost

RE: RTL PUA?

2011-08-19 Thread Doug Ewell
John H. Jenkins jenkins at apple dot com wrote: Put a RTL PUA zone in Plane 14, which is mostly empty, and expected to remain so, and you're done. No, you're not, because the OSs/rendering engines would have to rev, and to be honest, there won't be a lot of enthusiasm for doing something

Re: RTL PUA?

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 12:39 PM, Shriramana Sharma wrote: And I grant you your point of free font making software being available, but still proper OT/Graphite/AAT tables have to be made (to render all those contextual forms in this conscript), which calls for some expertise at least. You would then

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 01:24 PM, John H. Jenkins wrote: In order to get the UTC and WG2 to agree to a major architectural change such as you're suggesting, you'd have to have some very solid evidence that it's needed—not an interesting idea, not potentially useful, but seriously *needed*. That's how

RE: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
Mark E. Shoulson mark at kli dot org wrote: And indeed, it went the other way too, back when ISO-10646 had not 17, but 65536 *planes* and someone provided some reasonable evidence (or just plain reasoned arguments) that 4.3 *billion* characters was probably overkill. Technically, I think

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Jukka K. Korpela
20.8.2011 0:07, Doug Ewell wrote: Of course, 2.1 billion characters is also overkill, but the advent of UTF-16 was how we ended up with 17 planes. And now we think that a little over a million is enough for everyone, just as they thought in the late 1980s that 16 bits is enough for everyone.
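The link between UTF-16 and the 17-plane limit follows from how surrogate pairs are built. The sketch below assumes Python 3; the function name `to_surrogate_pair` is mine. It encodes a supplementary code point by hand and counts what two 10-bit halves can reach.

```python
def to_surrogate_pair(cp):
    """Split a supplementary code point into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


# 1,024 high surrogates x 1,024 low surrogates, plus the BMP itself.
supplementary = 1024 * 1024
reachable = 0x10000 + supplementary

print([hex(u) for u in to_surrogate_pair(0x10FFFF)])
print(reachable, reachable // 0x10000)
```

Sixteen supplementary planes plus the BMP is exactly what surrogate pairs can address, so the ceiling at U+10FFFF is a property of UTF-16 rather than an estimate of how many characters exist.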

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Mark E. Shoulson
On 08/19/2011 05:07 PM, Doug Ewell wrote: Mark E. Shoulson mark at kli dot org wrote: And indeed, it went the other way too, back when ISO-10646 had not 17, but 65536 *planes* and someone provided some reasonable evidence (or just plain reasoned arguments) that 4.3 *billion* characters was

Re: Code pages and Unicode

2011-08-19 Thread Benjamin M Scarborough
On 20 Aug 2011, at 00:35, Jukka K. Korpela wrote: And now we think that a little over a million is enough for everyone, just as they thought in the late 1980s that 16 bits is enough for everyone. Whenever somebody talks about needing 31 bits for Unicode, I always think of the hypothetical

RE: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Doug Ewell
Jukka K. Korpela jkorpela at cs dot tut dot fi wrote: And now we think that a little over a million is enough for everyone, just as they thought in the late 1980s that 16 bits is enough for everyone. I know this is an enjoyable exercise — people love to ridicule Bill Gates for his comment in

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Ken Whistler
On 8/19/2011 2:07 PM, Doug Ewell wrote: Technically, I think 10646 was always limited to 32,768 planes so that one could always address a code point with a 32-bit signed integer (a nod to the Java fans). Well, yes, but it didn't really have anything to do with Java. Remember that Java wasn't
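The plane counts mentioned in this sub-thread can be compared against a signed 32-bit integer directly. This is a sketch assuming Python 3; the helper names are mine, and the plane counts are the ones quoted in the messages (65,536, 32,768, and 17).

```python
PER_PLANE = 0x10000
INT32_MAX = 2**31 - 1


def code_space(planes):
    """Number of code points in a code space with the given number of planes."""
    return planes * PER_PLANE


def fits_signed_32(planes):
    """True if the highest code point fits in a signed 32-bit integer."""
    return code_space(planes) - 1 <= INT32_MAX


for planes in (65536, 32768, 17):
    print(f"{planes:>6} planes: {code_space(planes):>13,} code points, "
          f"fits in int32: {fits_signed_32(planes)}")
```

With 32,768 planes the highest code point is exactly the largest signed 32-bit value, which matches the 2.1 billion figure quoted in the thread; 65,536 planes gives the 4.3 billion figure and does not fit.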

Re: Code pages and Unicode

2011-08-19 Thread John H. Jenkins
Benjamin M Scarborough 於 2011年8月19日 下午3:53 寫道: Whenever somebody talks about needing 31 bits for Unicode, I always think of the hypothetical situation of discovering some extraterrestrial civilization and trying to add all of their writing systems to Unicode. I imagine there would be

Re: Code pages and Unicode

2011-08-19 Thread Ken Whistler
On 8/19/2011 2:53 PM, Benjamin M Scarborough wrote: Whenever somebody talks about needing 31 bits for Unicode, I always think of the hypothetical situation of discovering some extraterrestrial civilization and trying to add all of their writing systems to Unicode. I imagine there would be

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Asmus Freytag
On 8/19/2011 2:35 PM, Jukka K. Korpela wrote: 20.8.2011 0:07, Doug Ewell wrote: Of course, 2.1 billion characters is also overkill, but the advent of UTF-16 was how we ended up with 17 planes. And now we think that a little over a million is enough for everyone, just as they thought in the

Re: Code pages and Unicode (wasn't really: RE: Endangered Alphabets)

2011-08-19 Thread Asmus Freytag
On 8/19/2011 3:24 PM, Ken Whistler wrote: On 8/19/2011 2:07 PM, Doug Ewell wrote: Technically, I think 10646 was always limited to 32,768 planes so that one could always address a code point with a 32-bit signed integer (a nod to the Java fans). Well, yes, but it didn't really have anything

Re: RTL PUA?

2011-08-19 Thread Shriramana Sharma
On 08/19/2011 11:19 PM, John H. Jenkins wrote: Saying that does not make it possible for people to use PUA characters with RTL directionality, since all the OSes treat them as LTR. Mac OS has a mechanism to override that default assumption, the 'prop' table. Which proves my point that the