Here's a proposed solution, then. I hereby submit it for use on that incredibly distant day on which our oracle fails and a new 1-million-code-point script is added to Unicode (i.e., never).
When all of the planes below 16 are full and the possibility of exhausting code points becomes actually apparent (but not before), the UTC should reserve a range of code points in plane 16 to serve as "astral low surrogates" and another to serve as "astral high surrogates". UTF-16 can then use a pair of surrogate pairs to address the higher planes thereby exposed. And we won't all have to muck with our implementations to support this stuff.

Regards,

Addison

Addison P. Phillips
Director, Globalization Architecture
webMethods | Delivering Global Business Visibility
432 Lakeside Drive, Sunnyvale, CA, USA
+1 408.962.5487 (office)
+1 408.210.3569 (mobile)
mailto:[EMAIL PROTECTED]

Chair, W3C-I18N-WG, Web Services Task Force
http://www.w3.org/International/ws

Internationalization is an architecture.
It is not a feature.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rick McGowan
> Sent: Thursday, October 16, 2003 12:50 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Beyond 17 planes, was: Java char and Unicode 3.0+
>
> Before everyone goes jumping off the deep end with wanting to reserve more
> space on the BMP for hyper extended surrogates or whatever, can someone
> please come up with more than 1 million things that need to be encoded?
>
> Our best estimate, for all of human history, comes in around 250,000. Even
> if we included, as characters, lots of stuff that is easily unified with
> existing characters, or undeciphered, or just more dingbatty blorts, it
> comes up nowhere near a million.
>
> What you see on the roadmap is what we, in over 12 years of searching,
> have been able to find. I challenge anyone to come up with enough
> legitimate characters (approximately a million of them) that aren't on the
> roadmap to fill the 17 planes.
>
> Thanks.
> Rick
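
For anyone who wants to see the "pair of surrogate pairs" idea spelled out, here is a rough Python sketch. Everything about the astral ranges is an assumption made up for illustration: the proposal does not say where in plane 16 they would sit or how big they would be, and nothing like them is reserved in Unicode. The sketch supposes 1,024 code points each at hypothetical locations U+10F000 and U+10F400 (mirroring the sizes of the existing BMP surrogate ranges), and the function names are invented too. Only `utf16_pair` and `decode_pair` reflect how UTF-16 works today.

```python
# Hypothetical ranges in plane 16; NOT part of Unicode.
ASTRAL_HIGH_BASE = 0x10F000  # assumed 1,024 "astral high surrogates"
ASTRAL_LOW_BASE = 0x10F400   # assumed 1,024 "astral low surrogates"
EXT_BASE = 0x110000          # first value past today's U+10FFFF limit
EXT_COUNT = 1024 * 1024      # 16 extra planes reachable under these assumptions


def utf16_pair(cp):
    """Standard UTF-16: split a supplementary code point into two code units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000
    return [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]


def decode_pair(hi, lo):
    """Standard UTF-16: combine a high and a low surrogate code unit."""
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)


def encode_extended(cp):
    """Hypothetical: encode a value above U+10FFFF as four UTF-16 code units."""
    v = cp - EXT_BASE
    if not 0 <= v < EXT_COUNT:
        raise ValueError("outside the hypothetical extended range")
    astral_high = ASTRAL_HIGH_BASE + (v >> 10)
    astral_low = ASTRAL_LOW_BASE + (v & 0x3FF)
    # Each astral surrogate is itself an ordinary plane-16 code point,
    # so existing UTF-16 machinery encodes it as a normal surrogate pair.
    return utf16_pair(astral_high) + utf16_pair(astral_low)


def decode_extended(units):
    """Hypothetical: inverse of encode_extended for exactly four code units."""
    astral_high = decode_pair(units[0], units[1])
    astral_low = decode_pair(units[2], units[3])
    return (EXT_BASE
            + ((astral_high - ASTRAL_HIGH_BASE) << 10)
            + (astral_low - ASTRAL_LOW_BASE))
```

Under these assumptions an implementation that only understands today's UTF-16 would still see two well-formed surrogate pairs, which is the point of the proposal: nothing has to change until the day the extra planes are actually needed.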

