Reminder: open Public Review Issues

2004-01-21 Thread Rick McGowan
This is your friendly reminder that the February UTC meeting is quickly approaching. There are several public review issues open. So far, public comment has been light. I hope you have all been working diligently on your comments during the cold dark days of winter and are ready to spring the

Re: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

2004-01-21 Thread John Jenkins
On Jan 21, 2004, at 6:36 AM, Andrew C. West wrote: If a simplified form of a given CJK ideograph is used, then it deserves encoding properly. There are newly-coined simplified forms in CJK-B and CJK-C, so why not add newly used simplified forms to CJK-C or wherever if they are really needed?

Re: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

2004-01-21 Thread John Jenkins
On Jan 20, 2004, at 6:14 PM, [EMAIL PROTECTED] wrote: Except that this character is listed in CJK Extension C, on page 612. (File: IRGN9285.PDF 08/06/02) Irrelevant. Extension C isn't encoded yet. The UTC's intention is that in the future, SC forms derivable algorithmically from their encoded

Re: Mongolian Unicoding (was Re: Cuneiform Free Variation Selectors)

2004-01-21 Thread Andrew C. West
On Tue, 20 Jan 2004 16:33:24 -0500, [EMAIL PROTECTED] wrote: > > Andrew C. West scripsit: > > > These are glyph variants of Phags-pa letters that are used with semantic > > distinctiveness in a single (but very important) text, _Menggu Ziyun_ , a 14th > > century rhyming dictionary of Chinese in

Re: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

2004-01-21 Thread Andrew C. West
On Tue, 20 Jan 2004 10:32:06 -0700, John Jenkins wrote: > > 1) U+9CE6 is a traditional Chinese character (a kind of swallow) > without a SC counterpart encoded. However, applying the usual rules > for simplifications, it would be easy to derive a simplified form which > one could conceivably

Re: Unicode forms for internal storage

2004-01-21 Thread Elliotte Rusty Harold
At 10:59 PM -0800 1/20/04, Doug Ewell wrote: If you are using the "mini" version of SCSU where Latin-1 characters are stored as 1 byte each and everything else is stored as UTF-16 (using SCU and UC0 tags to switch between modes), you ought to achieve really good speed. I'll have to try this. The s
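
The "mini" SCSU scheme described in this message can be sketched roughly as follows. This is only an illustration, not Doug Ewell's or XOM's code: the function name `encode_mini_scsu` is hypothetical, the tag values (SCU = 0x0F, UC0 = 0xE0, UQU = 0xF0) are taken from the SCSU specification, and the sketch assumes the input contains no C0 controls other than tab, linefeed, and carriage return, so single-byte mode never needs quoting.

```python
# Hypothetical sketch of a "mini" SCSU encoder: Latin-1 as one byte each,
# everything else as UTF-16BE, switching modes with the SCU and UC0 tags.
SCU = 0x0F  # single-byte mode tag: switch to Unicode (UTF-16BE) mode
UC0 = 0xE0  # Unicode mode tag: switch back to single-byte mode, window 0
UQU = 0xF0  # Unicode mode tag: quote one UTF-16 code unit
PASS_THROUGH_C0 = {0x09, 0x0A, 0x0D}  # tab, linefeed, carriage return


def encode_mini_scsu(text: str) -> bytes:
    out = bytearray()
    unicode_mode = False  # SCSU starts in single-byte mode
    for ch in text:
        cp = ord(ch)
        if cp < 0x20 and cp not in PASS_THROUGH_C0:
            # Assumption: such controls were already rejected upstream.
            raise ValueError(f"unsupported C0 control U+{cp:04X}")
        if cp <= 0xFF:
            # Default dynamic window 0 starts at U+0080, so every Latin-1
            # code point maps to the byte with the same value.
            if unicode_mode:
                out.append(UC0)
                unicode_mode = False
            out.append(cp)
        else:
            if not unicode_mode:
                out.append(SCU)
                unicode_mode = True
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                # Lead bytes 0xE0..0xF2 are tags in Unicode mode, so a code
                # unit starting with one of them must be quoted with UQU.
                if 0xE0 <= units[i] <= 0xF2:
                    out.append(UQU)
                out += units[i:i + 2]
    return bytes(out)
```

With this sketch, text that is entirely Latin-1 comes out at one byte per character, and each run of other characters costs two bytes per UTF-16 code unit plus the tag bytes needed to switch modes.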

Re: Unicode forms for internal storage

2004-01-21 Thread Elliotte Rusty Harold
At 10:24 AM + 1/21/04, Jon Hanna wrote: Do you plan to support XML1.1 with XOM? The C0 controls forbidden in the 1.0 spec are allowed in the 1.1 spec if they appear as character references - so this no longer holds (unless you store them as references or otherwise escaped, which would bring i

Re: Unicode forms for internal storage

2004-01-21 Thread Jon Hanna
> In developing such a format I have a couple of advantages: > > 1. Most C0 controls are forbidden, and will not appear in the data. > That's already verified. If someone tries to pass in a C0 control > other than tab, linefeed, or carriage return to setValue, an > exception is thrown and the d
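
The check described in the quoted text (rejecting any C0 control other than tab, linefeed, or carriage return) might look roughly like this. The function name `check_no_forbidden_c0` is hypothetical and is not XOM's actual `setValue` implementation; it only illustrates the rule as stated.

```python
# Hypothetical sketch: reject C0 controls other than tab, LF, and CR,
# as the quoted message says setValue does.
ALLOWED_C0 = {0x09, 0x0A, 0x0D}  # tab, linefeed, carriage return


def check_no_forbidden_c0(text: str) -> str:
    for index, ch in enumerate(text):
        cp = ord(ch)
        if cp < 0x20 and cp not in ALLOWED_C0:
            raise ValueError(
                f"forbidden C0 control U+{cp:04X} at index {index}"
            )
    return text
```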

Re: Unicode forms for internal storage

2004-01-21 Thread Doug Ewell
I just wrote: > Oooh. That could potentially be a problem with SCSU... Never mind. I completely misunderstood what Elliotte had written. SCSU is, in fact, ideal for his needs. -Doug

Re: Unicode forms for internal storage

2004-01-21 Thread Doug Ewell
Elliotte Rusty Harold wrote: > In developing such a format I have a couple of advantages: > > 1. Most C0 controls are forbidden, and will not appear in the data. > That's already verified. If someone tries to pass in a C0 control > other than tab, linefeed, or carriage return to setValue, an > ex

Re: Unicode forms for internal storage

2004-01-21 Thread Doug Ewell
Elliotte Rusty Harold wrote: >> BZZZT! Sorry, thanks for playing. You can't get the >> advantages of both with no drawbacks. Given the octets 0x5B5B, how >> would you know if you had "[[" or a Chinese character? > > Actually, it looks like SCSU may do exactly that. If I'm > understand
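
The ambiguity raised in the quoted objection can be seen directly: the same two octets decode to different text depending on which encoding the reader assumes. The variable names below are illustrative only.

```python
# The octets 0x5B 0x5B read as ASCII versus as one big-endian UTF-16 code unit.
octets = bytes([0x5B, 0x5B])

as_ascii = octets.decode("ascii")      # two left square brackets
as_utf16 = octets.decode("utf-16-be")  # the single code point U+5B5B

print(repr(as_ascii), f"U+{ord(as_utf16):04X}")
```

U+5B5B falls in the CJK Unified Ideographs block (U+4E00..U+9FFF), so without some out-of-band or in-band signal of the current mode, a decoder cannot tell the two readings apart.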