Re: Full Unicode strings strawman

Allen Wirfs-Brock Mon, 16 May 2011 16:50:54 -0700

On May 16, 2011, at 3:36 PM, Mark Davis ☕ wrote:

> > all defined Unicode characters.
> 
> That would also not be correct. The defined characters are only about 109K 
> (more if you consider private use); nowhere near the number of code points, 
> because there are over 800K code points that are reserved for the allocation 
> of future characters. For a breakdown, see 
> http://www.unicode.org/versions/Unicode6.0.0/#Character_Additions


Sorry about the terminology issues; I will work on fixing them.

I actually think "character" is the right term for use in:

SourceCharacter ::
  any Unicode character

This is defining the alphabet of the grammar.  The alphabet is composed of 
logical characters, not specific encodings.  The actual program might be 
encoded in EBCDIC or Hollerith card codes as long as there is a mapping of the 
characters actually used in that encoding to Unicode characters.
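
As a rough illustration of that point (not part of the proposal itself), the
Python sketch below stores the same program text in ASCII and in an EBCDIC
code page and shows that both decode to the same sequence of Unicode
characters.  The helper name to_source_characters and the sample program text
are hypothetical; cp037 is simply one EBCDIC codec that ships with Python.

```python
# Hypothetical sample program text; any characters representable in both
# encodings would do.
source = "var x = 1;"

ascii_bytes = source.encode("ascii")
ebcdic_bytes = source.encode("cp037")  # cp037: an EBCDIC code page in Python's stdlib


def to_source_characters(data: bytes, encoding: str) -> str:
    """Map encoded bytes to the Unicode characters they represent."""
    return data.decode(encoding)


# The byte sequences differ, but the logical characters are identical.
bytes_differ = ascii_bytes != ebcdic_bytes
same_characters = (
    to_source_characters(ascii_bytes, "ascii")
    == to_source_characters(ebcdic_bytes, "cp037")
)
```

A grammar defined over the decoded characters never has to know which of the
two byte sequences it started from.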

The intent is that any defined Unicode character can be used.  That is the 109K 
today, growing in the future as Unicode adopts additional characters.  In 
practice, there are actually very few places in the grammar where any 
SourceCharacter is allowed, but in those places we really do mean the valid 
logical characters defined by the current Unicode standard.
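
To make the distinction between code points and defined characters concrete,
here is a minimal Python sketch.  It assumes that "defined" can be
approximated by the Unicode general category: anything that is not unassigned
(Cn), a surrogate (Cs), or private use (Co).  That reading, and the helper
names is_defined_character and count_defined, are assumptions made for
illustration only.  The result depends on the Unicode version bundled with
the Python build (see unicodedata.unidata_version), so no specific count is
claimed here.

```python
import unicodedata

# Assumption: these general categories mark code points that are NOT
# defined characters (unassigned, surrogate, private use).
NOT_DEFINED = ("Cn", "Cs", "Co")


def is_defined_character(code_point: int) -> bool:
    """Return True if the code point is an assigned, non-private-use character."""
    return unicodedata.category(chr(code_point)) not in NOT_DEFINED


def count_defined() -> int:
    """Count defined characters across the whole code point range 0..0x10FFFF."""
    return sum(is_defined_character(cp) for cp in range(0x110000))
```

Calling count_defined() against the 0x110000 total code points shows how much
of the code space is still reserved in whatever Unicode version is installed.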

Allen

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
