Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

Jeff Allen Thu, 11 Sep 2014 23:56:29 -0700


On 12/09/2014 04:28, Stephen J. Turnbull wrote:

Jeff Allen writes:


  > A welcome article. One correction should be made, I believe: the area of
  > code point space used for the smuggling of bytes under PEP-383 is not a
  > "Unicode Private Use Area", but a portion of the trailing surrogate
  > range.

Nice catch.  Note that the surrogate range was originally part of the
Private Use Area, but it was carved out with the adoption of UTF-16 in
about 1993.  In practice, I doubt that there are any current
implementations claiming compatibility with Unicode 1.0 (IIRC, UTF-16
was made mandatory in Unicode 1.1).

That's a helpful bit of history that explains the uncharacteristic inaccuracy. Most I can do to keep the current position clear in my head.
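
For anyone who wants to check the current position at the interpreter, here is a minimal sketch using the "surrogateescape" error handler that PEP 383 introduced. The sample bytes and variable names are my own, chosen only for illustration.

```python
# A Latin-1 encoded string that is not valid UTF-8 (0xE9 is an incomplete sequence).
raw = b"caf\xe9"

# The surrogateescape handler maps each undecodable byte 0xNN to U+DC00 + 0xNN.
text = raw.decode("utf-8", errors="surrogateescape")
smuggled = text[-1]

# The smuggled code point lies in the trailing (low) surrogate range,
# not in the BMP Private Use Area U+E000..U+F8FF.
in_trailing_surrogates = 0xDC00 <= ord(smuggled) <= 0xDFFF
in_bmp_private_use = 0xE000 <= ord(smuggled) <= 0xF8FF

# Encoding with the same handler restores the original bytes.
round_trip = text.encode("utf-8", errors="surrogateescape")

print(hex(ord(smuggled)), in_trailing_surrogates, in_bmp_private_use, round_trip == raw)
```

The escaped character only survives a round trip when the same error handler is used on the way out; a plain encode to UTF-8 raises UnicodeEncodeError on the lone surrogate.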

I've always thought that the "right" way to handle the private use
area for "platforms" like Python and Emacs, which may need to use it
for their own purposes (such as "undecodable bytes") but want to
respect its use by applications, is to create an auxiliary table
mapping the private use area to objects describing the characters
represented by the private use code points.  These objects would have
attributes such as external representation for text I/O, glyph (for
GUI display), repr (for TTY display), various Unicode properties, etc.
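
One way to picture the auxiliary table described above is the rough sketch below. Everything in it (PrivateChar, private_table, register, describe) is a hypothetical name invented for illustration, not an existing Python or Emacs API, and the attribute set is only a guess at what such objects might carry.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# BMP Private Use Area bounds.
PUA_START, PUA_END = 0xE000, 0xF8FF


@dataclass(frozen=True)
class PrivateChar:
    """Hypothetical description of one private-use code point."""
    external: bytes               # external representation for text I/O
    repr_text: str                # how to show it on a TTY
    glyph: Optional[str] = None   # e.g. a font or image reference for GUI display
    category: str = "Co"          # Unicode general category for private use


# The auxiliary table: code point -> description object.
private_table: Dict[int, PrivateChar] = {}


def register(code_point: int, info: PrivateChar) -> None:
    """Claim a private-use code point, refusing conflicting claims."""
    if not PUA_START <= code_point <= PUA_END:
        raise ValueError("not a BMP private-use code point")
    if code_point in private_table:
        raise ValueError("code point already claimed")
    private_table[code_point] = info


def describe(ch: str) -> Optional[PrivateChar]:
    """Look up the description of a single character, if it has one."""
    return private_table.get(ord(ch))


# Example: the platform claims U+E000 to stand for the undecodable byte 0xFF.
register(0xE000, PrivateChar(external=b"\xff", repr_text="<byte FF>"))
```

Refusing a second registration of the same code point is the part that addresses conflicts between the platform and an application; how the two would negotiate in practice is left open here.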

Simply having a block "for private use" seems to create an unmanaged space for conflict, reminiscent of the "other 128 characters" in bilingual programming. I wondered if the way to respect use by applications might be to make it private to a particular sub-class of str, idly however.


Jeff Allen

