Re: [Python-3000] Draft PEP for New IO system

Jim Jewett Tue, 27 Feb 2007 10:41:52 -0800

On 2/27/07, Adam Olsen <[EMAIL PROTECTED]> wrote:
> On 2/26/07, Mike Verdone <[EMAIL PROTECTED]> wrote:
> > Text I/O
> > ... operate on a per-character basis instead of a per-byte basis.


> "per-character" needs some clarification.  I'm guessing this will only
> return entire code points, but the unicode type will expose them as
> code units, so it could be seen as both per-code-point and
> per-code-unit.

Does this just mean that you assume
(1) UTF-32,
(2) that surrogate pairs will show up as two characters, and
(3) that diacritics may (or may not) show up separately from their base characters?
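
To make the three cases concrete, here is a minimal sketch. It assumes a present-day Python 3 (3.3 or later), where str is indexed by code point; the sample characters are only illustrations and are not from the draft PEP.

```python
import unicodedata

# One code point outside the Basic Multilingual Plane (MUSICAL SYMBOL G CLEF).
clef = "\U0001D11E"

code_points = len(clef)                          # length as seen by str
utf16_units = len(clef.encode("utf-16-le")) // 2  # 16-bit code units (a surrogate pair)
utf32_units = len(clef.encode("utf-32-le")) // 4  # 32-bit code units

# A base letter followed by a combining diacritic (COMBINING ACUTE ACCENT).
decomposed = "e\u0301"
# NFC normalization folds the pair into the single precomposed code point.
composed = unicodedata.normalize("NFC", decomposed)

print(code_points, utf16_units, utf32_units)
print(len(decomposed), len(composed))
```

So the same text counts as one "character" per code point, two per UTF-16 code unit, and the accented letter counts as one or two depending on normalization.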

This does suggest that error handling should be specified (or at
least explicitly left unspecified).  If the underlying input byte stream
contains an invalid sequence, will the TextIO raise a
UnicodeDecodeError?  Or will its behavior (raise an error, replace, or
delete the bad bytes) be settable?
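
As a sketch of what "settable" could look like: the io.TextIOWrapper class in present-day Python 3 (which postdates this message) takes an errors argument. The helper names read_text and try_strict and the sample bytes are hypothetical, chosen only for illustration.

```python
import io

# Hypothetical input: ASCII text around one byte (0xFF) that is never valid in UTF-8.
RAW = b"abc\xffdef"

def read_text(raw, errors):
    """Decode raw bytes through a text layer using the given error policy."""
    text_layer = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", errors=errors)
    return text_layer.read()

def try_strict(raw):
    """Return the decoded text, or the exception name if strict decoding fails."""
    try:
        return read_text(raw, "strict")
    except UnicodeDecodeError:
        return "UnicodeDecodeError"

strict_result = try_strict(RAW)       # strict: the bad byte raises
replaced = read_text(RAW, "replace")  # replace: bad byte becomes U+FFFD
ignored = read_text(RAW, "ignore")    # ignore: bad byte is dropped

print(strict_result, repr(replaced), repr(ignored))
```

The three policies correspond roughly to the error, replace, and delete options asked about above.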

Does the Text class promise to catch things like an invalid
combination of surrogates?
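
For comparison, a small sketch of how the codecs in present-day Python 3 treat unpaired surrogates. This shows the behaviour of the standard decoders, not anything promised by the draft PEP; decodes_cleanly is a hypothetical helper.

```python
def decodes_cleanly(raw, encoding):
    """Return True if raw decodes without error under strict error handling."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# A high surrogate (U+D800) followed by "A" instead of a low surrogate, in UTF-16-LE.
lone_high_utf16 = b"\x00\xd8" + "A".encode("utf-16-le")
# A properly paired surrogate sequence for U+1D11E, in UTF-16-LE.
valid_pair_utf16 = "\U0001D11E".encode("utf-16-le")
# U+D800 written out as three UTF-8 bytes, which strict UTF-8 does not allow.
surrogate_in_utf8 = b"\xed\xa0\x80"

lone_ok = decodes_cleanly(lone_high_utf16, "utf-16-le")
pair_ok = decodes_cleanly(valid_pair_utf16, "utf-16-le")
utf8_ok = decodes_cleanly(surrogate_in_utf8, "utf-8")

# The "surrogatepass" error handler explicitly lets the lone surrogate through.
passed = surrogate_in_utf8.decode("utf-8", "surrogatepass")

print(lone_ok, pair_ok, utf8_ok, len(passed))
```

Here the strict decoders reject the invalid combinations, and letting them through is an explicit opt-in.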

-jJ