Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Martin v. Löwis Sat, 25 Apr 2009 05:22:54 -0700

> The problem with this, and other preceding schemes that have been
> discussed here, is that there is no means of ascertaining whether a
> particular file name str was obtained from a str API, or was funny-
> decoded from a bytes API... and thus, there is no means of reliably
> ascertaining whether a particular filename str should be passed to a
> str API, or funny-encoded back to bytes.


Why is it necessary that you are able to make this distinction?
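
One way to read this question is that a single round-trip rule may be enough, regardless of where the str came from. The following is a minimal sketch, assuming the `surrogateescape` error handler that PEP 383 eventually specified (the quoted text instead refers to U+F01xx code points, so this is not the scheme under discussion here). The file names are made up for illustration.

```python
# Hypothetical file name as raw bytes: Latin-1 encoded, so not valid UTF-8.
raw = b"caf\xe9.txt"

# "Funny-decode": the undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
funny = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = funny.encode("utf-8", "surrogateescape")

# A str that came from a str API goes through the very same call
# and simply yields its normal UTF-8 encoding.
ordinary = "caf\u00e9.txt"
ordinary_bytes = ordinary.encode("utf-8", "surrogateescape")

print(restored == raw)
print(ordinary_bytes == ordinary.encode("utf-8"))
```

In this sketch the caller applies the same encode step to both strings and never needs to know which API produced them.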

> Picking a character (I don't find U+F01xx in the
> Unicode standard, so I don't know what it is)

U+F01xx lies in a private use area, so it will never carry an official
character assignment.
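
The private-use status can be checked from Python itself. This is a small sketch using the standard `unicodedata` module; U+F0100 is chosen here only as one sample code point from the U+F01xx range.

```python
import unicodedata

# One sample code point from the U+F01xx range (picked for illustration).
sample = chr(0xF0100)

# General category "Co" stands for "Other, private use".
category = unicodedata.category(sample)

# Private-use code points have no character name, so the default is returned.
name = unicodedata.name(sample, "<no name>")

print(category, name)
```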

> As I realized in the email-sig, in talking about decoding corrupted
> headers, there is only one way to guarantee this... to encode _all_
> character sequences, from _all_ interfaces.  Basically it requires
> reserving an escape character (I'll use ? in these examples -- yes, an
> ASCII question mark -- happens to be illegal in Windows filenames so
> all the better on that platform, but the specific character doesn't
> matter... avoiding / \ and . is probably good, though).

I think you'll have to write an alternative PEP if you want to see
something like this implemented throughout Python.

Regards,
Martin
