> The problem with this, and other preceding schemes that have been
> discussed here, is that there is no means of ascertaining whether a
> particular file name str was obtained from a str API, or was
> funny-decoded from a bytes API... and thus, there is no means of
> reliably ascertaining whether a particular filename str should be
> passed to a str API, or funny-encoded back to bytes.

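To make the quoted concern concrete, here is a minimal sketch. It assumes that an undecodable byte 0xXX is mapped to the private-use code point U+F01XX; that exact mapping, the handler name pua_escape_demo, and the helper funny_decode are illustrative assumptions, not anything taken from the proposal text. It shows that a str produced by such a decoding is equal to a str that already contained the same private-use character.

```python
import codecs
import unicodedata

# Assumption for illustration only: byte 0xXX is mapped to U+F01XX.
PUA_BASE = 0xF0100


def pua_escape(exc):
    """Error handler: replace each undecodable byte with a PUA character."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


# Hypothetical handler name, registered only for this demonstration.
codecs.register_error("pua_escape_demo", pua_escape)


def funny_decode(raw, encoding="utf-8"):
    """Decode a byte file name, escaping undecodable bytes into the PUA."""
    return raw.decode(encoding, errors="pua_escape_demo")


# A name that came from a bytes API and contains a non-UTF-8 byte.
from_bytes_api = funny_decode(b"caf\xe9.txt")

# A name that came from a str API and happens to contain the same
# private-use character.
from_str_api = "caf\U000F01E9.txt"

# The two strings compare equal, so the origin cannot be recovered.
print(from_bytes_api == from_str_api)

# General category of the escaped character, as reported by unicodedata.
print(unicodedata.category(from_str_api[3]))
```

Once the two values compare equal, nothing in the str itself records which API it came from.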
Why is it necessary that you are able to make this distinction?

> Picking a character (I don't find U+F01xx in the
> Unicode standard, so I don't know what it is)

It's a private use area. It will never carry an official character
assignment.

> As I realized in the email-sig, in talking about decoding corrupted
> headers, there is only one way to guarantee this... to encode _all_
> character sequences, from _all_ interfaces. Basically it requires
> reserving an escape character (I'll use ? in these examples -- yes, an
> ASCII question mark -- happens to be illegal in Windows filenames so
> all the better on that platform, but the specific character doesn't
> matter... avoiding / \ and . is probably good, though).

I think you'll have to write an alternative PEP if you want to see
something like this implemented throughout Python.

Regards,
Martin