On Wed, Mar 11, 2020 at 7:19 AM Christopher Barker <python...@gmail.com> wrote:
>
> Getting a bit OT, but I *think* this is the story:
>
> I've heard it argued, by folks that want to write Python software that uses 
> bytes for filenames, that:
>
> A file path on a *nix system can be any string of bytes, except two special 
> values:
>
> b'\x00'   : null
> b'\x2f'    : slash
>
> (consistent with this SO post, among many other sources: 
> https://unix.stackexchange.com/questions/39175/understanding-unix-file-name-encoding)
>
> So any encoding will work, as long as those two values mean the right thing. 
> Practically, null is always null, so that leaves the slash.
>
> So any encoding that uses b'\x2f' for the slash would work. Which seems to 
> include, for instance, UTF-16:
>
> In [31]: "/".encode('utf-16')
> Out[31]: b'\xff\xfe/\x00'

Nope: that output contains b'\x00', one of the two forbidden values
listed above :)
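
A quick sketch to illustrate the point, in case it helps. The helper
names (contains_nul, try_open) are made up for this example, and it
uses 'utf-16-le' rather than 'utf-16' so the result does not depend on
the machine's byte order:

```python
def contains_nul(text, encoding):
    """Return True if encoding `text` produces a b'\\x00' byte."""
    return b"\x00" in text.encode(encoding)


def try_open(path_bytes):
    """Try to open a bytes path and report what happened."""
    try:
        with open(path_bytes, "rb"):
            return "opened"
    except ValueError as exc:
        # Python rejects paths with an embedded null before the OS sees them.
        return f"rejected: {exc}"
    except OSError as exc:
        # For example, the file simply does not exist.
        return f"os error: {exc}"


if __name__ == "__main__":
    print("/".encode("utf-16-le"))         # b'/\x00'
    print(contains_nul("/", "utf-16-le"))  # True
    print(contains_nul("/", "utf-8"))      # False
    print(try_open("a".encode("utf-16-le")))
```

So the slash byte is there, but every ASCII character drags a null
along with it, and the path never gets as far as the filesystem.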

> In practice, maybe knowing that it's ASCII compatible in the first 128 byte 
> values will get pretty far...

That's exactly what "ASCII compatible" means. Since ASCII is a
seven-bit encoding, an encoding is ASCII-compatible if (a) every ASCII
character is represented by the single byte with the corresponding
value, and (b) every byte with a value below 128 represents the
corresponding ASCII character.
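
For what it's worth, here is a rough way to probe an encoding against
those two conditions. It is only a sketch: is_ascii_compatible is a
hypothetical helper, and condition (b) is approximated by checking that
no non-ASCII character in the Basic Multilingual Plane encodes to any
byte below 128, which is an assumption about how to test it rather than
a proof:

```python
def is_ascii_compatible(encoding):
    """Heuristic check of conditions (a) and (b) for a Python codec."""
    # (a) Every ASCII character must encode to the one byte of equal value.
    for code in range(128):
        try:
            if chr(code).encode(encoding) != bytes([code]):
                return False
        except UnicodeEncodeError:
            return False

    # (b) Approximation: bytes below 128 must never show up in the
    # encoding of a non-ASCII character (sampled over the BMP only).
    for code in range(128, 0x10000):
        try:
            encoded = chr(code).encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable in this encoding; nothing to check
        if any(byte < 0x80 for byte in encoded):
            return False
    return True


if __name__ == "__main__":
    for name in ("ascii", "utf-8", "latin-1", "utf-16-le", "cp037"):
        print(name, is_ascii_compatible(name))
```

UTF-8 and Latin-1 pass, while UTF-16 and EBCDIC (cp037) fail at
condition (a) already.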

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CJ5W3MHSU2YO4U5VHF2T4K2EYYMB32KU/
Code of Conduct: http://python.org/psf/codeofconduct/
