Re: Why exception from os.path.exists()?

Chris Angelico Thu, 07 Jun 2018 10:55:57 -0700

On Fri, Jun 8, 2018 at 3:10 AM, MRAB <[email protected]> wrote:
> On 2018-06-07 08:45, Chris Angelico wrote:
>> Under Linux, a file name contains bytes, most commonly representing
>> UTF-8 sequences. So... an ASCIIZ string *can* contain that character,
>> or at least a representation of it. Yet it cannot contain "\0".
>>
> I've seen a variation of UTF-8 that encodes U+0000 as 2 bytes so that a zero
> byte can be used as a terminator.
>
> It's therefore not impossible to have a version of Linux that allowed a
> (Unicode) "\0" in a filename.


Considering that Linux treats filenames as raw bytes, that's not
surprising. The mangled encoding you refer to is a horrendous cheat,
though, and violates several of the design principles of UTF-8, so I
do not recommend it EVER. The correct way for Python to handle and
represent such a file name would be to use the U+DCxx range to carry
the bytes through unchanged - not using "\0".
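
A minimal sketch of what that looks like, assuming the overlong two-byte
form 0xC0 0x80 is the variation meant above (the sample name is made up
for illustration). The "surrogateescape" error handler maps each
undecodable byte 0xNN to U+DCNN, and maps it back on encoding:

```python
# Hypothetical raw filename: "a", the overlong two-byte form of U+0000
# (0xC0 0x80), then "b". Linux would store these bytes as-is.
raw = b"a\xc0\x80b"

# Strict UTF-8 rejects the overlong form.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape carries each bad byte through as a lone surrogate
# in the U+DC80..U+DCFF range instead of inventing a "\0".
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = name.encode("utf-8", errors="surrogateescape")

print(strict_ok)          # False
print(ascii(name))        # 'a\udcc0\udc80b'
print("\0" in name)       # False
print(roundtrip == raw)   # True
```

On POSIX systems, os.fsdecode() and os.fsencode() apply this same handler
with the filesystem encoding, so names obtained from os.listdir() can be
passed back to the OS unchanged.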

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
