On May 11, 2020, at 13:31, Barry Scott <ba...@barrys-emacs.org> wrote:
> 
> macOS and Unix version (I only use Unicode input so avoid the random bytes 
> problems):

But that doesn’t avoid the problem. If someone gives you a character whose 
encoding on the target filesystem includes a null byte (0x00) or a path 
separator byte (0x2F, '/'), your sanitizer will pass it as safe when it 
shouldn’t.
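
A minimal sketch of the point above: check the encoded bytes rather than the 
characters. The function name is_safe_component is hypothetical, and 
UTF-16-LE is used only as a stand-in to show the effect (it is not something 
you would expect as a Unix filesystem encoding).

```python
import sys

# Bytes a POSIX kernel treats specially inside a file name: NUL and '/'.
FORBIDDEN = b"\x00/"


def is_safe_component(name: str, fs_encoding: str = sys.getfilesystemencoding()) -> bool:
    """Return True if `name`, once encoded, contains no NUL or '/' byte.

    Hypothetical helper for illustration; it checks only these two bytes
    and says nothing about other properties of a file name.
    """
    try:
        encoded = name.encode(fs_encoding)
    except UnicodeEncodeError:
        # A name that cannot be encoded cannot be stored as written.
        return False
    # Iterating over bytes yields ints; `int in bytes` tests byte membership.
    return not any(byte in FORBIDDEN for byte in encoded)


# U+012F contains no '/' character, so a character-level check passes it.
print(is_safe_component("\u012f", "utf-8"))      # True: encodes to C4 AF
print(is_safe_component("\u012f", "utf-16-le"))  # False: encodes to 2F 01
```

A character-level filter sees nothing wrong with U+012F in either case; only 
the byte-level check notices the 0x2F in the second encoding.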

This isn’t possible on macOS because the OS won’t let you mount any filesystem 
whose encoding isn’t UTF-8, but it is possible on most other *nixes, and it has 
been used as an attack in the past.

Is it still a realistic problem today? I don’t know. I’m pretty sure the modern 
versions of Shift-JIS, EUC-*, Big5, and GB can never have continuation bytes 
below 0x30 (so never a null, 0x00, or a slash, 0x2F), but even if I’m right, 
are these (and UTF-8, of course) the only multi-byte encodings anyone ever 
uses on Unix filesystems?
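
One rough way to test that belief is to scan what a codec actually emits. 
The sketch below assumes Python’s bundled codecs are a fair proxy for the 
encodings as deployed, which may not hold for every vendor variant; the name 
lowest_trail_byte is hypothetical.

```python
def lowest_trail_byte(encoding, code_points=range(0x110000)):
    """Return the smallest byte found after the first byte of any
    multi-byte sequence the codec produces, or None if there is none.

    Hypothetical helper: it encodes one code point at a time and skips
    anything the codec cannot encode.
    """
    lowest = None
    for cp in code_points:
        try:
            encoded = chr(cp).encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable in this encoding (or a surrogate)
        if len(encoded) > 1:
            candidate = min(encoded[1:])
            if lowest is None or candidate < lowest:
                lowest = candidate
    return lowest


if __name__ == "__main__":
    # A result of 0x30 or above means no trail byte can be NUL or '/'.
    for codec in ("shift_jis", "euc_jp", "big5", "gb18030"):
        result = lowest_trail_byte(codec)
        print(codec, None if result is None else hex(result))
```

This only covers the codecs you think to list, so it cannot answer the second 
half of the question: whether some other multi-byte encoding is in use.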