On May 11, 2020, at 13:31, Barry Scott <ba...@barrys-emacs.org> wrote:
>
> macOS and Unix version (I only use Unicode input so avoid the random bytes
> problems):
But that doesn’t avoid the problem. If someone gives you a character whose encoding on the target filesystem includes a null byte or a path-separator byte, your sanitizer will pass it as safe when it shouldn’t.

This isn’t possible on macOS, because the OS won’t let you mount any filesystem whose encoding isn’t UTF-8, but it is possible on most other *nixes, and it has been used as an attack in the past.

Is it still a realistic problem today? I don’t know. I’m pretty sure the modern versions of Shift-JIS, EUC-*, Big5, and GB can never have continuation bytes below 0x30, which would rule out both NUL (0x00) and “/” (0x2F). But even if I’m right, are these (and UTF-8, of course) the only multi-byte encodings anyone ever uses on Unix filesystems?
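To make the failure mode concrete, here is a minimal sketch of a check done on the encoded bytes rather than on the characters. The helper name `unsafe_bytes_in` and its parameters are made up for illustration; nothing here comes from the code being discussed. UTF-16 is used only because it makes the effect easy to reproduce, not because anyone mounts a Unix filesystem with it, and the Shift-JIS case shows a trail byte colliding with ASCII backslash, which is not a path separator on Unix.

```python
def unsafe_bytes_in(name, fs_encoding, forbidden=b"\x00/"):
    """Return the sorted forbidden byte values found in `name` once it is
    encoded with `fs_encoding` (hypothetical helper, for illustration)."""
    encoded = name.encode(fs_encoding)
    return sorted({b for b in encoded if b in forbidden})


# U+012F contains neither "/" nor NUL at the character level.
name = "\u012f"
assert "/" not in name and "\0" not in name

# UTF-8 never reuses ASCII byte values inside a multi-byte sequence.
print(unsafe_bytes_in(name, "utf-8"))        # []

# In UTF-16-LE the same character encodes to the bytes 0x2F 0x01.
print(unsafe_bytes_in(name, "utf-16-le"))    # [47]

# Shift-JIS encodes KATAKANA LETTER SO as 0x83 0x5C; the trail byte
# is the ASCII backslash.
print(unsafe_bytes_in("\u30bd", "shift_jis", forbidden=b"\\"))  # [92]
```

A sanitizer that only inspects `str` input would accept all three names. Whether any encoding actually in use on a Unix filesystem behaves this way for 0x00 or 0x2F is exactly the open question above.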