> On 11 May 2020, at 23:24, Andrew Barnert <abarn...@yahoo.com> wrote:
>
> On May 11, 2020, at 13:31, Barry Scott <ba...@barrys-emacs.org> wrote:
>>
>> macOS and Unix version (I only use Unicode input so avoid the random bytes
>> problems):
>
> But that doesn’t avoid the problem. If someone gives you a character whose
> encoding on the target filesystem includes a null or pathsep byte, your
> sanitizer will pass it as safe, when it shouldn’t.

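The claim can be probed by brute force for any particular codec. What follows is only a rough sketch: it assumes Python's bundled codec tables match what the target filesystem actually uses (vendor variants may differ), it scans only the BMP by default, and the names `unsafe_chars` and `UNSAFE_BYTES` are made up for illustration.

```python
# Bytes that must never appear inside a single POSIX filename component:
# NUL (0x00) and the path separator '/' (0x2F).
UNSAFE_BYTES = frozenset(b"\x00/")


def unsafe_chars(encoding, code_points=range(0x10000)):
    """Return the characters, other than NUL and '/', whose encoded form
    in `encoding` contains a NUL or '/' byte."""
    found = []
    for cp in code_points:
        ch = chr(cp)
        if ch in "\x00/":
            continue  # these two are rejected by any sanitizer anyway
        try:
            data = ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # not representable in this codec (or a lone surrogate)
        if UNSAFE_BYTES.intersection(data):
            found.append(ch)
    return found


if __name__ == "__main__":
    # Codec names as spelled in Python's standard encodings list.
    for codec in ("utf-8", "shift_jis", "euc_jp", "big5", "gb18030"):
        print(codec, len(unsafe_chars(codec)))
```

A count of zero only says that Python's version of that codec is clean over the range scanned; it says nothing about other codecs or about byte strings that did not come from Unicode input.
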
Do you have an example of an encoding that produces a NUL or pathsep byte?
I'm not aware of any.

>
> This isn’t possible on macOS because the OS won’t let you mount any
> filesystem whose encoding isn’t UTF-8, but it is possible on most other
> *nixes, and it has been used as an attack in the past.

Indeed, mounting an NTFS filesystem on Linux now requires using the NTFS
rules to validate the filename.

>
> Is it still a realistic problem today? I don’t know. I’m pretty sure the
> modern versions of Shift-JIS, EUC-*, Big5, and GB can never have continuation
> bytes below 0x30, but even if I’m right, are these (and UTF-8, of course) the
> only multi-byte encodings anyone ever uses on Unix filesystems?

I suspect that legacy encodings are used in organisations with old data, but I
do not have direct experience of this.

Barry
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PT273EU4CS7GKGIR5NNEVM6V5XI5M45F/