FWIW, here are some of the CWE codes for related vulnerabilities/weaknesses in implementations:
CWE-73: External Control of File Name or Path
https://cwe.mitre.org/data/definitions/73.html

CWE-707: Improper Neutralization
https://cwe.mitre.org/data/definitions/707.html

CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')
https://cwe.mitre.org/data/definitions/22.html

Because this behavior of os.path.join is documented, it's not a vuln in Python; it's a vuln in every downstream component that (1) uses os.path.join with user-supplied input and (2) doesn't strip a leading '/' from path parts before joining them with os.path.join.

https://docs.python.org/3/library/os.path.html#os.path.join
> [...] If a component is an absolute path, all previous components are thrown away and joining continues from the absolute path component.

[quoting from "part 2"]

What does sanitizepart do with a leading slash?

    assert os.path.join("a", "/b") == "/b"

A new safejoin() or joinsafe() or join(safe=True) could call sanitizepart() such that:

    assert joinsafe("a\n", "/b") == "a\\n/b"

On Sun, May 10, 2020 at 5:36 AM Steve Jorgensen <ste...@stevej.name> wrote:

> Steve Jorgensen wrote:
> > Steve Jorgensen wrote:
> > > Andrew Barnert wrote:
> > > On May 9, 2020, at 17:35, Steve Jorgensen ste...@stevej.name wrote:
> > > I believe the Python standard library should include a means of sanitizing a filesystem entry, and this should not be something requiring a 3rd party package.
> > >
> > > One of the reasons I think this should be in the standard lib is because that provides a common, simple means for code reviewers and static analysis services such as Veracode to recognize that a value is sanitized in an accepted manner.
> > >
> > > This does seem like a good idea. People who do this themselves get it wrong all the time, occasionally with disastrous consequences, so if Python can solve that, that would be great.
> > > But, at least historically, this has been more complicated than what you’re suggesting here. For example, don’t you have to catch things like directories named “Con” or files whose 8.3 representation has “CON” as the 8 part? I don’t think you can hang an entire Windows system by abusing those anymore, but you can still produce filenames that some APIs, and some tools (possibly including Explorer, cmd, powershell, Cygwin, mingw/native shells, Python itself…) can’t access (or can only access if the user manually specified a .\ absolute path, or whatever).
> > >
> > > Yes. I am aware of some of the unsafe names in DOS and older Windows. As I mentioned in my other reply, there is a distinction between the ones that are merely invalid and those that are actually unsafe. In researching existing Linux tools just now, I was reminded that a leading dash is frequently unsafe because many tools will treat an argument starting with dash as an option argument.
> > >
> > > Is there an established algorithm/rule that lots of people in the industry trust that Python can just reference, instead of having to research or invent it? Because otherwise, we run the risk of making things worse instead of better.
> > >
> > > An excellent point! I just started digging into that and found references to detox and Glindra. Neither of those seems to be well maintained though. The documentation pages for Glindra no longer exist and detox is not in standard package repositories for CentOS later than 6 (and only in EPEL for that). Still digging.
> >
> > Extremely apropos to the question of what characters might be problematic and/or unsafe:
> > https://dwheeler.com/essays/fixing-unix-linux-filenames.html
>
> That article links to another by the same author that is specific to vulnerabilities caused by file names.
> https://dwheeler.com/secure-programs/Secure-Programs-HOWTO/file-names.html
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/FDZOXS2BNZHJ4XAG7WU7BO3AA7KF6WWK/
> Code of Conduct: http://python.org/psf/codeofconduct/
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/YNBV7RDBWHGN6XHHDCYEA7XPOKP33LSU/
Code of Conduct: http://python.org/psf/codeofconduct/
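
For illustration only, here is one way the joinsafe()/sanitizepart() idea from the message above might look. Neither function exists in the standard library; the names come from the thread and the behavior here is an assumption chosen so that the two assertions in the message hold. The helper _escape_control() and the rejection of ".." segments are additions of this sketch, not something the thread specifies. posixpath is used instead of os.path so that the result is the same on every platform.

```python
import posixpath


def _escape_control(text):
    """Hypothetical helper: render control characters as visible escapes."""
    return "".join(
        ch.encode("unicode_escape").decode("ascii")
        if ord(ch) < 32 or ord(ch) == 127
        else ch
        for ch in text
    )


def sanitizepart(part):
    """Hypothetical: make one untrusted path part safe to append (POSIX rules)."""
    # Strip leading slashes so the part can no longer discard what came before.
    cleaned = _escape_control(part).lstrip("/")
    # Assumption of this sketch: refuse parent-directory segments (cf. CWE-22).
    if ".." in cleaned.split("/"):
        raise ValueError(f"parent-directory segment in {part!r}")
    return cleaned


def joinsafe(first, *parts):
    """Hypothetical: join parts so that later parts always stay below `first`."""
    # The first part is treated as the trusted base: escaped, but not stripped.
    return posixpath.join(
        _escape_control(first), *(sanitizepart(p) for p in parts)
    )


# The documented behavior of a plain join: the absolute part wins.
assert posixpath.join("a", "/b") == "/b"
# The behavior asked for in the message.
assert joinsafe("a\n", "/b") == "a\\n/b"
```

This only covers the leading-slash and control-character cases discussed here. It does not deal with symlinks, reserved Windows names, or leading dashes, which the quoted messages raise as separate concerns.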