Steve Jorgensen wrote:
> I believe the Python standard library should include a means of
> sanitizing a filesystem entry, and this should not be something
> requiring a 3rd party package.
>
> One of the reasons I think this should be in the standard lib is
> because that provides a common, simple means for code reviewers and
> static analysis services such as Veracode to recognize that a value is
> sanitized in an accepted manner.
>
> What I am envisioning is a function (presumably in os.path) with a
> signature roughly like
> {{{
> sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
> }}}
>
> When permissive is False, characters that are generally unsafe are
> rejected. When permissive is True, only path separator characters are
> rejected. Generally unsafe characters besides path separators would
> include things like a leading ".", any non-printing character, any
> wildcard, piping and redirection characters, etc.
>
> The mode argument indicates what to do with unacceptable characters:
> escape them (ESCAPE), omit them (OMIT) or raise an exception (RAISE).
> This could also double as an escape character argument when a string
> is given. The default escape character should probably be "%" (same as
> URL encoding).
>
> The system argument accepts a combination of bit flags indicating
> which operating systems' rules to apply, or None meaning to use the
> rules for the current platform. Systems would probably include
> SYS_POSIX, SYS_WIN, and SYS_MISC, where miscellaneous means to enforce
> rules for all commonly used systems. One example of a distinction is
> that on a POSIX system, backslash characters are not path separators,
> but on Windows, both forward and backward slashes are path separators.
>
> {{{
> from os import path
> from os.path import sanitizepart
>
> print(repr(sanitizepart('/ABC\\QRS%', system=path.SYS_WIN)))
> # => '%2fABC%5cQRS%%'
>
> print(repr(sanitizepart('/ABC\\QRS%', True, mode=path.OMIT,
>                         system=path.SYS_POSIX)))
> # => 'ABC\\QRS%'
>
> print(repr(sanitizepart('../AB&CD*\x01\n', system=path.SYS_POSIX)))
> # => '%2e.%2fAB%26CD%2a%01%0a'
>
> print(repr(sanitizepart('../AB&CD*\x01\n', True, system=path.SYS_POSIX)))
> # => '..%2fAB&CD*\x01\n'
> }}}

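To make the proposed semantics concrete for discussion, here is a minimal sketch of how I read them. Nothing in it exists in os.path today: the Mode enum, the SYS_* flag values, the _UNSAFE character set, and the choice to double the escape character in ESCAPE mode are all my assumptions, not part of the proposal or of any library.

```python
import enum
import os


class Mode(enum.Enum):
    """Hypothetical handling modes named after the proposal."""
    ESCAPE = "escape"
    OMIT = "omit"
    RAISE = "raise"


ESCAPE, OMIT, RAISE = Mode.ESCAPE, Mode.OMIT, Mode.RAISE

# Hypothetical bit flags; SYS_MISC is assumed to mean "all of the above".
SYS_POSIX, SYS_WIN = 1, 2
SYS_MISC = SYS_POSIX | SYS_WIN

# Assumed "generally unsafe" set: wildcards, piping, redirection, quoting.
_UNSAFE = set('*?[]|<>&;"\'`$!')


def sanitizepart(name, permissive=False, mode=ESCAPE, system=None):
    """Sanitize one path component (sketch only, not a real os.path API)."""
    if system is None:
        # None means "rules for the current platform".
        system = SYS_WIN if os.name == "nt" else SYS_POSIX

    separators = {"/"}
    if system & SYS_WIN:
        separators.add("\\")

    escape_char = "%"
    if isinstance(mode, str):
        # A plain string doubles as the escape character.
        escape_char, mode = mode, ESCAPE

    out = []
    for index, ch in enumerate(name):
        bad = ch in separators
        if not permissive:
            bad = (bad
                   or ch in _UNSAFE
                   or not ch.isprintable()
                   or (index == 0 and ch == "."))
        if bad:
            if mode is RAISE:
                raise ValueError(f"unacceptable character {ch!r} at index {index}")
            if mode is ESCAPE:
                # Escape each UTF-8 byte as <escape_char> + two hex digits.
                data = ch.encode("utf-8", "surrogatepass")
                out.append("".join(f"{escape_char}{b:02x}" for b in data))
            # OMIT: append nothing.
        elif mode is ESCAPE and ch == escape_char:
            # Assumption: a literal escape character is doubled.
            out.append(escape_char * 2)
        else:
            out.append(ch)
    return "".join(out)


print(sanitizepart('/ABC\\QRS%', system=SYS_WIN))       # %2fABC%5cQRS%%
print(repr(sanitizepart('../AB&CD*\x01\n', system=SYS_POSIX)))
# '%2e.%2fAB%26CD%2a%01%0a'
```

Even a sketch this small has to settle questions the proposal leaves open, such as which characters count as "generally unsafe" and whether the escape character itself is escaped, which is part of why I would look at existing packages first.
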
Existing work: https://pypi.org/project/pathvalidate/#sanitize-a-filename

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/FI2V2EZGLSYB3AAV5V5RNEOFJQWQE45S/