Steve Jorgensen wrote:
> I believe the Python standard library should include a means of
> sanitizing a filesystem entry, and this should not be something
> requiring a 3rd party package.
> One of the reasons I think this should be in the standard library is
> that it would give code reviewers and static analysis services such as
> Veracode a common, simple way to recognize that a value has been
> sanitized in an accepted manner.
> What I am envisioning is a function (presumably in os.path) with a
> signature roughly like
> {{{
> sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
> }}}
> When permissive is False, characters that are generally unsafe are
> rejected. When permissive is True, only path separator characters are
> rejected. Generally unsafe characters besides path separators would
> include things like a leading ".", any non-printing character, any
> wildcard, piping and redirection characters, etc.
> The mode argument indicates what to do with unacceptable characters:
> escape them (ESCAPE), omit them (OMIT), or raise an exception (RAISE).
> This could also double as an escape character argument when a string
> is given. The default escape character should probably be "%" (same as
> URL encoding).
> The system argument accepts a combination of bit flags indicating
> which operating systems' rules to apply, or None, meaning to use the
> rules for the current platform. Systems would probably include
> SYS_POSIX, SYS_WIN, and SYS_MISC, where miscellaneous means to enforce
> the rules for all commonly used systems. One example of a distinction
> is that on a POSIX system, backslash characters are not path
> separators, but on Windows, both forward and backward slashes are path
> separators.
> {{{
> import os
> from os import path
>
> # sanitizepart and the SYS_*/mode constants are the proposed API;
> # they do not exist in os.path today.
> print(repr(os.path.sanitizepart('/ABC\\QRS%', system=path.SYS_WIN)))
> # => '%2fABC%5cQRS%%'
>
> print(repr(os.path.sanitizepart('/ABC\\QRS%', True, mode=path.OMIT,
>                                 system=path.SYS_POSIX)))
> # => 'ABC\\QRS%'
>
> print(repr(os.path.sanitizepart('../AB&CD*\x01\n',
>                                 system=path.SYS_POSIX)))
> # => '%2e.%2fAB%26CD%2a%01%0a'
>
> print(repr(os.path.sanitizepart('../AB&CD*\x01\n', True,
>                                 system=path.SYS_POSIX)))
> # => '..%2fAB&CD*\x01\n'
> }}}
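
For discussion, here is a minimal sketch of how I read the proposal. None of
these names exist in os.path; Mode, SYS_POSIX, SYS_WIN, SYS_MISC and the
_UNSAFE character set are my own assumptions. I have also used a separate
escape parameter instead of letting mode double as the escape character, and
doubled a literal escape character in ESCAPE mode, which is only one possible
convention.

```python
import enum
import ntpath
import os
import posixpath


class Mode(enum.Enum):
    """Hypothetical handling modes named after the proposal."""
    ESCAPE = "escape"
    OMIT = "omit"
    RAISE = "raise"


# Hypothetical bit flags; the values are arbitrary.
SYS_POSIX = 1
SYS_WIN = 2
SYS_MISC = SYS_POSIX | SYS_WIN

# Assumed "generally unsafe" punctuation: wildcards, piping, redirection, etc.
_UNSAFE = set('*?[]|<>&;$`"\'!')


def _separators(system):
    """Return the path separators for the selected system flags."""
    seps = set()
    if system & SYS_POSIX:
        seps.add(posixpath.sep)
    if system & SYS_WIN:
        seps.update((ntpath.sep, ntpath.altsep))
    return seps


def sanitizepart(name, permissive=False, mode=Mode.ESCAPE, system=None,
                 escape="%"):
    """Sanitize one path component according to the sketched rules."""
    if system is None:
        # None means "rules for the current platform".
        system = SYS_WIN if os.name == "nt" else SYS_POSIX
    seps = _separators(system)
    out = []
    for index, ch in enumerate(name):
        if mode is Mode.ESCAPE and ch == escape:
            # Assumed convention: a literal escape character is doubled.
            out.append(escape * 2)
            continue
        bad = ch in seps
        if not permissive:
            bad = (bad or ch in _UNSAFE or not ch.isprintable()
                   or (index == 0 and ch == "."))
        if not bad:
            out.append(ch)
        elif mode is Mode.RAISE:
            raise ValueError(f"unsafe character {ch!r} at index {index}")
        elif mode is Mode.ESCAPE:
            # Hex-escape each UTF-8 byte of the rejected character.
            out.append("".join(f"{escape}{b:02x}" for b in ch.encode("utf-8")))
        # Mode.OMIT: drop the character entirely.
    return "".join(out)


print(repr(sanitizepart('/ABC\\QRS%', system=SYS_WIN)))
# => '%2fABC%5cQRS%%'
print(repr(sanitizepart('../AB&CD*\x01\n', system=SYS_POSIX)))
# => '%2e.%2fAB%26CD%2a%01%0a'
```

Even this small version has to make several choices the proposal leaves open
(which characters count as unsafe, how the escape character itself is
handled), and it ignores things like reserved Windows device names.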

Existing work:
https://pypi.org/project/pathvalidate/#sanitize-a-filename
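
Within the standard library, the closest existing thing to the ESCAPE mode is
probably urllib.parse.quote with safe="". It percent-encodes both kinds of
slash and the percent sign, and unquote reverses it, but it is not a filename
sanitizer: as the last line shows, it leaves "." alone, so ".." passes through
unchanged.

```python
from urllib.parse import quote, unquote

name = "/ABC\\QRS%"

# safe="" means "/" is escaped too (by default quote() leaves it alone).
escaped = quote(name, safe="")
print(escaped)                    # %2FABC%5CQRS%25

# The encoding is reversible.
print(unquote(escaped) == name)   # True

# "." is never escaped, so parent-directory references survive.
print(quote("..", safe=""))       # ..
```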
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FI2V2EZGLSYB3AAV5V5RNEOFJQWQE45S/
Code of Conduct: http://python.org/psf/codeofconduct/
