Steve Jorgensen wrote:
> Steve Jorgensen wrote:
> > I believe the Python standard library should include a means of
> > sanitizing a filesystem entry, and this should not be something
> > requiring a 3rd party package.
> > One of the reasons I think this should be in the standard lib is that
> > it provides a common, simple means for code reviewers and static
> > analysis services such as Veracode to recognize that a value is
> > sanitized in an accepted manner.
> > What I am envisioning is a function (presumably in os.path) with a
> > signature roughly like
> > {{{
> > sanitizepart(name, permissive=False, mode=ESCAPE, system=None)
> > }}}
> > When permissive is False, characters that are generally unsafe are
> > rejected. When permissive is True, only path separator characters are
> > rejected. Generally unsafe characters besides path separators would
> > include things like a leading ".", any non-printing character, any
> > wildcard, piping and redirection characters, etc.
> > The mode argument indicates what to do with unacceptable characters:
> > escape them (ESCAPE), omit them (OMIT), or raise an exception (RAISE).
> > This could also double as an escape character argument when a string
> > is given. The default escape character should probably be "%" (same as
> > URL encoding).
> > The system argument accepts a combination of bit flags indicating
> > which operating systems' rules to apply, or None meaning to use the
> > rules for the current platform. Systems would probably include
> > SYS_POSIX, SYS_WIN, and SYS_MISC, where miscellaneous means to enforce
> > rules for all commonly used systems. One example of a distinction is
> > that on a POSIX system, backslash characters are not path separators,
> > but on Windows, both forward and backward slashes are path separators.
> > {{{
> > from os import path
> >
> > # Proposed API; sanitizepart() is not in os.path today.
> > path.sanitizepart('/ABC\\QRS%', system=path.SYS_WIN)
> > # => '%2fABC%5cQRS%%'
> > path.sanitizepart('/ABC\\QRS%', True, mode=path.OMIT,
> >                   system=path.SYS_POSIX)
> > # => 'ABC\\QRS%'
> > path.sanitizepart('../AB&CD*\x01\n', system=path.SYS_POSIX)
> > # => '%2e.%2fAB%26CD%2a%01%0a'
> > path.sanitizepart('../AB&CD*\x01\n', True, system=path.SYS_POSIX)
> > # => '..%2fAB&CD*\x01\n'
> > }}}
> > Existing work:
> https://pypi.org/project/pathvalidate/#sanitize-a-filename

More existing work:
* https://pypi.org/project/sanitize-filename/
* http://detox.sourceforge.net/
* https://sourceforge.net/p/glindra/news/2005/08/glindra-rename--lower--portable/
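
To make the proposed semantics concrete, here is a rough sketch of how the function described in the quoted message might behave. Everything in it is an assumption for discussion purposes: the constant values, the exact set of "generally unsafe" characters, and the choice to escape a literal "%" as "%25" (the quoted example shows "%%" instead) are placeholders, not a settled design. For brevity it leaves out the idea of passing a string as mode to pick the escape character.

```python
import os

# Hypothetical constants modelled on the proposal; none of these exist in os.path.
ESCAPE, OMIT, RAISE = "escape", "omit", "raise"
SYS_POSIX, SYS_WIN = 1, 2
SYS_MISC = SYS_POSIX | SYS_WIN

# Assumed set of "generally unsafe" characters (wildcards, piping, redirection, ...).
_UNSAFE = set('*?[]|<>&;"\'`$!:')


def sanitizepart(name, permissive=False, mode=ESCAPE, system=None):
    """Sanitize a single path component according to the proposed rules."""
    if system is None:
        # Fall back to the rules of the current platform.
        system = SYS_WIN if os.name == "nt" else SYS_POSIX

    separators = {"/"}
    if system & SYS_WIN:
        separators.add("\\")  # backslash is a separator only under Windows rules

    def is_rejected(index, ch):
        if ch in separators:
            return True
        if mode == ESCAPE and ch == "%":
            return True  # assumption: escape "%" itself so output stays unambiguous
        if permissive:
            return False
        if index == 0 and ch == ".":
            return True  # leading dot
        return ch in _UNSAFE or not ch.isprintable()

    out = []
    for index, ch in enumerate(name):
        if not is_rejected(index, ch):
            out.append(ch)
        elif mode == RAISE:
            raise ValueError(f"unsafe character {ch!r} at index {index}")
        elif mode == ESCAPE:
            # URL-style escape of each UTF-8 byte of the character.
            out.append("".join(f"%{b:02x}" for b in ch.encode("utf-8")))
        # mode == OMIT: drop the character silently
    return "".join(out)
```

With these assumptions, `sanitizepart('../AB&CD*\x01\n', system=SYS_POSIX)` gives `'%2e.%2fAB%26CD%2a%01%0a'`, and the permissive variant of the same call leaves everything but the slash alone. Reserved Windows names such as CON or NUL, trailing dots and spaces, and length limits are not handled here and would need to be part of any real design.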
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ITEHIWIFNGM5WOMOC5UAHKQVMLVIBR6Z/