Steve Jorgensen wrote:
> Andrew Barnert wrote:
> > On May 9, 2020, at 17:35, Steve Jorgensen ste...@stevej.name wrote:
> > > I believe the Python standard library should include a means of
> > > sanitizing a filesystem entry, and this should not be something
> > > requiring a 3rd party package.
> > > One of the reasons I think this should be in the standard lib is that
> > > it provides a common, simple means for code reviewers and static
> > > analysis services such as Veracode to recognize that a value is
> > > sanitized in an accepted manner.
> > This does seem like a good idea. People who do this themselves get it
> > wrong all the time, occasionally with disastrous consequences, so if
> > Python can solve that, that would be great.
> > But, at least historically, this has been more complicated than what
> > you’re suggesting here. For example, don’t you have to catch things like
> > directories named “Con” or files whose 8.3 representation has “CON” as
> > the 8 part? I don’t think you can hang an entire Windows system by
> > abusing those anymore, but you can still produce filenames that some
> > APIs, and some tools (possibly including Explorer, cmd, powershell,
> > Cygwin, mingw/native shells, Python itself…) can’t access (or can only
> > access if the user manually specified a .\ absolute path, or whatever).
> Yes. I am aware of some of the unsafe names in DOS and older Windows. As I
> mentioned in my other reply, there is a distinction between the ones that
> are merely invalid and those that are actually unsafe. In researching
> existing Linux tools just now, I was reminded that a leading dash is
> frequently unsafe because many tools will treat an argument starting with
> a dash as an option argument.
> > Is there an established algorithm/rule that lots of people in the
> > industry trust that Python can just reference, instead of having to
> > research or invent it? Because otherwise, we run the risk of making
> > things worse instead of better.
> An excellent point! I just started digging into that and found references
> to detox and Glindra. Neither of those seems to be well maintained though.
> The documentation pages for Glindra no longer exist and detox is not in
> standard package repositories for CentOS later than 6 (and only in EPEL
> for that). Still digging.

Extremely apropos to the question of what characters might be problematic and/or 
unsafe: https://dwheeler.com/essays/fixing-unix-linux-filenames.html
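
To make the invalid-versus-unsafe distinction from this thread concrete, here is a
rough sketch of the kind of checks discussed so far: a leading dash, control
characters, path separators, and the reserved DOS/Windows device names. The
function name filename_problems and the exact rule set are my own assumptions for
illustration, not an established algorithm, and it does not attempt the 8.3
short-name case mentioned above.

```python
import re

# Assumption: the classic DOS/Windows device names, reserved with or
# without an extension. This list is illustrative, not authoritative.
_WINDOWS_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)

# ASCII control characters, including newline, tab, and DEL.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def filename_problems(name):
    """Return reasons why `name` may be invalid or unsafe (hypothetical helper).

    `name` is a single filesystem entry, not a full path.
    """
    problems = []
    if name in ("", ".", ".."):
        problems.append("empty or special directory entry")
    if "/" in name or "\\" in name:
        problems.append("contains a path separator")
    if _CONTROL_CHARS.search(name):
        problems.append("contains a control character")
    if name.startswith("-"):
        problems.append("leading dash may be parsed as an option")
    # Compare the part before the first dot, case-insensitively,
    # so that "Con" and "con.txt" are both caught.
    stem = name.split(".", 1)[0].rstrip(" ").upper()
    if stem in _WINDOWS_RESERVED:
        problems.append("reserved Windows device name")
    return problems
```

Returning a list of reasons rather than a boolean leaves it to the caller to
decide which problems are merely invalid on some platform and which are unsafe
enough to reject outright.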
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EDJQA7SDUWEHJ53GYXIGX2HPTU3JEM6X/
Code of Conduct: http://python.org/psf/codeofconduct/
