On Mon, Jun 28, 2021 at 10:03:15AM -0400, Wes Turner wrote:
> Here's this, which IIRC I never wrote tests for, which is what needs to be
> done to specify the correct behavior:
> 
> ```python
> def pathjoin(*args, **kwargs):
>     """
>     Arguments:
>         args (list): *args list of paths
>             if len(args) == 1, args[0] is not a string, and args[0] is
> iterable,
>             set args to args[0].
> 
>     Basically::
> 
>         joined_path = u'/'.join(
>             [args[0].rstrip('/')] +
>             [a.strip('/') for a in args[1:-1]] +
>             [args[-1].lstrip('/')])
>     """
>     log.debug('pathjoin: %r' % list(args))
> 
> 
>     def _pathjoin(*args, **kwargs):
>         len_ = len(args) - 1
>         if len_ < 0:
>             raise Exception('no args specified')
>         elif len_ == 0:
>             if not isinstance(args, basestring):
>                 if hasattr(args, '__iter__'):
>                     _args = args
>                     _args
>                     args = args[0]
>         for i, arg in enumerate(args):
>             if not i:
>                 yield arg.rstrip('/')
>             elif i == len_:
>                 yield arg.lstrip('/')
>             else:
>                 yield arg.strip('/')
>     joined_path = u'/'.join(_pathjoin(*args))
>     return sanitize_path(joined_path)
> 
> 
> def sanitize_path(path):
>     # XXX TODO FIXME
>     if '/../' in path:
>         raise Exception()
>     return path
> ```
> 
>  https://github.com/westurner/pgs/blob/master/pgs/app.py#L60-L95

Yes, something like that should work, except that sanitize_path() misses
leading or trailing '..'.

When we do this is an operator, things become even simpler:

   class PosixPath2(PosixPath):
       def __floordiv__(self, other):
           as_path = PosixPath(other)
           if '..' in as_path.parts:
               raise ValueError("argument has a component with '..'")
           if as_path.is_absolute():
               other = str(other).lstrip('/')
           return self / other       

Zbyszek

> On Mon, Jun 28, 2021, 04:09 Zbigniew Jędrzejewski-Szmek <zbys...@in.waw.pl>
> wrote:
> 
> > On Sun, Jun 27, 2021 at 09:55:34PM -0400, Wes Turner wrote:
> > > "[Python-ideas] Sanitize filename (path part) 2nd try"
> > >
> > https://mail.python.org/archives/list/python-ideas@python.org/thread/LRIKMG3G4I4YQNK6BTU7MICHT7X67MEF/
> > >
> > > "[Python-ideas] Sanitize filename (path part)"
> > >
> > https://mail.python.org/archives/list/python-ideas@python.org/thread/SQH4LPERFLKBLXPDUOVJMV24JBCBUCYO/
> > >
> > > ```quote
> > > What does sanitizepart do with a leading slash?
> > >
> > > assert os.path.join("a", "/b") == "/b"
> > >
> > > A new safejoin() or joinsafe() or join(safe='True') could call
> > > sanitizepart() such that:
> > >
> > > assert joinsafe("a\n", "/b") == "a\n/b"
> > > ```
> >
> > Thanks for the links. "sanitizepart()" seems to be about *constructing*
> > a safe filename. It's a different problem and there's a thousand ways to
> > do it.
> >
> > I think the idea with joinsafe() is similar to my idea... But I think
> > the req to disallow '..' is crucial. If we set the requirements as:
> >
> > 1. the resulting path must not be above the lhs arg
> > 2. the operation must be done without actually accessing the fs
> >
> > right now I see the proposed operation that rejects '..' as the best
> > approach.
> >
> > Zbyszek
> >
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5JJEJ7OMW2I4C5RDVZ3IERXRZZ3NBEFC/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to