Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Jonathan Nieder
Hi,

Michael G Schwern wrote:
 On 2012.7.30 12:51 PM, Eric Wong wrote:
 Michael G Schwern wrote:

 _collapse_dotdot() works better than the existing regex did.

 I don't dispute it's better, but it's worth explaining in the commit
 message to reviewers why something is better.

 Yeah.  I figured the tests covered that.

Now I'm tripping up on the same thing.  Eric, did you ever find out
what the motivation for this patch was?  Is SVN 1.7 more persnickety
about runs of multiple slashes in a row or something, or is it more
of an aesthetic thing?

Puzzled,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Eric Wong
Jonathan Nieder jrnie...@gmail.com wrote:
 Hi,
 
 Michael G Schwern wrote:
  On 2012.7.30 12:51 PM, Eric Wong wrote:
  Michael G Schwern wrote:
 
  _collapse_dotdot() works better than the existing regex did.
 
  I don't dispute it's better, but it's worth explaining in the commit
  message to reviewers why something is better.
 
  Yeah.  I figured the tests covered that.
 
 Now I'm tripping up on the same thing.  Eric, did you ever find out
 what the motivation for this patch was?  Is SVN 1.7 more persnickety
 about runs of multiple slashes in a row or something, or is it more
 of an aesthetic thing?

I'm not sure about this case specifically, but SVN has (and will likely
become) more persnickety over time.  I haven't had a chance to check SVN
itself, but I think being defensive and giving it prettier paths will
be safer in the future.

That said, I'd favor an implementation that split on m{/+} and
collapsed as Michael mentioned.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Jonathan Nieder
Eric Wong wrote:

 That said, I'd favor an implementation that split on m{/+} and
 collapsed as Michael mentioned.

Sounds sensible.  Is canonicalize_path responsible for collapsing
runs of slashes?  What should _collapse_dotdot do to
c:/.. or http://www.example.com/..;?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Eric Wong
Jonathan Nieder jrnie...@gmail.com wrote:
 Eric Wong wrote:
  That said, I'd favor an implementation that split on m{/+} and
  collapsed as Michael mentioned.
 
 Sounds sensible.  Is canonicalize_path responsible for collapsing
 runs of slashes?  What should _collapse_dotdot do to
 c:/.. or http://www.example.com/..;?

It should probably just return the root path (c:/ and
http://www.example.com/; respectively).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Jonathan Nieder
Eric Wong wrote:

 It should probably just return the root path (c:/ and
 http://www.example.com/; respectively).

That means recognizing drive letters and URLs.  Hm.

Subversion commands seem to use svn_client_args_to_target_array2
to canonicalize arguments.  It does something like the following:

 1. split at @PEG revision
 2. check if it looks like a URL (svn_path_is_url).  If so:

i.   urlencode characters with high bit set (svn_path_uri_from_iri)
ii.  urlencode some other special characters (svn_path_uri_autoescape)
 (that is: [ \\^`{|}])
iii. (on Windows) convert backslashes to forward slashes
iv.  complain if there are any '..' (svn_path_is_backpath_present)
v.   make url scheme and hostname lowercase, strip default portnumber,
 strip trailing '/', collapse '/'-es including urlencoded %2F,
 strip '.' and '%2E' components, make drive letter in file:///
 URLs uppercase, urlencode uri-special characters
 (svn_uri_canonicalize)

   Otherwise:

i.   canonicalize case, handle '..' and '.' components, and make
 path separators into '/'
 (apr_filepath_merge(APR_FILEPATH_TRUENAME))
ii.  strip trailing '/', collapse '/'-es except in UNC paths,
 strip '.' components, make drive letter uppercase
 (svn_dirent_canonicalize)
iii. deal with truepath collisions (unwanted case
 canonicalizations)
iv.  reject .svn and _svn directories

Maybe we can use apr_filepath_merge() to avoid reinventing the wheel?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Eric Wong
Jonathan Nieder jrnie...@gmail.com wrote:
 Maybe we can use apr_filepath_merge() to avoid reinventing the wheel?

Ideally, yes.  Is there an easy way to access that from Perl? (and
for the older versions of SVN folks people are running).

Perhaps we can expose equivalent functionality in git via
git-rev-parse instead?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-09-26 Thread Jonathan Nieder
Eric Wong wrote:

 Ideally, yes.  Is there an easy way to access that from Perl? (and
 for the older versions of SVN folks people are running).

Subversion's swig bindings only wrap a few apr functions and do not
depend on fuller apr bindings.

Something like svn_dirent_is_under_root() could be useful.  It uses
whatever base path you choose.  I haven't tried using base_path=
yet.  New in Subversion 1.7, but that would be ok --- an imperfect
fallback for older Subversion versions would be fine.

Unfortunately, from swig/core.i: SWIG can't digest these functions
yet, so ignore them for now. TODO: make them work.

svn_client_args_to_target_array2() is exposed and would probably be
perfect.  I can't seem to find how to create an apr_array_header_t
to pass as its first argument, alas.

 Perhaps we can expose equivalent functionality in git via
 git-rev-parse instead?

It might be possible to add a git-svn--helper command that links to
libsvn or apr, but ick.  The trouble is it's not clear at all how
Subversion's perl API was *designed* to be used.

Hm.
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-07-30 Thread Eric Wong
Michael G. Schwern schw...@pobox.com wrote:
 From: Michael G. Schwern schw...@pobox.com
 
 The SVN API functions will not accept ../foo but their canonicalization
 functions will not collapse it.  So we'll have to do it ourselves.
 
 _collapse_dotdot() works better than the existing regex did.

I don't dispute it's better, but it's worth explaining in the commit
message to reviewers why something is better.

 This will be used shortly when canonicalize_path() starts using the
 SVN API.
 ---

 +# Turn foo/../bar into bar
 +sub _collapse_dotdot {
 + my $path = shift;
 +
 + 1 while $path =~ s{/[^/]+/+\.\.}{};
 + 1 while $path =~ s{[^/]+/+\.\./}{};
 + 1 while $path =~ s{[^/]+/+\.\.}{};

This is a bug that's gone unnoticed[1] for over 5 years now,
but I've just noticed this doesn' handle foo/..bar  or foo/...bar
cases correctly.

  sub canonicalize_path {
   my ($path) = @_;
   my $dot_slash_added = 0;
 @@ -83,7 +95,7 @@ sub canonicalize_path {
   # good reason), so let's do this manually.
   $path =~ s#/+#/#g;
   $path =~ s#/\.(?:/|$)#/#g;
 - $path =~ s#/[^/]+/\.\.##g;
 + $path = _collapse_dotdot($path);

[1] - I doubt anybody uses paths like these, though...
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.

2012-07-30 Thread Michael G Schwern
On 2012.7.30 12:51 PM, Eric Wong wrote:
 The SVN API functions will not accept ../foo but their canonicalization
 functions will not collapse it.  So we'll have to do it ourselves.

 _collapse_dotdot() works better than the existing regex did.
 
 I don't dispute it's better, but it's worth explaining in the commit
 message to reviewers why something is better.

Yeah.  I figured the tests covered that.


 +# Turn foo/../bar into bar
 +sub _collapse_dotdot {
 +my $path = shift;
 +
 +1 while $path =~ s{/[^/]+/+\.\.}{};
 +1 while $path =~ s{[^/]+/+\.\./}{};
 +1 while $path =~ s{[^/]+/+\.\.}{};
 
 This is a bug that's gone unnoticed[1] for over 5 years now,
 but I've just noticed this doesn' handle foo/..bar  or foo/...bar
 cases correctly.

Good catch.  Woo unit tests!  :)  You could add them as TODO tests.

A more accurate way to do it would be to split the path, collapse using the
resulting list, and rejoin it.


 [1] - I doubt anybody uses paths like these, though...

Not for an svnroot or branch name, no.


-- 
Hating the web since 1994.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html