Re: [PATCH] Canonicalize dirent/URL before passing to client API

Julian Foad Thu, 11 Nov 2010 08:44:16 -0800

On Wed, 2010-11-10, Noorul Islam K M wrote:
> [[[
> 
> Canonicalize dirent/URL before passing to client API
> 
> * subversion/svn/merge-cmd.c (svn_cl__merge),
>   subversion/svn/propget-cmd.c (svn_cl__propget),
>   subversion/svn/proplist-cmd.c (svn_cl__proplist),
>   subversion/svn/copy-cmd.c (svn_cl__copy),
>   subversion/svn/mergeinfo-cmd.c (svn_cl__mergeinfo),
>   subversion/svn/blame-cmd.c (svn_cl__blame),
>   subversion/svn/log-cmd.c (svn_cl__log),
>   subversion/svn/export-cmd.c (svn_cl__export),
>   subversion/svn/info-cmd.c (svn_cl__info):
>     Use svn_cl__opt_parse_path() instead of svn_opt_parse_path()
>     followed by canonicalizing.


I thought that part looked good... but then discovered something.  See
below.

> * subversion/svn/propdel-cmd.c (svn_cl__propdel),
>   subversion/svn/propget-cmd.c (svn_cl__propget),
>   subversion/svn/propset-cmd.c (svn_cl__propset),
>   subversion/svn/proplist-cmd.c (svn_cl__proplist): 
>     Canonicalize URL

Let's do that part inside the function svn_cl__revprop_prepare()
instead, like I mentioned in another email, and not include it in this
patch.  (You missed the call in propedit-cmd, and it won't be possible
to miss any calls if we do it that way.)

> * subversion/tests/cmdline/mergeinfo_tests.py
>   (mergeinfo_url_special_characters),
>   subversion/tests/cmdline/prop_tests.py
>   (props_url_special_characters),
>   subversion/tests/cmdline/merge_tests.py
>   (merge_url_special_characters),
>   subversion/tests/cmdline/log_tests.py
>   (log_url_special_characters),
>   subversion/tests/cmdline/copy_tests.py
>   (copy_url_special_characters),
>   subversion/tests/cmdline/blame_tests.py
>   (blame_url_special_characters): 
>     New tests

The purpose of the tests is to check that the commands properly accept
and canonicalize non-canonical target paths.  That's not clear from just
reading them, so let's add a comment in each one just to say that, so
that future maintainers will know.  It might be a good idea to rename
the test functions and description away from "special characters" to
something like "non-canonical targets", since a path or URL can be
non-canonical even if it doesn't contain "special" characters.

Some of these subcommands accept either a local path or URL, but the
tests only check URLs - was that intentional?


> +def mergeinfo_url_special_characters(sbox):
> +  """special characters in svn mergeinfo URL"""
> +
> +  sbox.build()
> +  wc_dir = sbox.wc_dir
> +  special_url = sbox.repo_url + '/%2E'

Why construct a URL that is non-canonical in two ways at the same time?
The fact that a character that doesn't need to be percent-encoded ('.'
in this case) is percent-encoded makes it non-canonical, but ending a
URL with '/.' also makes it non-canonical.  You should be able to test
for each of those separately.

Aha.  I think I have discovered the problem.  The arguments have already
been through svn_uri_canonicalize() once, inside
svn_cl__args_to_target_array_print_reserved().  The trouble is that it
didn't work properly it converted

  ".../%2E" to ".../."

but it should have converted

  ".../%2E" to "..."

I think.  Can a URI expert confirm this?

So I think running the output through svn_uri_canonicalize again is not
the right solution, and we need to fix svn_uri_canonicalize instead.

If that analysis is right, then we will indeed need to test a URL like
repo_url + '/%2e', but only a single test.

- Julian

Re: [PATCH] Canonicalize dirent/URL before passing to client API

Reply via email to