On Thu, 2010-01-07, Kannan wrote:
> Tracing these back further, the value gets set in `end_element()'. This
> fixes one more instance of non-canonical URL in `get_version_url()',
> thus fixing all instances of non-canonical URLs AFAICS. Thank you for
> the feedback, Julian. Attaching the updated patch herewith.

Thanks for the patch, Kannan, and sorry for forgetting to deal with it.
Comments below.

> [[[
> Log:
> Ensure the URLs are always canonical.

A minor comment: Let's write that as,

  Ensure URLs in libsvn_ra_neon are always canonical.

to make clear what "the URLs" refers to.

> [ in subversion/libsvn_ra_neon ]
> 
> * util.c
>   (svn_ra_neon__request_get_location): Canonicalize the 'BASE URL' as
>    per the rule.
> 
> * props.c
>   (end_element): Same.
> 
> * options.c
>   (end_element): Same.
> 
> Found by: stsp
> Suggested by: stsp, julianfoad
> Patch by: Kannan R <kann...@collab.net>
> ]]]


> Index: subversion/libsvn_ra_neon/util.c
> ===================================================================
> --- subversion/libsvn_ra_neon/util.c    (revision 896759)
> +++ subversion/libsvn_ra_neon/util.c    (working copy)
> @@ -1490,5 +1490,5 @@
>                                    apr_pool_t *pool)
>  {
>    const char *val = ne_get_response_header(request->ne_req, "Location");
> -  return val ? apr_pstrdup(pool, val) : NULL;
> +  return val ? svn_uri_canonicalize(val, pool) : NULL;
>  }

That looks good. I would update its doc string to mention that it
canonicalizes the URL, because presently it just says it returns the
header:

Index: subversion/libsvn_ra_neon/ra_neon.h
===================================================================
--- subversion/libsvn_ra_neon/ra_neon.h (revision 899581)
+++ subversion/libsvn_ra_neon/ra_neon.h (working copy)
@@ -973,6 +973,7 @@
                   apr_pool_t *pool);
 
 /* Return the Location HTTP header or NULL if none was sent.
+ * (Return a canonical URL even if the header ended with a slash.)
  *
  * Do allocations in POOL.
  */

> Index: subversion/libsvn_ra_neon/options.c
> ===================================================================
> --- subversion/libsvn_ra_neon/options.c (revision 896759)
> +++ subversion/libsvn_ra_neon/options.c (working copy)
> @@ -107,7 +107,9 @@
>    options_ctx_t *oc = baton;
>  
>    if (state == ELEM_href)
> -    oc->activity_coll = svn_string_create_from_buf(oc->cdata, oc->pool);
> +    oc->activity_coll = 
> svn_string_create(svn_uri_canonicalize(oc->cdata->data,
> +                                                               oc->pool),
> +                                          oc->pool);
>  
>    return SVN_NO_ERROR;
>  }

That looks good, because the "activity_coll" field is always meant to
hold a URL.

> Index: subversion/libsvn_ra_neon/props.c
> ===================================================================
> --- subversion/libsvn_ra_neon/props.c   (revision 896759)
> +++ subversion/libsvn_ra_neon/props.c   (working copy)
> @@ -359,7 +359,7 @@
>    const elem_defn *parent_defn;
>    const elem_defn *defn;
>    ne_status status;
> -  const char *cdata = pc->cdata->data;
> +  const char *cdata = svn_uri_canonicalize(pc->cdata->data, pc->pool);
>  
>    switch (state)
>      {

This does not look good. The name of the variable "cdata" makes me think
it is meant to hold any kind of value, not always a URL. ("CDATA" is the
XML term for "any kind of printable data".) Let's look at where the
variable is used:

> case ELEM_status:
>       /* Parse the <status> tag's CDATA for a status code. */
>       if (ne_parse_statusline(cdata, &status))
>         return svn_error_create(SVN_ERR_XML_MALFORMED, NULL, NULL);

Here is the first place it is used. It holds some kind of status
indication, in XML syntax. We don't want to have called
svn_uri_canonicalize() on this value.

In the second place it is used, we know it holds a URL:

> case ELEM_href:
>       /* Special handling for <href> that belongs to the <response> tag. */
>       if (rsrc->href_parent == ELEM_response)
>         return assign_rsrc_url(pc->rsrc, cdata, pc->pool);

That's where we can canonicalize it. And are there any other places
where the "cdata" variable is known to hold a URL? I'm looking... No,
there are no others. In all the rest of the places, it is providing a
generic "property value".

With that fix, I believe the patch will be correct.

- Julian


Reply via email to