[PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Kyle J. McKay
On Sep 12, 2013, at 02:57, Thomas Rast wrote:

 The calls to strbuf_add* within append_normalized_escapes() can
 reallocate the buffer passed to it.  Therefore, the seg_start pointer
 into the string cannot be kept across such calls.

Thanks for finding this.
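
For anyone following along, here is a minimal sketch of the hazard
described above.  It assumes a toy growable buffer: struct buf,
buf_addstr() and append_segment() are hypothetical stand-ins made up for
illustration, not git's strbuf API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf: a NUL-terminated growable buffer. */
struct buf {
	char *data;
	size_t len, alloc;
};

/* Append s; realloc() may move data, invalidating pointers taken earlier. */
static void buf_addstr(struct buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n + 1 > b->alloc) {
		size_t alloc = (b->len + n + 1) * 2;
		char *p = realloc(b->data, alloc);

		if (!p)
			abort();
		b->data = p;
		b->alloc = alloc;
	}
	memcpy(b->data + b->len, s, n + 1); /* copies the trailing NUL too */
	b->len += n;
}

/* Append seg and return a pointer to where it starts inside b. */
static const char *append_segment(struct buf *b, const char *seg)
{
	/* Record an offset, which stays valid even if the buffer moves. */
	size_t seg_start_off = b->len;

	buf_addstr(b, seg);
	assert(seg_start_off <= b->len);

	/* Recompute the pointer only after the append has finished. */
	return b->data + seg_start_off;
}
```

Computing b->data + b->len before the buf_addstr() call instead would
leave a pointer into memory that realloc() may already have freed.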

 It went undetected for a while because it does not fail the test: the
 calls to test-urlmatch-normalization happen inside a $() substitution.
 
 I checked the other call sites to append_normalized_escapes() for the
 same type of problem, and they seem to be okay.

 diff --git a/urlmatch.c b/urlmatch.c
 index 1db76c8..59abc80 100644
 --- a/urlmatch.c
 +++ b/urlmatch.c
 @@ -281,7 +281,8 @@ char *url_normalize(const char *url, struct url_info *out_info)
   url_len--;
   }
   for (;;) {
 - const char *seg_start = norm.buf + norm.len;
 + const char *seg_start;
 + size_t prev_len = norm.len;

How about a more descriptive name for what prev_len is?  It's actually the
segment start offset.

   const char *next_slash = url + strcspn(url, "/?#");
   int skip_add_slash = 0;
   /*
 @@ -297,6 +298,7 @@ char *url_normalize(const char *url, struct url_info *out_info)
   strbuf_release(&norm);
   return NULL;
   }
 + seg_start = norm.buf + prev_len;

A comment would be nice here to remind folks who might be tempted to
revert this to the previous version why it's being done this way.

I'm sure at some point someone will propose a simplification patch
otherwise.

Also some nits.  The patch description should be imperative mood
(cf. Documentation/SubmittingPatches).  And instead of mentioning the seg_start
pointer in the description (which will be meaningless to just about everyone and
it's clear from the diff), mention the bad thing the code was doing in more
general terms that will be clear to anyone familiar with a strbuf.

So how about this patch instead...

-- >8 --
From: Thomas Rast tr...@inf.ethz.ch
Subject: urlmatch.c: recompute pointer after append_normalized_escapes

When append_normalized_escapes is called, its internal strbuf_add* calls can
cause the strbuf's buf to be reallocated changing the value of the buf pointer.

Do not use the strbuf buf pointer from before any append_normalized_escapes
calls afterwards.  Instead recompute the needed pointer.

Signed-off-by: Thomas Rast tr...@inf.ethz.ch
Signed-off-by: Kyle J. McKay mack...@gmail.com
---
 urlmatch.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/urlmatch.c b/urlmatch.c
index 1db76c89..01c67467 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info *out_info)
 		url_len--;
 	}
 	for (;;) {
-		const char *seg_start = norm.buf + norm.len;
+		const char *seg_start;
+		size_t seg_start_off = norm.len;
 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
 		/*
 		 * RFC 3689 indicates that any . or .. segments should be
@@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
+		/* append_normalized_escapes can cause norm.buf to change */
+		seg_start = norm.buf + seg_start_off;
 		if (!strcmp(seg_start, ".")) {
 			/* ignore a . segment; be careful not to remove initial '/' */
 			if (seg_start == path_start + 1) {
-- 
1.8.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Thomas Rast
Kyle J. McKay mack...@gmail.com writes:

 Also some nits.  The patch description should be imperative mood
 (cf. Documentation/SubmittingPatches).

Heh.  Serves me right to go away for a while and get SubmittingPatches
cited at me on return ;-)

Thanks for the updated patch.  I agree with the changes.  I particularly
like the better variable name.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Junio C Hamano
Thanks, both.


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Junio C Hamano
Kyle J. McKay mack...@gmail.com writes:

 So how about this patch instead...

 -- >8 --
 From: Thomas Rast tr...@inf.ethz.ch
 Subject: urlmatch.c: recompute pointer after append_normalized_escapes

 When append_normalized_escapes is called, its internal strbuf_add* calls can
 cause the strbuf's buf to be reallocated changing the value of the buf pointer.

 Do not use the strbuf buf pointer from before any append_normalized_escapes
 calls afterwards.  Instead recompute the needed pointer.

 Signed-off-by: Thomas Rast tr...@inf.ethz.ch
 Signed-off-by: Kyle J. McKay mack...@gmail.com
 ---
  urlmatch.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

 diff --git a/urlmatch.c b/urlmatch.c
 index 1db76c89..01c67467 100644
 --- a/urlmatch.c
 +++ b/urlmatch.c
 @@ -281,8 +281,9 @@ char *url_normalize(const char *url, struct url_info *out_info)
   url_len--;
   }
   for (;;) {
 - const char *seg_start = norm.buf + norm.len;
 + const char *seg_start;
 + size_t seg_start_off = norm.len;
   const char *next_slash = url + strcspn(url, "/?#");
   int skip_add_slash = 0;
   /*
    * RFC 3689 indicates that any . or .. segments should be
 @@ -297,6 +298,8 @@ char *url_normalize(const char *url, struct url_info *out_info)
   strbuf_release(&norm);
   return NULL;
   }
 + /* append_normalized_escapes can cause norm.buf to change */
 + seg_start = norm.buf + seg_start_off;

The change looks good, but I find that this comment is not placed in
the right place.  It is good if the reader knows about an old bug to
put it here, but if the first thing a reader reads is this updated
version, the comment is better placed close to the place where the
start_ofs variable captures the original value (i.e. because the
next call may relocate the buffer, we cannot grab seg_start upfront;
instead we need to record the start_ofs here, and that is what this
variable is about).

It is too minor a point for a reroll, so I'll try to tweak it
locally.  Something like this (but now I think about it, the comment
may not even be necessary).

 urlmatch.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index 01c6746..d1600e2 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -282,9 +282,17 @@ char *url_normalize(const char *url, struct url_info *out_info)
 	}
 	for (;;) {
 		const char *seg_start;
-		size_t seg_start_off = norm.len;
+		size_t seg_start_off;
 		const char *next_slash = url + strcspn(url, "/?#");
 		int skip_add_slash = 0;
+
+		/*
+		 * record the starting offset; appending escapes may
+		 * relocate the buffer, so we cannot capture seg_start
+		 * upfront and use it later.
+		 */
+		seg_start_off = norm.len;
+
 		/*
 		 * RFC 3689 indicates that any . or .. segments should be
 		 * unescaped before being checked for.
@@ -298,7 +306,7 @@ char *url_normalize(const char *url, struct url_info *out_info)
 			strbuf_release(&norm);
 			return NULL;
 		}
-		/* append_normalized_escapes can cause norm.buf to change */
+
 		seg_start = norm.buf + seg_start_off;
 		if (!strcmp(seg_start, ".")) {
 			/* ignore a . segment; be careful not to remove initial '/' */
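
The reason seg_start is only needed after the append is that the
segment has to be compared in its unescaped form, so the check must look
at the copy that ended up in the output buffer.  A rough sketch of that
ordering, assuming a toy buffer (struct out) and a hypothetical helper
that decodes only %2E; it is not the real append_normalized_escapes,
which handles escapes in general:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf. */
struct out {
	char *data;
	size_t len, alloc;
};

/* Make room for extra more bytes plus a NUL; may move o->data. */
static void out_reserve(struct out *o, size_t extra)
{
	size_t need = o->len + extra + 1;

	if (need > o->alloc) {
		char *p = realloc(o->data, need * 2);

		if (!p)
			abort();
		o->data = p;
		o->alloc = need * 2;
	}
	o->data[o->len] = '\0';
}

static void out_addch(struct out *o, char c)
{
	out_reserve(o, 1);
	o->data[o->len++] = c;
	o->data[o->len] = '\0';
}

/* Simplified assumption: decode only %2E / %2e, copy everything else. */
static void append_segment_unescaped(struct out *o, const char *seg, size_t n)
{
	size_t i = 0;

	out_reserve(o, 0); /* guarantees o->data is non-NULL and terminated */
	while (i < n) {
		if (i + 2 < n && seg[i] == '%' && seg[i + 1] == '2' &&
		    (seg[i + 2] == 'E' || seg[i + 2] == 'e')) {
			out_addch(o, '.');
			i += 3;
		} else {
			out_addch(o, seg[i++]);
		}
	}
}

/* Append seg to o and report whether it is "." once unescaped. */
static int appended_dot_segment(struct out *o, const char *seg)
{
	size_t seg_start_off = o->len; /* offset taken before the append */

	assert(seg != NULL);
	append_segment_unescaped(o, seg, strlen(seg));
	/* pointer derived only now, from the possibly moved buffer */
	return !strcmp(o->data + seg_start_off, ".");
}
```

With this sketch a segment written as %2e is treated the same as a
literal ".", because the comparison runs on the unescaped copy.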


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Kyle J. McKay

On Sep 12, 2013, at 11:30, Junio C Hamano wrote:


+   /* append_normalized_escapes can cause norm.buf to change */
+   seg_start = norm.buf + seg_start_off;


The change looks good, but I find that this comment is not placed in
the right place.  It is good if the reader knows about an old bug to
put it here, but if the first thing a reader reads is this updated
version, the comment is better placed close to the place where the
start_ofs variable captures the original value (i.e. because the
next call may relocate the buffer, we cannot grab seg_start upfront;
instead we need to record the start_ofs here, and that is what this
variable is about).

It is too minor a point for a reroll, so I'll try to tweak it
locally.  Something like this (but now I think about it, the comment
may not even be necessary).


The longer comment looks good to me.  If you think the code will be safe from
simplification patches without a comment, that works for me too.  I've just seen
so many simplification patches go by on the list I'm concerned it will be a
target otherwise leading to re-introduction of the problem.






Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Jonathan Nieder
Kyle J. McKay wrote:

 The longer comment looks good to me.  If you think the code will be safe from
 simplification patches without a comment, that works for me too.

I think if we can't trust reviewers to catch this kind of thing, we're
in trouble (i.e., moving too fast). :)

So FWIW my instinct is to leave the comment out, since I actually find
it more readable that way (otherwise I would wonder, "Why am I being
told that a strbuf's buffer has a nonconstant address?  Do some other
strbufs have a constant address or something?")

Thanks,
Jonathan


Re: [PATCH v2] urlmatch.c: recompute ptr after append_normalized_escapes

2013-09-12 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 Kyle J. McKay wrote:

 The longer comment looks good to me.  If you think the code will be safe from
 simplification patches without a comment, that works for me too.

 I think if we can't trust reviewers to catch this kind of thing, we're
 in trouble (i.e., moving too fast). :)

 So FWIW my instinct is to leave the comment out, since I actually find
 it more readable that way (otherwise I would wonder, "Why am I being
 told that a strbuf's buffer has a nonconstant address?  Do some other
 strbufs have a constant address or something?")

Yeah, I was staring at that message and coming to the same
conclusion.