Re: svn commit: r573831

2007-10-26 Thread Nick Kew
On Tue, 23 Oct 2007 00:22:56 +0200
André Malo <[EMAIL PROTECTED]> wrote:

> 
> > > The main difference is this escaping of ' ' to '+'. The reason for
> > > this is that while ' ' gets encoded as %20 in paths, it gets
> > > encoded as '+' in query strings (afaik for historic reasons).
> > > Therefore, languages which interpret the query string (like PHP)
> > > might get confused if they receive a %20 in the query string (or
> > > at least that was my concern).
> >
> > That sounds plausible, but I'm not sure.  Anyone else?
> 
> Nah. Everyone just takes %hh and turns it into an octet. The special 
> handling takes place for the + sign only, on both sides.
> In fact, every script must be prepared to get a %20 from the client,
> as well.

You're right of course, and it's also so easy to implement
that it's unlikely to have been messed up in real life.

As I see it, we have two variants of the fix:
  (1) Günther Gsenger's patch, with + for space.
  (2) Use ap_escape_path_segment, with %20 for space.

My (slight) inclination is to apply (2), but if anyone has strong
views, I'm happy with (1).  If noone cares enough to reply here,
I'll go ahead and apply (2).

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: svn commit: r573831

2007-10-22 Thread André Malo
* Nick Kew wrote:

> > The main difference is this escaping of ' ' to '+'. The reason for
> > this is that while ' ' gets encoded as %20 in paths, it gets encoded
> > as '+' in query strings (afaik for historic reasons). Therefore,
> > languages which interpret the query string (like PHP) might get
> > confused if they receive a %20 in the query string (or at least that
> > was my concern).
>
> That sounds plausible, but I'm not sure.  Anyone else?

Nah. Everyone just takes %hh and turns it into an octet. The special 
handling takes place for the + sign only, on both sides.
In fact, every script must be prepared to get a %20 from the client, as 
well.

nd
-- 
> Rätselnd, was ein Anthroposoph mit Unterwerfung zu tun hat...

[...] Dieses Wort gibt so viele Stellen für einen Spelling Flame her, und
Du gönnst einem keine einzige.-- Jean Claude und David Kastrup in dtl


Re: svn commit: r573831

2007-10-22 Thread William A. Rowe, Jr.

Nick Kew wrote:

Günther Gsenger <[EMAIL PROTECTED]> wrote:

André Malo:
This spreads another uri escaper copy around. Why can't we take 
ap_escape_uri? Without deep digging: what's the difference?

Also I don't like the ' ' => '+' transition, which should not be
applied  
forpaths. It's safer to translate that always to %20, I guess.

The main difference is this escaping of ' ' to '+'. The reason for
this is that while ' ' gets encoded as %20 in paths, it gets encoded
as '+' in query strings (afaik for historic reasons). Therefore,
languages which interpret the query string (like PHP) might get
confused if they receive a %20 in the query string (or at least that
was my concern).


That sounds plausible, but I'm not sure.  Anyone else?


I strongly expect every CGI consumer expects to see '+' where appropriate,
and changing this will alter a vast number of CGI scripts.  Not all of them
broken, of course, but some will be.

You can't touch QUERY_ARGS escaping, as it's well defined.  If the patch
does this it needs refinement.

Bill


Re: svn commit: r573831

2007-10-22 Thread Nick Kew
On Mon, 22 Oct 2007 21:03:25 +0200
Günther Gsenger <[EMAIL PROTECTED]> wrote:

> Hi,

Hi, thanks for revisiting this!

> Ruediger Pluem:
> > Does it make sense to duplicate code? Shouldn't this be placed in
> > util.c?
> It most likely should. I placed it there because I thought the patch
> would get a higher acceptance if it just touched mod_rewrite.

I think perhaps Rudiger's point is that the patch essentially
duplicates ap_escape_path_segment, except for ' ' --> '+'.

> André Malo:
> > This spreads another uri escaper copy around. Why can't we take 
> > ap_escape_uri? Without deep digging: what's the difference?
> >Also I don't like the ' ' => '+' transition, which should not be
> >applied  
> > forpaths. It's safer to translate that always to %20, I guess.
> The main difference is this escaping of ' ' to '+'. The reason for
> this is that while ' ' gets encoded as %20 in paths, it gets encoded
> as '+' in query strings (afaik for historic reasons). Therefore,
> languages which interpret the query string (like PHP) might get
> confused if they receive a %20 in the query string (or at least that
> was my concern).

That sounds plausible, but I'm not sure.  Anyone else?

> I've created another patch, hopefully you'll like it better.

Looks good, but it might be nice to avoid introducing another
new escape function.  If space vs + is the issue, perhaps we
could fix it by making the third arg to ap_os_escape_path an
enum, with an option to escape space --> +.

Just thinking out loud.  I won't apply anything yet, but please
bug me if it's still languishing here by the weekend.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


svn commit: r573831

2007-10-22 Thread Günther Gsenger

Hi,

sorry I can't reply directly to this thread  
http://mail-archives.apache.org/mod_mbox/httpd-dev/200709.mbox/[EMAIL PROTECTED]  
- i only (re)subscribed to this list today after I found the thread.



+/*
+ * Escapes a uri in a similar way as php's urlencode does.
+ * Based on ap_os_escape_path in server/util.c
+ */
+static char *escape_uri(apr_pool_t *p, const char *path) {
+char *copy = apr_palloc(p, 3 * strlen(path) + 3);
+const unsigned char *s = (const unsigned char *)path;
+unsigned char *d = (unsigned char *)copy;
+unsigned c;
+
+while ((c = *s)) {
+if (apr_isalnum(c) || c == '_') {
+*d++ = c;
+}
+else if (c == ' ') {
+*d++ = '+';
+}
+else {
+d = c2x(c, '%', d);
+}
+++s;
+}
+*d = '\0';
+return copy;
+}
+

Ruediger Pluem:

Does it make sense to duplicate code? Shouldn't this be placed in util.c?
It most likely should. I placed it there because I thought the patch would  
get a higher acceptance if it just touched mod_rewrite.



+/* escape the backreference */
+char *tmp2, *tmp;
+tmp = apr_palloc(pool, span + 1);
+strncpy(tmp, bri->source + bri->regmatch[n].rm_so,  
span);

Ruediger Pluem:

How about using apr_pstrndup instead?

Yes, I was not aware of that function.


André Malo:
This spreads another uri escaper copy around. Why can't we take 
ap_escape_uri? Without deep digging: what's the difference?
Also I don't like the ' ' => '+' transition, which should not be applied  
forpaths. It's safer to translate that always to %20, I guess.
The main difference is this escaping of ' ' to '+'. The reason for this is  
that while ' ' gets encoded as %20 in paths, it gets encoded as '+' in  
query strings (afaik for historic reasons). Therefore, languages which  
interpret the query string (like PHP) might get confused if they receive a  
%20 in the query string (or at least that was my concern).



André Malo:
By the way, I'm wondering why nobody picked up the suggested use of the 
escape rewrite map (or I overread it).
I don't know if that works but I developed the patch because it can be  
used in a directory-context and the RewriteMap can't.


I've created another patch, hopefully you'll like it better.

Regards,
Guenther

rewrite.diff
Description: Binary data


Re: svn commit: r573831 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_rewrite.xml modules/mappers/mod_rewrite.c

2007-09-11 Thread Nick Kew
On Tue, 11 Sep 2007 18:10:57 +0200
André Malo <[EMAIL PROTECTED]> wrote:

> > A further improvement, round tuits permitting, would indeed be
> > to look deeper, and eliminate any duplication.
> 
> It should be done before considering backport, IMHO.

Yep, guess so.  I think I just put a proposal in STATUS (just
transcribing recent trunk/CHANGES entries), but I can withdraw
that pending further thought.

> By the way, I'm wondering why nobody picked up the suggested use of
> the escape rewrite map (or I overread it). 
> (http://issues.apache.org/bugzilla/show_bug.cgi?id=34602#c16)

Can't speak for others, but I simply missed it.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: svn commit: r573831 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_rewrite.xml modules/mappers/mod_rewrite.c

2007-09-11 Thread André Malo
* Nick Kew wrote: 


> On Tue, 11 Sep 2007 15:11:35 +0200
>
> André Malo <[EMAIL PROTECTED]> wrote:
> > * [EMAIL PROTECTED] wrote:
> > > Author: niq
> > > Date: Sat Sep  8 05:46:10 2007
> > > New Revision: 573831
> > >
> > > URL: http://svn.apache.org/viewvc?rev=573831&view=rev
> > > Log:
> > > Add option to escape backreferences in RewriteRule.
> > > PR 34602  and  PR 39746
> > > Patch by Guenther Gsenger
> > >
> > > Modified:
> > > httpd/httpd/trunk/CHANGES
> > > httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml
> > > httpd/httpd/trunk/modules/mappers/mod_rewrite.c
> >
> > This spreads another uri escaper copy around. Why can't we take
> > ap_escape_uri? Without deep digging: what's the difference?
>
> As I said in my reply to Rüdiger, I just applied the patch from
> bugzilla, having ascertained that it looked sound and worked for
> cases identified in both the bug reports referenced.

Yep, sorry. Saw that reply too late. Just took a look at the diff and was 
disturbed by the cluttering ;)

Also I don't like the ' ' => '+' transition, which should not be applied for 
paths. It's safer to translate that always to %20, I guess.

> A further improvement, round tuits permitting, would indeed be
> to look deeper, and eliminate any duplication.

It should be done before considering backport, IMHO.

By the way, I'm wondering why nobody picked up the suggested use of the 
escape rewrite map (or I overread it). 
(http://issues.apache.org/bugzilla/show_bug.cgi?id=34602#c16)

nd


Re: svn commit: r573831 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_rewrite.xml modules/mappers/mod_rewrite.c

2007-09-11 Thread Nick Kew
On Tue, 11 Sep 2007 15:11:35 +0200
André Malo <[EMAIL PROTECTED]> wrote:

> * [EMAIL PROTECTED] wrote: 
> 
> 
> > Author: niq
> > Date: Sat Sep  8 05:46:10 2007
> > New Revision: 573831
> >
> > URL: http://svn.apache.org/viewvc?rev=573831&view=rev
> > Log:
> > Add option to escape backreferences in RewriteRule.
> > PR 34602  and  PR 39746
> > Patch by Guenther Gsenger
> >
> > Modified:
> > httpd/httpd/trunk/CHANGES
> > httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml
> > httpd/httpd/trunk/modules/mappers/mod_rewrite.c
> 
> This spreads another uri escaper copy around. Why can't we take 
> ap_escape_uri? Without deep digging: what's the difference?

As I said in my reply to Rüdiger, I just applied the patch from
bugzilla, having ascertained that it looked sound and worked for
cases identified in both the bug reports referenced.

A further improvement, round tuits permitting, would indeed be
to look deeper, and eliminate any duplication.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: svn commit: r573831 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_rewrite.xml modules/mappers/mod_rewrite.c

2007-09-11 Thread André Malo
* [EMAIL PROTECTED] wrote: 


> Author: niq
> Date: Sat Sep  8 05:46:10 2007
> New Revision: 573831
>
> URL: http://svn.apache.org/viewvc?rev=573831&view=rev
> Log:
> Add option to escape backreferences in RewriteRule.
> PR 34602  and  PR 39746
> Patch by Guenther Gsenger
>
> Modified:
> httpd/httpd/trunk/CHANGES
> httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml
> httpd/httpd/trunk/modules/mappers/mod_rewrite.c

This spreads another uri escaper copy around. Why can't we take 
ap_escape_uri? Without deep digging: what's the difference?

nd


Re: svn commit: r573831 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_rewrite.xml modules/mappers/mod_rewrite.c

2007-09-09 Thread Nick Kew
On Sun, 09 Sep 2007 11:41:53 +0200
Ruediger Pluem <[EMAIL PROTECTED]> wrote:

> 
> 
> On 09/08/2007 02:46 PM, wrote:
> > Author: niq
> > Date: Sat Sep  8 05:46:10 2007
> > New Revision: 573831
> > 
> > URL: http://svn.apache.org/viewvc?rev=573831&view=rev
> > Log:
> > Add option to escape backreferences in RewriteRule.
> > PR 34602  and  PR 39746
> > Patch by Guenther Gsenger

The patch is in bugzilla.  I applied it without modification
because:
  * It fixes both the bugs listed.
  * The code looks good.
I'm sure it could benefit from further refactoring, but I didn't
want to spend more time on this than necessary.

> I am a little bit unsure if this can have security implications in
> some cases.

I'd like to see an example of how it might affect security.

> Does it make sense to duplicate code? Shouldn't this be placed in
> util.c?

Very likely.  But that escalates it from a bugfix to an API change.

> How about using apr_pstrndup instead?

Indeed.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/


Re: svn commit: r573831 - in /httpd/httpd/trunk: CHANGES docs/manual/mod/mod_rewrite.xml modules/mappers/mod_rewrite.c

2007-09-09 Thread Ruediger Pluem


On 09/08/2007 02:46 PM, wrote:
> Author: niq
> Date: Sat Sep  8 05:46:10 2007
> New Revision: 573831
> 
> URL: http://svn.apache.org/viewvc?rev=573831&view=rev
> Log:
> Add option to escape backreferences in RewriteRule.
> PR 34602  and  PR 39746
> Patch by Guenther Gsenger
> 
> Modified:
> httpd/httpd/trunk/CHANGES
> httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml
> httpd/httpd/trunk/modules/mappers/mod_rewrite.c
> 

> 
> Modified: httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml
> URL: 
> http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml?rev=573831&r1=573830&r2=573831&view=diff
> ==
> --- httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml (original)
> +++ httpd/httpd/trunk/docs/manual/mod/mod_rewrite.xml Sat Sep  8 05:46:10 2007
> @@ -1243,6 +1243,21 @@
>brackets, of any of the following flags: 
>  
>
> +'B' (escape backreferences)
> +Apache has to unescape URLs before mapping them,
> +so backreferences will be unescaped at the time they are applied.
> +Using the B flag, non-alphanumeric characters in backreferences
> +will be escaped.  For example, consider the rule:
> + RewriteRule RewriteRule ^(.*)$   index.php?show=$1 
> 
> +This will map /C++ to 
> index.php?show=C++.
> +But it will also map /C%2b%2b to
> +index.php?show=C++, because the %2b
> +has been unescaped.  With the B flag, it will instead map to
> +index.php?show=>/C%2b%2b.
> +This escaping is particularly necessary in a proxy situation,
> +when the backend may break if presented with an unescaped URL.
> +
> +

I am a little bit unsure if this can have security implications in some cases.

>
> 
> Modified: httpd/httpd/trunk/modules/mappers/mod_rewrite.c
> URL: 
> http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/mappers/mod_rewrite.c?rev=573831&r1=573830&r2=573831&view=diff
> ==
> --- httpd/httpd/trunk/modules/mappers/mod_rewrite.c (original)
> +++ httpd/httpd/trunk/modules/mappers/mod_rewrite.c Sat Sep  8 05:46:10 2007

> @@ -635,6 +637,46 @@
>  return 0;
>  }
>  
> +static const char c2x_table[] = "0123456789abcdef";
> +
> +static APR_INLINE unsigned char *c2x(unsigned what, unsigned char prefix,
> + unsigned char *where)
> +{
> +#if APR_CHARSET_EBCDIC
> +what = apr_xlate_conv_byte(ap_hdrs_to_ascii, (unsigned char)what);
> +#endif /*APR_CHARSET_EBCDIC*/
> +*where++ = prefix;
> +*where++ = c2x_table[what >> 4];
> +*where++ = c2x_table[what & 0xf];
> +return where;
> +}
> +
> +/*
> + * Escapes a uri in a similar way as php's urlencode does.
> + * Based on ap_os_escape_path in server/util.c
> + */
> +static char *escape_uri(apr_pool_t *p, const char *path) {
> +char *copy = apr_palloc(p, 3 * strlen(path) + 3);
> +const unsigned char *s = (const unsigned char *)path;
> +unsigned char *d = (unsigned char *)copy;
> +unsigned c;
> +
> +while ((c = *s)) {
> +if (apr_isalnum(c) || c == '_') {
> +*d++ = c;
> +}
> +else if (c == ' ') {
> +*d++ = '+';
> +}
> +else {
> +d = c2x(c, '%', d);
> +}
> +++s;
> +}
> +*d = '\0';
> +return copy;
> +}
> +

Does it make sense to duplicate code? Shouldn't this be placed in util.c?

>  /*
>   * escape absolute uri, which may or may not be path oriented.
>   * So let's handle them differently.

> @@ -2322,9 +2364,23 @@
>  if (bri->source && n < AP_MAX_REG_MATCH
>  && bri->regmatch[n].rm_eo > bri->regmatch[n].rm_so) {
>  span = bri->regmatch[n].rm_eo - bri->regmatch[n].rm_so;
> -
> -current->len = span;
> -current->string = bri->source + bri->regmatch[n].rm_so;
> +if (entry && (entry->flags & RULEFLAG_ESCAPEBACKREF)) {
> +/* escape the backreference */
> +char *tmp2, *tmp;
> +tmp = apr_palloc(pool, span + 1);
> +strncpy(tmp, bri->source + bri->regmatch[n].rm_so, span);

How about using apr_pstrndup instead?

> +tmp[span] = '\0';
> +tmp2 = escape_uri(pool, tmp);
> +rewritelog((ctx->r, 5, ctx->perdir, "escaping 
> backreference '%s' to '%s'",
> +tmp, tmp2));
> +
> +current->len = span = strlen(tmp2);
> +current->string = tmp2;
> +} else {
> +current->len = span;
> +current->string = bri->source + bri->regmatch[n].rm_so;
> +}
> +
>  outlen += span;
>  }
>  

Regards

Rüdiger