RE: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-20 Thread GOMEZ Henri

What's the status of this one, about a week later?

I recall the discussions some months ago about replacing
the previous uri with unparsed_uri.

Do we have a way to determine that the uri came from
mod_rewrite and not from the client (via the notes)?
In that case, what about using r->uri instead of r->unparsed_uri?

-
Henri Gomez
EMAIL : [EMAIL PROTECTED]
PGP KEY : 697ECEDD
PGP Fingerprint : 9DF8 1EA8 ED53 2F39 DC9B 904A 364F 80E6









Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread David Rees

On Tue, Aug 14, 2001 at 11:49:43PM -0400, Keith Wannamaker wrote:
 Try ap_escape_uri

That does the trick.

Here's the patch which gets things working again, thanks for all the help. 
Hopefully this will get applied soon.  Is there any 3.2.4 release planned to
fix the small number of bugs/problems in 3.2.3?  (I also recall bumping into
some issues with error documents and getting into infinite loops, which were
fixed.)

Thanks,
Dave

--- mod_jk.c.orig   Tue Jun 19 15:44:57 2001
+++ mod_jk.c        Tue Aug 14 22:42:32 2001
@@ -358,13 +358,12 @@
     s->method   = (char *)r->method;
     s->content_length = get_content_length(r);
     s->query_string = r->args;
-    s->req_uri  = r->unparsed_uri;
-    if (s->req_uri != NULL) {
-        char *query_str = strchr(s->req_uri, '?');
-        if (query_str != NULL) {
-            *query_str = 0;
-        }
-    }
+    /*
+     * The 2.2 servlet spec errata says the uri from
+     * HttpServletRequest.getRequestURI() should remain encoded.
+     * [http://java.sun.com/products/servlet/errata_042700.html]
+     */
+    s->req_uri = ap_escape_uri(r->pool, r->uri);
 
     s->is_ssl   = JK_FALSE;
     s->ssl_cert = NULL;






RE: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread Keith Wannamaker

I am concerned that the loss of original escaping
will break somebody.  For instance:

r->unparsed_uri       = fe%3afi%40fo%3ffum
r->uri                = fe:fi@fo?fum
ap_escape_uri(r->uri) = fe:fi@fo%3ffum

Magically authentication information appears in
my request to an oddly-named server.
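
The round trip above can be reproduced outside httpd with a small standalone sketch. This is not mod_jk or Apache code: demo_unescape and demo_escape_path are hypothetical helpers, and the set of characters left unescaped is an assumption chosen to match the example, not Apache's real escape table.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not hex. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode %XX sequences. Returns a malloc'd string the caller frees. */
static char *demo_unescape(const char *in)
{
    char *out = malloc(strlen(in) + 1);
    char *w = out;
    if (out == NULL) return NULL;
    while (*in) {
        int hi = -1, lo = -1;
        if (in[0] == '%'
            && (hi = hex_val((unsigned char)in[1])) >= 0
            && (lo = hex_val((unsigned char)in[2])) >= 0) {
            *w++ = (char)(hi * 16 + lo);
            in += 3;
        } else {
            *w++ = *in++;
        }
    }
    *w = '\0';
    return out;
}

/* Simplified stand-in for a path escaper (assumed character set):
 * keeps alphanumerics and some path punctuation, encodes the rest. */
static char *demo_escape_path(const char *in)
{
    static const char hex[] = "0123456789abcdef";
    char *out = malloc(3 * strlen(in) + 1);
    char *w = out;
    if (out == NULL) return NULL;
    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || strchr("/:@-._~$&+,=!*'()", c) != NULL) {
            *w++ = (char)c;
        } else {
            *w++ = '%';
            *w++ = hex[c >> 4];
            *w++ = hex[c & 15];
        }
    }
    *w = '\0';
    return out;
}
```

Feeding fe%3afi%40fo%3ffum through demo_unescape gives fe:fi@fo?fum, and demo_escape_path on that result gives fe:fi@fo%3ffum, so the ':' and '@' come back unescaped and the original string is not recovered.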

Maybe the solution is to choose one of the three
at runtime by a mod_jk config option?

Keith





Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread Justin Erenkrantz

On Wed, Aug 15, 2001 at 08:56:45AM -0400, Keith Wannamaker wrote:
 I am concerned that the loss of original escaping
 will break somebody.  For instance:

As Costin pointed out, the escaping of a URI does not change its
semantics - they should be treated as identical by anyone who follows
the URI spec.  Escaping where it wasn't escaped *shouldn't* break 
anyone.  

And, the whole question is what does Tomcat see the request as?  I 
could make a case that it should never know about the unparsed_uri, 
but only the uri that httpd finally resolved to and that mod_jk 
picked up.  -- justin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread cmanolache

On Wed, 15 Aug 2001, Justin Erenkrantz wrote:

 On Wed, Aug 15, 2001 at 08:56:45AM -0400, Keith Wannamaker wrote:
  I am concerned that the loss of original escaping
  will break somebody.  For instance:

 As Costin pointed out, the escaping of a URI does not change its
 semantics - they should be treated as identical by anyone who follows
 the URI spec.  Escaping where it wasn't escaped *shouldn't* break
 anyone.

 And, the whole question is what does Tomcat see the request as?  I
 could make a case that it should never know about the unparsed_uri,
 but only the uri that httpd finally resolved to and that mod_jk
 picked up.  -- justin

I guess the only choice we can make is whether Apache is part of the servlet
container ( and must follow its rules ) or not. If it is, then mod_rewrite
( and half of the modules ) just can't be used - they alter the request in
a way that's not allowed by the spec. Apache can only forward requests to
tomcat, and if it's lucky serve static files ( for apps not using filters
or strange mappings ). It can't authenticate ( since the auth model
doesn't follow the role based rules ), can't filter ( since Apache2.0
filters are very different from 2.3 filters ).
But the bright side - our life is much simpler, we don't have to worry.


If we treat apache as a web server that cooperates with tomcat but can
do at least what a proxy is allowed to do by the HTTP spec ( i.e. alter
the request, etc ) - then we are fine ( except that life is interesting
again, and there is a lot of work to do, including this fix ).


Costin






Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread Bill Barker

Personally, I agree with Justin and Costin that mod_jk should be able to use
the uri field.

Having said that, I'd like to point out that the mod_jk.c in j-t-c is
flat-out broken.  It doesn't handle the case where the '?' itself is
encoded.  Since this case is part of a currently popular attack on IIS, it
will show up.






Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread cmanolache

On Wed, 15 Aug 2001, Bill Barker wrote:

 Personally, I agree with Justin and Costin that mod_jk should be able to use
 the uri field.

 Having said that, I'd like to point out that the mod_jk.c in j-t-c is
 flat-out broken.  It doesn't handle the case where the '?' itself is
 encoded.  Since this case is part of a currently popular attack on IIS, it
 will show up.

Interesting finding. However, the tomcat decoder should be able to handle
that - if it doesn't, we must fix it. Can you check against 3.3beta1 ?

As a note, IMHO it is perfectly legal to have an encoded '?' in the URI,
and the behavior should be: the '?' will be decoded _after_ the URI is
separated from query string, and it's used as part of the file name.
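
A minimal sketch of that ordering, in plain C rather than actual tomcat or mod_jk code: split on the first literal '?' and only then decode the path. The names split_uri, split_then_decode and decode_in_place are made up for illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for the sketch. */
struct split_uri {
    char *path;   /* decoded path, malloc'd */
    char *query;  /* raw query string, malloc'd, or NULL if absent */
};

static int hex_val(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode %XX sequences in place; the result is never longer than the input. */
static void decode_in_place(char *s)
{
    char *w = s;
    while (*s) {
        int hi = -1, lo = -1;
        if (s[0] == '%'
            && (hi = hex_val((unsigned char)s[1])) >= 0
            && (lo = hex_val((unsigned char)s[2])) >= 0) {
            *w++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *w++ = *s++;
        }
    }
    *w = '\0';
}

/* Split at the first literal '?', then decode the path exactly once.
 * Returns 0 on success, -1 on allocation failure. */
static int split_then_decode(const char *raw, struct split_uri *out)
{
    const char *q = strchr(raw, '?');
    size_t path_len = q ? (size_t)(q - raw) : strlen(raw);

    out->query = NULL;
    out->path = malloc(path_len + 1);
    if (out->path == NULL) return -1;
    memcpy(out->path, raw, path_len);
    out->path[path_len] = '\0';

    if (q != NULL) {
        out->query = malloc(strlen(q + 1) + 1);
        if (out->query == NULL) {
            free(out->path);
            out->path = NULL;
            return -1;
        }
        strcpy(out->query, q + 1);
    }
    decode_in_place(out->path);   /* decoding happens after the split */
    return 0;
}
```

With this ordering, /%3f%41%3d%42.jsp yields the path /?A=B.jsp with no query string, while the unencoded /?A=B.jsp yields the path / with the query A=B.jsp. Decoding before splitting would make the two requests indistinguishable.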

AFAIK there is no reason a file ( or pathInfo ) can't have the '?' char
inside, and the URI spec allows that.

( of course, paranoia may force us to remove this kind of behavior ).

Costin






Fw: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread Bill Barker


- Original Message -
From: Bill Barker [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, August 15, 2001 12:15 PM
Subject: Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix


 It is actually worse than that.  TC3.3B1 (with the mod_jk that it ships
 with, I haven't tried j-t-c yet) gives a directory listing in response to:
 http://myserver/%3f%41%3d%42.jsp





Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread David Rees

On Wed, Aug 15, 2001 at 08:58:00AM -0700, [EMAIL PROTECTED] wrote:

  And, the whole question is what does Tomcat see the request as?  I
  could make a case that it should never know about the unparsed_uri,
  but only the uri that httpd finally resolved to and that mod_jk
  picked up.  -- justin
 
 If we treat apache as a web server, that cooperates with tomcat but can
 do at least what a proxy is allowed to do by the HTTP spec ( i.e. alter
 the request, etc ) - then we are fine ( except the life is interesting
 again, and a lot of work to do including this fix ).

This is the way I expect it to behave, but as Keith pointed out, it may be
useful to have this as a configuration option.
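
A rough sketch of what such a switch could look like, assuming the three candidates discussed in the thread. The enum, struct and function names are hypothetical; the thread does not name a directive and this is not existing mod_jk code.

```c
#include <stddef.h>

/* Hypothetical setting: which form of the URI gets forwarded. */
enum uri_source {
    URI_FROM_UNPARSED,   /* r->unparsed_uri minus the query string */
    URI_FROM_DECODED,    /* r->uri as httpd resolved it */
    URI_FROM_REESCAPED   /* ap_escape_uri(r->pool, r->uri) */
};

/* The three candidate strings, computed elsewhere by the module. */
struct uri_candidates {
    const char *unparsed_path;
    const char *decoded;
    const char *reescaped;
};

/* Pick the candidate selected by the (hypothetical) config option. */
static const char *pick_req_uri(enum uri_source mode,
                                const struct uri_candidates *c)
{
    switch (mode) {
    case URI_FROM_UNPARSED:  return c->unparsed_path;
    case URI_FROM_DECODED:   return c->decoded;
    case URI_FROM_REESCAPED: return c->reescaped;
    }
    return NULL;  /* unknown mode */
}
```

The point of keeping all three behind one option is that sites relying on mod_rewrite and sites relying on the exact client encoding could each get the behavior they expect.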

-Dave



Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread cmanolache

On Wed, 15 Aug 2001, Bill Barker wrote:

 It is actually worse than that.  TC3.3B1 (with the mod_jk that it ships
 with, I haven't tried j-t-c yet) gives a directory listing in response to:
 http://myserver/%3f%41%3d%42.jsp

If I translate this correctly, your request is
  http://myserver/?A=B.jsp

This is treated as a request for /, with parameters ( that are
ignored since it's a static page ). Hmm, it should return a redirect or
index.html if it exists.

Tomcat standalone is ok, it responds 404 ( and it does so because it
correctly does a single decoding _after_ separating the URI into components,
as required by the URI spec ).

For mod_jk, it's a bit tricky. I assume you configured apache to handle
the static requests ?

Can you make sure you have an index.html page ? If you see a dir listing,
can you tell me who's generating it ( tomcat adds the version number at
the bottom )

Thanks,
Costin










Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread cmanolache

Apache2.0 + mod_jk + JNI + tc3.3 gives me the correct answer,
404 ( with the correct URI - /?A=B.jsp ). Note that typing
the unencoded version is returning the correct answer too, i.e.
index.html.

What version of apache are you using ?

Costin









Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread Bill Barker

Actually, I have an index.jsp file.

According to the logs (I haven't turned up the logging level yet, so the
information is minimal), I get:
Ctx() : Compiling: /?A=B.jsp to _0003fA_0003dB_0
The corresponding .java file just prints static HTML with a
<base href=file://localhost/path/to/ROOT/><h1>/path/to/ROOT</h1>

followed by lines like:
<img align=middle src=doc:/lib/images/ftp/file.gif width=32 height=32><a
href=index.jsp>index.jsp</a><br>






Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-15 Thread Bill Barker

1.3.17 (with negotiation_module removed to prevent that problem).







[TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread David Rees

Hi,

I came across the need to use mod_rewrite to rewrite some URLs I was sending
to Tomcat.

After playing with it a bit (I had it working a while ago) and finding that
Tomcat was not receiving the rewritten URLs no matter what I did, I took a
look at the source to native/apache1.3/mod_jk.c.  Not being much of an
Apache hacker, the variables were descriptive enough to tell me to make this
change  to the file:

--- mod_jk.c.orig   Tue Aug 14 17:58:21 2001
+++ mod_jk.c        Tue Aug 14 18:04:58 2001
@@ -358,7 +358,7 @@
     s->method   = (char *)r->method;
     s->content_length = get_content_length(r);
     s->query_string = r->args;
-    s->req_uri  = r->unparsed_uri;
+    s->req_uri  = r->uri;
     if (s->req_uri != NULL) {
         char *query_str = strchr(s->req_uri, '?');
         if (query_str != NULL) {

After this change my URLs were getting rewritten as expected again.

Can we apply this change to the tree if there's nothing wrong with it for
the next release?  This problem has affected a large number of users, just
take a look at the tomcat-dev/user archives.

It seems that this change was made to satisfy the errata at
http://java.sun.com/products/servlet/errata_042700.html, but is it the
correct fix if we're intentionally munging the request?

Thanks,
Dave



RE: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Keith Wannamaker

Hi David,

Unfortunately there are people who were breaking because
we didn't follow the spec.  The better way to fix it is
to create an inverse function for 
ap_parse_uri(request_rec *r, const char *uri) [http_protocol.c]
in mod_jk... one that would 'unparse' the munged
r->uri rewrite and use it instead of r->unparsed_uri.

Keith




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Justin Erenkrantz

On Tue, Aug 14, 2001 at 06:13:24PM -0700, David Rees wrote:
 --- mod_jk.c.orig   Tue Aug 14 17:58:21 2001
 +++ mod_jk.c        Tue Aug 14 18:04:58 2001
 @@ -358,7 +358,7 @@
      s->method   = (char *)r->method;
      s->content_length = get_content_length(r);
      s->query_string = r->args;
 -    s->req_uri  = r->unparsed_uri;
 +    s->req_uri  = r->uri;
      if (s->req_uri != NULL) {
          char *query_str = strchr(s->req_uri, '?');
          if (query_str != NULL) {
 
 After this change my URLs were getting rewritten as expected again.
 
 Can we apply this change to the tree if there's nothing wrong with it for
 the next release?  This problem has affected a large number of users, just
 take a look at the tomcat-dev/user archives.

This breaks query strings.

r->uri contains only the path portion of the URL.  r->unparsed_uri
contains the URL in its virgin format - as sent by the client.

You can see that mod_jk is looking for the query string (look at the
strchr two lines down) - it won't be there in the r->uri.  You now
need to modify mod_jk to look at r->args.  

But, if you need access to the encoded URI (which is what the comment
above that line in the j-t-c version of mod_jk seems to indicate), the 
only way to do it in httpd is to do with unparsed_uri.  All of the 
other parameters (i.e. r->uri) have been unescaped already.  

I'm not sure what the solution is.  But, this one kills off query 
strings to servlets.  That's even worse than losing internal rewrite 
capabilities.

I wonder how Pier is addressing this in mod_webapp.  I'll have to 
look.  -- justin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread David Rees

On Tue, Aug 14, 2001 at 10:20:26PM -0400, Keith Wannamaker wrote:

 Unfortunately there are people who were breaking because
 we didn't follow the spec.  The better way to fix it is
 to create an inverse function for 
 ap_parse_uri(request_rec *r, const char *uri) [http_protocol.c]
 in mod_jk... one that would 'unparse' the munged
 r->uri rewrite and use it instead of r->unparsed_uri.

Hi,

OK, are you volunteering to write it?  ;-)  If not, I'll have to take a look
when I get some time and see if I can figure it out.

As an aside, it appears that Tomcat 3.3 remains broken in this regard, as it
uses r->uri instead of r->unparsed_uri.

-Dave



Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Justin Erenkrantz

On Tue, Aug 14, 2001 at 10:20:26PM -0400, Keith Wannamaker wrote:
 Hi David,
 
 Unfortunately there are people who were breaking because
 we didn't follow the spec.  The better way to fix it is
 to create an inverse function for 
 ap_parse_uri(request_rec *r, const char *uri) [http_protocol.c]
 in mod_jk... one that would 'unparse' the munged
 r->uri rewrite and use it instead of r->unparsed_uri.

You *could* just call ap_escape_uri and try to recreate the relevant
pieces.  Rough pseudocode:

t1 = ap_escape_uri(r->pool, r->uri)
t2 = ap_escape_uri(r->pool, r->args)
s->req_uri = ap_pstrcat(r->pool, t1, "?", t2, NULL)

The root problem is that r->unparsed_uri and r->uri may not be 
identical in content.  If you are using mod_rewrite, you could 
have:

r->unparsed_uri=/foo.jsp?bar=baz
r->uri=/spaz.jsp
r->args=bar=baz

But, now you may have escaped something that wasn't originally escaped.
That may be bad as well.  -- justin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Justin Erenkrantz

On Tue, Aug 14, 2001 at 07:25:32PM -0700, David Rees wrote:
 On Tue, Aug 14, 2001 at 10:20:26PM -0400, Keith Wannamaker wrote:
 
  Unfortunately there are people who were breaking because
  we didn't follow the spec.  The better way to fix it is
  to create an inverse function for 
  ap_parse_uri(request_rec *r, const char *uri) [http_protocol.c]
  in mod_jk... one that would 'unparse' the munged
  r->uri rewrite and use it instead of r->unparsed_uri.
 
 Hi,
 
 OK, are you volunteering to write it?  ;-)  If not, I'll have to take a look
 when I get some time and see if I can figure it out.
 
 As an aside, it appears that Tomcat 3.3 remains broken in this regard, as it
 uses r->uri instead of r->unparsed_uri.

My bad.  It is actually easier than I just said - s->req_uri isn't
the complete unparsed URI - just the path.

I didn't look high enough in mod_jk.c.  The version in j-t-c for 
apache-1.3 has:

s->query_string = r->args;

/*
 * The 2.2 servlet spec errata says the uri from
 * HttpServletRequest.getRequestURI() should remain encoded.
 * [http://java.sun.com/products/servlet/errata_042700.html]
 */
s->req_uri  = r->unparsed_uri;
if (s->req_uri != NULL) {
    char *query_str = strchr(s->req_uri, '?');
    if (query_str != NULL) {
        *query_str = 0;
    }
}

That strchr call is trying to remove the query string (dicking with
the unparsed_uri like that is a BAD idea - imagine logs looking at the
unparsed_uri).  
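
For illustration only, a standalone sketch of the non-destructive alternative: copy the part before the '?' instead of writing a NUL into the shared string. The helper name path_without_query is made up, and it uses malloc so it runs outside httpd; inside a module one would presumably allocate from the request pool instead.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of raw_uri up to (not including) the first '?'.
 * raw_uri itself is left untouched, so anything else that reads it later
 * (logging, for instance) still sees the full string. */
static char *path_without_query(const char *raw_uri)
{
    size_t len = strcspn(raw_uri, "?");   /* length of the prefix before '?' */
    char *copy = malloc(len + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, raw_uri, len);
    copy[len] = '\0';
    return copy;
}
```

Called on /foo.jsp?bar=baz this returns /foo.jsp and the input still reads /foo.jsp?bar=baz afterwards, which is exactly what the in-place strchr version does not guarantee.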

You could just have:

s->query_string = r->args;
/*
 * The 2.2 servlet spec errata says the uri from
 * HttpServletRequest.getRequestURI() should remain encoded.
 * [http://java.sun.com/products/servlet/errata_042700.html]
 */
s->req_uri  = ap_escape_uri(r->pool, r->uri);

That seems like it'd satisfy everyone.  -- justin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Pier P. Fumagalli

Justin Erenkrantz at [EMAIL PROTECTED] wrote:
 
 I wonder how Pier is addressing this in mod_webapp.  I'll have to
 look.  -- justin

Easy as 1.2.3... WARP has a concept of URI and QUERY STRING... Very separate
things... All I do is

req->ruri=apr_pstrdup(req->pool,r->uri);
req->args=apr_pstrdup(req->pool,r->args);

The URI goes into the URI, the query string goes into the query string...
Apache does it for me, why should I bother? :)

Pier




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Justin Erenkrantz

On Wed, Aug 15, 2001 at 03:41:30AM +0100, Pier P. Fumagalli wrote:
 Justin Erenkrantz at [EMAIL PROTECTED] wrote:
  
  I wonder how Pier is addressing this in mod_webapp.  I'll have to
  look.  -- justin
 
 Easy as 1.2.3... WARP has a concept of URI and QUERY STRING... Very separate
 things... All I do is
 
 req->ruri=apr_pstrdup(req->pool,r->uri);
 req->args=apr_pstrdup(req->pool,r->args);
 
 The URI goes into the URI, the query string goes into the query string...
 Apache does it for me, why should I bother? :)

Which, of course, is the right solution.  

But, do you have to (re)escape the uri (or, is that done in Java 
land?)?  Seems like the 2.2 spec says that the getRequestURI() 
function must return an escaped URI.  r->uri is unescaped.  Or, does 
2.3 say something different?  -- justin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Pier P. Fumagalli

Justin Erenkrantz at [EMAIL PROTECTED] wrote:

 On Wed, Aug 15, 2001 at 03:41:30AM +0100, Pier P. Fumagalli wrote:
 Justin Erenkrantz at [EMAIL PROTECTED] wrote:
 
 I wonder how Pier is addressing this in mod_webapp.  I'll have to
 look.  -- justin
 
 Easy as 1.2.3... WARP has a concept of URI and QUERY STRING... Very separate
 things... All I do is
 
 req->ruri=apr_pstrdup(req->pool,r->uri);
 req->args=apr_pstrdup(req->pool,r->args);
 
 The URI goes into the URI, the query string goes into the query string...
 Apache does it for me, why should I bother? :)
 
 Which, of course, is the right solution.

DOH! :) Am I lucky or what :) :) :)

 But, do you have to (re)escape the uri (or, is that done in Java
 land?)?  Seems like the 2.2 spec says that the getRequestURI()
 function must return an escaped URI.  r->uri is unescaped.  Or, does
 2.3 say something different?  -- justin

It's done in Java land (well, in theory! :) I should really check, that
might be one hit of performance improvement (like 1 millisecond per
request).

Ok, get over it Pier, performance is after the beta :)

Pier (love talking to himself -who, me?- at 4 AM :)




RE: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Keith Wannamaker

| This breaks query strings.
| 
| r->uri contains only the path portion of the URL.  r->unparsed_uri
| contains the URL in its virgin format - as sent by the client.

No, I don't believe this is quite right.
getRequestURI() in a servlet should return 
r->unparsed_uri minus a query string.

Setting s->req_uri = r->uri doesn't break
query strings.. but it *does* break the
encoding of the uri.

So tc 3.3 is currently broken as is mod_webapp
(unless the string is encoded on the java side
 in TC4).

However, Justin, I think your suggestion is
the correct solution:

s->req_uri = ap_encode_uri(r->pool, r->uri);

David, or anyone else interested too, would you
try this with some corner test cases and see if
it lives up to expectation?

Keith




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread cmanolache

On Tue, 14 Aug 2001, Justin Erenkrantz wrote:

 Which, of course, is the right solution.

Is it ? Re-escaping the URI will most likely generate something very
different from the original, it's not symmetrical. Getting a re-escaped
request is different from the original, unescaped uri. That's the reason
we use the unescaped uri...

Costin
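
A minimal sketch of the asymmetry described above, in plain C with no Apache dependencies. The names demo_unescape and demo_escape are hypothetical stand-ins, not the httpd functions, and the set of characters left unescaped here (alphanumerics plus "-_.~/") is an assumption chosen for illustration; the real server's rules may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %XX sequences; dst must hold at least strlen(src) + 1 bytes. */
static void demo_unescape(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    while (*src) {
        if (src[0] == '%' && hex_value(src[1]) >= 0 && hex_value(src[2]) >= 0) {
            *dst++ = (char)(hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}

/*
 * Escape every byte except alphanumerics and "-_.~/" (an assumed safe
 * set). dst must hold at least 3 * strlen(src) + 1 bytes.
 */
static void demo_escape(const char *src, char *dst)
{
    static const char hex[] = "0123456789ABCDEF";

    assert(src != NULL && dst != NULL);
    while (*src) {
        unsigned char c = (unsigned char)*src++;
        if (isalnum(c) || strchr("-_.~/", c) != NULL) {
            *dst++ = (char)c;
        } else {
            *dst++ = '%';
            *dst++ = hex[c >> 4];
            *dst++ = hex[c & 15];
        }
    }
    *dst = '\0';
}
```

Under these assumptions, unescaping "/%3f%41%3d%42.jsp" gives "/?A=B.jsp", and escaping that again gives "/%3FA%3DB.jsp": the same octets, but not the string the client sent.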




RE: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Keith Wannamaker

Costin's right.. seems like the problem encountered
was that there was no way to recreate the encoding 
(or lack thereof) on the original uri.  So the
kludge/solution was to use the unparsed uri and 
chop off the query string.

Keith

| -Original Message-
| From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
| Sent: Tuesday, August 14, 2001 11:13 PM
| To: [EMAIL PROTECTED]
| Subject: Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix
| 
| 
| On Tue, 14 Aug 2001, Justin Erenkrantz wrote:
| 
|  Which, of course, is the right solution.
| 
| Is it ? Re-escaping the URI will most likely generate something very
| different from the original, it's not symmetrical. Getting a re-escaped
| request is different from the original, unescaped uri. That's the reason
| we use the unescaped uri...
| 
| Costin
| 



Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread cmanolache

 You could just have:

 s->query_string = r->args;
 /*
  * The 2.2 servlet spec errata says the uri from
  * HttpServletRequest.getRequestURI() should remain encoded.
  * [http://java.sun.com/products/servlet/errata_042700.html]
  */
 s->req_uri  = ap_encode_uri(r->pool, r->uri);


Sounds like a reasonable solution.

Costin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Justin Erenkrantz

On Tue, Aug 14, 2001 at 08:12:31PM -0700, [EMAIL PROTECTED] wrote:
 On Tue, 14 Aug 2001, Justin Erenkrantz wrote:
 
  Which, of course, is the right solution.
 
 Is it ? Re-escaping the URI will most likely generate something very
 different from the original, it's not symmetrical. Getting a re-escaped
 request is different from the original, unescaped uri. That's the reason
 we use the unescaped uri...

Potentially, you are correct.  It may not be symmetrical.  However, 
httpd may jump in and rewrite the uri for you.  If that is a problem
(which is what the original poster was complaining about), then you
need to use r->uri instead and escape it.  Unless you want to only
pass the original string NOT what the server is serving.

I'm not sure what the Servlet spec says - use the original string 
that the client passed in, or use the real URI.  I'm out of my 
depth here.  *shrug*  -- justin




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Justin Erenkrantz

On Tue, Aug 14, 2001 at 11:13:34PM -0400, Keith Wannamaker wrote:
 Costin's right.. seems like the problem encountered
 was that there was no way to recreate the encoding 
 (or lack thereof) on the original uri.  So the
 kludge/solution was to use the unparsed uri and 
 chop off the query string.

mod_jk chops off the r->unparsed_uri itself without copying.  Negative
points for style.  =-)  -- justin
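
A rough sketch of the copy-then-chop alternative, assuming plain malloc rather than an Apache pool so that it stands alone. The function name demo_path_from_unparsed is hypothetical and this is not the mod_jk code; a real module would more likely allocate from the request pool, as the apr_pstrdup calls quoted in this thread do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of unparsed_uri, cut at the first
 * literal '?'. The input string is left untouched. The caller frees
 * the result. Returns NULL if the allocation fails.
 */
static char *demo_path_from_unparsed(const char *unparsed_uri)
{
    size_t len;
    char *copy;

    assert(unparsed_uri != NULL);

    /* Length of the part before the first '?', or the whole string. */
    len = strcspn(unparsed_uri, "?");

    copy = (char *)malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, unparsed_uri, len);
    copy[len] = '\0';
    return copy;
}
```

Because only a literal '?' ends the path, an encoded one stays part of it: "/%3f%41%3d%42.jsp" comes back unchanged, while "/a%20b/x.jsp?A=B" yields "/a%20b/x.jsp" with its original escaping intact.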




Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Craig R. McClanahan



On Tue, 14 Aug 2001, Justin Erenkrantz wrote:

 On Wed, Aug 15, 2001 at 03:41:30AM +0100, Pier P. Fumagalli wrote:
  Justin Erenkrantz at [EMAIL PROTECTED] wrote:
   
   I wonder how Pier is addressing this in mod_webapp.  I'll have to
   look.  -- justin
  
  Easy as 1.2.3... WARP has a concept of URI and QUERY STRING... Very separate
  things... All I do is
  
  req->ruri=apr_pstrdup(req->pool,r->uri);
  req->args=apr_pstrdup(req->pool,r->args);
  
  The URI goes into the URI, the query string goes into the query string...
  Apache does it for me, why should I bother? :)
 
 Which, of course, is the right solution.  
 
 But, do you have to (re)escape the uri (or, is that done in Java 
 land?)?  Seems like the 2.2 spec says that the getRequestURI() 
 function must return an escaped URI.  r->uri is unescaped.  Or, does 
 2.3 say something different?  -- justin
 
 

The getRequestURI() method is supposed to return the *undecoded* request
URI.  As Costin points out, re-escaping an escaped version is not the same
thing. This didn't change in 2.3 -- however, in 2.2 it wasn't formally
documented until an errata was published:

  http://java.sun.com/products/servlet/errata_042700.html

Same thing for getQueryString() -- must remain undecoded.

Craig





Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread David Rees

On Tue, Aug 14, 2001 at 11:05:38PM -0400, Keith Wannamaker wrote:
 | This breaks query strings.
 | 
 | r->uri contains only the path portion of the URL.  r->unparsed_uri
 | contains the URL in its virgin format - as sent by the client.
 
 No, I don't believe this is quite right.
 getRequestURI() in a servlet should return 
 r->unparsed_uri minus a query string.
 
 Setting s->uri = r->uri doesn't break
 query strings.. but it *does* break the
 encoding of the uri.
 
 So tc 3.3 is currently broken as is mod_webapp
 (unless the string is encoded on the java side
  in TC4).
 
 However, Justin, I think your suggestion is
 the correct solution:
 
 s->req_uri = ap_encode_uri(r->pool, r->uri);
 
 David, or anyone else interested too, would you
 try this with some corner test cases and see if
 it lives up to expectation?

I gave it a shot and it compiled fine, but got this error at runtime:

Cannot load /usr/local/apache/libexec/mod_jk.so into server:
/usr/local/apache/libexec/mod_jk.so: undefined symbol: ap_encode_uri

Any hints?  I'm new at Apache module hacking.

-Dave



RE: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread Keith Wannamaker

Try ap_escape_uri

Keith

|  
|  s->req_uri = ap_encode_uri(r->pool, r->uri);
|  
|  David, or anyone else interested too, would you
|  try this with some corner test cases and see if
|  it lives up to expectation?
| 
| I gave it a shot and it compiled fine, but got this error at runtime:
| 
| Cannot load /usr/local/apache/libexec/mod_jk.so into server:
| /usr/local/apache/libexec/mod_jk.so: undefined symbol: ap_encode_uri
| 
| Any hints?  I'm new at Apache module hacking.
| 
| -Dave



Re: [TC3.2.3][PATCH] mod_jk / mod_rewrite bug fix

2001-08-14 Thread cmanolache

On Tue, 14 Aug 2001, Justin Erenkrantz wrote:

 mod_jk chops off the r->unparsed_uri itself without copying.  Negative
 points for style.  =-)  -- justin

That's true. However, I'm not sure what else we could do - copy it once
again to another buffer where we chop it ? There's not very much going on
with the unparsed uri.

If you strictly follow the spec, mod_rewrite is out of the question - and
the same goes for most other apache modules that alter the request.
Since all of them are working on the URI, the result is just something
that has no unmodified original.

However, if you read the URI spec, 2 URIs are equivalent if the octets are
identical - it doesn't matter how you encode it. Re-escaping the URI has
the extra benefit of getting a canonical escaping, which is also a bit
safer ( hey, we also get the first class security checks apache is doing
on the parsed uris ).

Another note - my understanding of the HTTP specification is that proxies
_are_ allowed to escape/unescape the URI - as long as the result is
equivalent. So if a proxy is used, the original URI the user typed
will be lost. Same for the browsers - what the user types is very
different from what is sent ( at least in Opera ).

Of course, we can define unparsed URI to be whatever the servlet
container receives. This may be different from the original request ( if
it goes through proxies ).

Now the question is - where does the container start :=). I think there
are plenty of reasons to treat Apache as not being part of the
container - after all it follows completely different rules on mappings
( extension maps can have path info), and in almost everything.

In fact, I'm not sure all web servers even allow access to the original
unescaped URI. Some IIS or NES expert should let us know.

So my take is that the container should indeed return the original URI -
the one that the container received. What apache does ( like rewriting, or
canonicalising the URI ) is separate.

Otherwise - the rewriting itself would violate the servlet spec, since it
would alter the URI.

Again - I would bet that at least one of IIS and NES doesn't allow access
to original URI anyway.

Costin

P.S. Quite a long mail for something as simple as 1-2-3. I spent quite a
lot of time on this issue - Larry may remember how long the bug was
open with my name on it.
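
The equivalence argument above (two URIs are the same if their octets match, however they are escaped) can be sketched as a comparison that decodes on the fly. This is plain C with hypothetical names (hex_digit, next_octet, demo_same_octets), not anything from httpd or mod_jk, and it takes the octet-equality reading of the URI spec as stated in the mail rather than as settled fact.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Read the next octet of an escaped string and advance *s past it. */
static int next_octet(const char **s)
{
    const char *p = *s;

    if (p[0] == '%' && hex_digit(p[1]) >= 0 && hex_digit(p[2]) >= 0) {
        *s = p + 3;
        return hex_digit(p[1]) * 16 + hex_digit(p[2]);
    }
    *s = p + 1;
    return (unsigned char)p[0];
}

/* Return 1 if a and b decode to the same octet sequence, else 0. */
static int demo_same_octets(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (next_octet(&a) != next_octet(&b)) {
            return 0;
        }
    }
    /* Equal only if both strings ended together. */
    return *a == '\0' && *b == '\0';
}
```

Note that this comparison also reports "/%3fA" and "/?A" as the same, which is exactly the encoded-'?' case raised earlier in the thread; octet equality alone does not say whether a '?' was meant as a query delimiter.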