----- Original Message -----
From: <[EMAIL PROTECTED]>
To: "Tomcat Developers List" <[EMAIL PROTECTED]>
Sent: Sunday, February 03, 2002 10:36 PM
Subject: Re: cvs commit: jakarta-tomcat RELEASE-PLAN-3.3.1.txt


> On Sat, 2 Feb 2002, Bill Barker wrote:
>
> > >   +        4416  URI En/Decoding not working
> > >   +              (investigate and fix if feasible)
> > My vote is for LATER, since as I understand the bug it is too late to test
> > this well, and the fix (if not done right) has the potential to create
> > security problems.  The fix is basically to flip UEncoder on its head and
> > work with "un-safe chars" instead of "safe chars" (as well as to add the
> > logic to use the encoding).  If Costin (since it's his baby) thinks he's up
> > to it, by all means go for it.  I just don't want to delay the release for
> > the amount of time it would take me to make and be comfortable with the fix
> > (esp. since there is a work-around already).
>
> I'm not sure I understand - the bug seems to be about
> DecodeInterceptor using 8859_1 for decoding, even if a different
> decoding was found.
>
> I don't think it is touching UEncoder and the url encoding/decoding.
> The url decoding has nothing to do with the charset - we decode
> %xx as bytes, the url encoding happens after the char->byte conversion and
> the decoding happens before the byte->char conversion ( i.e. uencoding
> operates on bytes ).
My understanding of this is that if the request is for:
    /el-niño.jsp
then most of the time Tomcat will read it correctly.  But it will return for
requestURI:
    /el-ni%A1o.jsp
The "safe chars" map to the same code points under iso-latin-1 and utf-8
(that's why they are "safe chars").  UEncoder is strict about what is safe,
but the RFC isn't.  You are allowed to use extended chars if the other side is
capable of detecting the charset.
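
To illustrate the "safe chars" point, here is a minimal sketch of a byte-level
encoder.  It is not UEncoder's code: the class name, the SAFE set and the
encode() helper are all made up for the example.  It only shows that ASCII
chars come out identical under either charset, while an extended char like
U+00F1 (n with tilde) produces different %xx sequences depending on the
charset used for the char->byte step.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SafeCharsSketch {
    // Hypothetical safe set for illustration only; not UEncoder's real table.
    static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~/";

    // Convert chars to bytes with the given charset, then %xx-escape every
    // byte that is not an ASCII char from the safe set.
    public static String encode(String path, Charset cs) {
        StringBuilder out = new StringBuilder();
        for (byte b : path.getBytes(cs)) {
            int v = b & 0xFF;
            if (v < 0x80 && SAFE.indexOf((char) v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(v >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(v & 0xF, 16)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String path = "/el-ni\u00F1o.jsp";
        // Same chars, different bytes once we leave the ASCII range.
        System.out.println(encode(path, StandardCharsets.ISO_8859_1)); // /el-ni%F1o.jsp
        System.out.println(encode(path, StandardCharsets.UTF_8));      // /el-ni%C3%B1o.jsp
    }
}
```

The safe chars pass through untouched in both cases, so only the extended
char depends on which charset the two sides agree on.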
>
> It is possible we have a bug - and a test case would help finding it. The
> code is quite tricky ( I spent huge amounts of time with charset/encoding
> issues ), and I agree LATER is good given the risks. But if I have
> the test case, I can take a look, it may be a simple fix.
>
> The way it is supposed to work - first the bytes are url decoded,
> then we detect the charset, then convert bytes to chars.
>
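
The order described above (url decode to bytes first, pick the charset, then
byte->char) can be sketched roughly as follows.  This is not the
DecodeInterceptor code; the class and method names are hypothetical, and the
charset is simply passed in where the real code would detect it.  The sketch
also assumes the raw request line contains only single-byte chars.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeOrderSketch {
    // Step 1: %xx -> bytes.  No charset is involved at this stage.
    public static byte[] percentDecode(String raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%' && i + 2 < raw.length()) {
                int hi = Character.digit(raw.charAt(i + 1), 16);
                int lo = Character.digit(raw.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            // Anything else (including a malformed escape) is copied as one byte.
            out.write(c & 0xFF);
        }
        return out.toByteArray();
    }

    // Steps 2 and 3: the charset is supplied by the caller (standing in for
    // detection), then the bytes are converted to chars.
    public static String decode(String raw, Charset detected) {
        return new String(percentDecode(raw), detected);
    }

    public static void main(String[] args) {
        // Both print the same path, with U+00F1 restored.
        System.out.println(decode("/el-ni%F1o.jsp", StandardCharsets.ISO_8859_1));
        System.out.println(decode("/el-ni%C3%B1o.jsp", StandardCharsets.UTF_8));
        // Decoding UTF-8 bytes as 8859_1 yields two wrong chars in place of one.
        System.out.println(decode("/el-ni%C3%B1o.jsp", StandardCharsets.ISO_8859_1));
    }
}
```

The last call shows the kind of symptom to expect if 8859_1 is used for the
byte->char step even though a different charset was found.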
> Am I missing something here ?
>
> Costin
>
>
> --
> To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
>

