On Mon, Aug 1, 2011 at 2:58 PM, Leonardo Uribe <[email protected]> wrote: > Hi > > 2011/8/1 Curtiss Howard <[email protected]>: >> Hi, >> >> One of the teams we work with was recently complaining that whenever >> they pass an already-encoded URL to <h:graphicImage> that they get a >> double encoding, which never happened before in our previous release >> (which corresponded to the 1.2 timeframe). It appears that this is >> the result of a bug in encodeURIAttribute(), introduced after 1.2 and >> before 2.0: >> >> else if (c == '%') >> { >> if (i + 2 < string.length()) >> { >> char c1 = string.charAt(i+1); >> char c2 = string.charAt(i+2); >> if ((( c1 >= '0' && c1 <='9') || (c1 >='A' && c1 >> <='Z')) && >> (( c2 >= '0' && c2 <='9') || (c2 >='A' && c2 >> <='Z'))) >> { >> // do not percent encode, because it could be >> already encoded >> // and we don't want encode it twice >> } >> else >> { >> app = percentEncode(c, UTF8); >> } >> } >> else >> { >> app = percentEncode(c, UTF8); >> } >> } >> >> (Forgive me if the formatting doesn't come out right...) > > Just for reference the issue you are talking about is MYFACES-1841 > HtmlResponseWriterImpl.writeURIAttribute does not perform proper URLs > encoding ( ex: & should be encoded in &) >
That's the issue where this code was introduced, yes. >> >> So whenever we find a %, we try to look ahead and see if the next two >> characters constitute two hex characters. If so, we leave the % alone >> since this is clearly a portion of the string that has already been >> encoded. >> >> There are two problems with that code however: >> >> 1. Checking for '0' <= c <= '9' && 'A' <= c <= 'Z' means that "%XX" >> would remain unchanged, when really it needs to be "%2AXX". We should >> be checking for 'A' <= c <= 'F' since we're talking about hex >> characters. > > Are you mixing some javascript here? In theory if the percent encoding > was already done, there is no reason to do it again, because this > could break the generated URI. The algorithm check that condition and > do this is intentional. Maybe the problem could be solved detecting > this special condition when h:graphicImage do its job. > No, there's no javascript. Just URLs. Specifically, this is in a Portal environment when using WSRP. The image URLs are altered by some Portal/WSRP magic into some sort of query string that includes the original URL as a parameter (I don't follow all of the details, I'm not a Portal guy). Hence, they %-encode the URL before it reaches the h:graphicImage renderer, which in turn calls ResponseWriter.writeURIAttribute(), which in turn calls into HTMLEncoder.encodeURIAttribute(). I've argued back and forth with them but I don't see any reason why they can't pass a %-encoded value into ResponseWriter.writeURIAttribute() and assume that they don't get encoded twice. In other words, the argument is that: writeURIAttribute(writeURIAttribute (value)) === writeURIAttribute(value) The code I've highlighted does what we're talking about, but it clearly doesn't work with lowercase characters, which we both agree on (and is the source of their problem; the URLs they pass in use lowercase hex characters). The second point amounts to something I think is a bug in the code (rather than just a missing check). It's deciding whether or not to leave a % alone if the next two characters are alphanumeric. I think this is wrong, it should be checking for the next two characters to be _hex only_. After all, if my value is "%XX", then the result should be "%25XX" since "XX" isn't valid hexadecimal. But, I can understand if changing the character range has greater implications. >> 2. The problem our colleagues saw: we aren't handling lowercase characters! >> >> So I just want to make sure everyone is in agreement with my >> assessment that we need to change the character range from A..Z to >> A..F and also check for a..f. If so, I'll make the change. > > I checked it and rfc 2141 specify lower case chars are valid in this > case, so this change is valid. See: > > http://www.ietf.org/rfc/rfc2141.txt > > regards, > > Leonardo Uribe > >> >> Thanks, >> >> >> Curtiss Howard >> >
