Problem in authentication framework when using umlauts or other 
language-specific characters
--------------------------------------------------------------------------------------------

                 Key: COCOON-2263
                 URL: https://issues.apache.org/jira/browse/COCOON-2263
             Project: Cocoon
          Issue Type: Bug
          Components: * Cocoon Core
    Affects Versions: 2.1.9
            Reporter: Ralph


There is a problem in Cocoon regarding URL encoding. We saw this problem on 
Cocoon 2.1.9; however, it likely applies to Cocoon 2.1.11 and the 2.2 versions 
as well, as the relevant code parts are identical.

What happens is the following:

We're using the authentication framework, which internally calls a pipeline to 
do the authentication. That pipeline in turn uses a generator we have written. 
Within the generator, we access values of the parameters object passed in via 
public void setup(SourceResolver resolver, Map objectModel, String src, 
Parameters parameters).

In this case (and the problem really seems to be limited to pipelines called 
internally within Cocoon, as done by the authentication framework), German 
umlauts and other language-specific characters, such as Russian letters, are 
not handled correctly when getting values from the parameters object. The 
problem basically is that Cocoon, at some point during authentication, 
URL-encodes the umlauts in

buf.append(resourceParameters.getEncodedQueryString()); 
-> org.apache.cocoon.components.source.SourceUtil.java, Line 598

and at a later point, before calling the setup(...) of the generator, decodes 
them again in

org.apache.cocoon.environment.wrapper.RequestParameters.java, method private 
String parseName(String s).

During encoding, an umlaut like ä is encoded as %C3%A4 (this is 2 (!) bytes, 
namely the UTF-8 encoding, which is also what 
java.net.URLEncoder.encode("ä", "UTF-8") returns).
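
To make the encoding step concrete, here is a minimal sketch using only the 
JDK. The class and method names (UrlEncodeSketch, encodeUtf8) are made up for 
illustration and are not part of Cocoon; the sketch only shows what 
java.net.URLEncoder produces for ä, not what getEncodedQueryString does 
internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of Cocoon.
class UrlEncodeSketch {

    // URL-encodes a string using UTF-8 as the byte encoding.
    static String encodeUtf8(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u00e4" is the umlaut ä; it occupies two bytes in UTF-8.
        System.out.println(encodeUtf8("\u00e4")); // prints %C3%A4
    }
}
```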

When Cocoon later decodes %C3%A4 using the parseName method, though, this 
produces garbage instead of the original ä. This is due to the handling of 
escape sequences in parseName, which does not support a multi-byte encoding 
because of

               case '%':
                   try {
                       sb.append((char) Integer.parseInt(s.substring(i+1, i+3),
                             16));
                       i += 2;
-> org.apache.cocoon.environment.wrapper.RequestParameters.java, starting line 
43

This code treats each %xx as a separate character and doesn't handle the case 
where %xx%yy actually represents only one character. It looks like 
getEncodedQueryString and parseName do not work with the same encoding scheme, 
or rather that parseName implicitly assumes a certain encoding scheme 
(effectively one byte per character, as in ISO-8859-1) which is not UTF-8.
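
The difference can be reproduced outside Cocoon. The following is only a 
sketch under assumptions: decodePerEscape imitates the %xx handling quoted 
above (it is not the actual parseName source), and decodeUtf8 is one possible 
way to decode correctly, by collecting the bytes first and interpreting them 
as UTF-8 afterwards. All names are hypothetical, and the handling of '+' 
follows the usual form-encoding convention rather than anything confirmed 
from the Cocoon code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Cocoon.
class PercentDecodeSketch {

    // Imitates the reported behaviour: each %xx becomes one char.
    static String decodePerEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '+') {
                sb.append(' '); // assumption: form-encoding convention
            } else if (c == '%' && i + 3 <= s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Collects the raw bytes first, then decodes them as UTF-8 in one step.
    // Assumes all unescaped characters in the input are plain ASCII.
    static String decodeUtf8(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '+') {
                bytes.write(' ');
            } else if (c == '%' && i + 3 <= s.length()) {
                bytes.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = "%C3%A4";
        // Two chars (U+00C3, U+00A4) instead of one: the "garbage" case.
        System.out.println(decodePerEscape(encoded).length()); // prints 2
        // One char, the original umlaut U+00E4.
        System.out.println(decodeUtf8(encoded).length());      // prints 1
    }
}
```

For plain UTF-8 input, java.net.URLDecoder.decode(s, "UTF-8") from the JDK 
gives the same result as decodeUtf8 here; whichever decoder is used, it has to 
agree with the encoding applied on the encoding side.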

In this case, the authentication fails because the data passed to the 
generator within the authentication pipeline has been corrupted. This may 
apply to other situations as well where pipelines are called internally 
within Cocoon.

Let me know if you need any more information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
