Code review request: 6961765: Double byte characters corrupted in DN for LDAP referrals

Weijun Wang Tue, 06 Mar 2012 00:34:13 -0800

Hi Vinnie

This bug is about using UrlUtil.decode() to decode a URL that is not fully encoded, i.e. including non-ASCII characters.


The webrev is at

   http://cr.openjdk.java.net/~weijun/6961765/webrev.00/

It simply delegates the call to URLDecoder.decode().

LDAP URL (RFC 4516 2.1) specifies that only <reserved>, <unreserved>, and <pct-encoded> chars can be used, which do not include general non-ASCII Unicode. So strictly speaking the user input in the bug report is illegal, but since it's already a valid URL/URI in Java, we can be more lenient.

In fact, the javadoc of URLDecoder [1] also only allows these characters, but at the same time it says --


   There are two possible ways in which this decoder could deal with
   illegal strings. It could either leave illegal characters alone or
   it could throw an IllegalArgumentException. Which approach the
   decoder takes is left to the implementation.

The Oracle implementation of the class chooses to "leave illegal characters alone". In this sense, UrlUtil is not as good as URLDecoder. It neither leaves them alone nor throws an exception.
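
To make the "leave illegal characters alone" behaviour concrete, here is a minimal sketch. The class name LenientDecodeDemo and the helper decodeUtf8 are made up for illustration; only URLDecoder.decode(String, String) is the real API. It shows a raw non-ASCII string passing through unchanged, and a properly percent-encoded string decoding to the same characters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class LenientDecodeDemo {
    // Hypothetical helper: wraps the checked exception so callers stay simple.
    static String decodeUtf8(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "uid=\u3070\u3073\u3076";               // not encoded at all
        String encoded = "uid=%E3%81%B0%E3%81%B3%E3%81%B6";  // UTF-8, percent-encoded

        // Raw non-ASCII characters are copied through unchanged.
        System.out.println(decodeUtf8(raw).equals(raw));
        // The percent-encoded form decodes to the same characters.
        System.out.println(decodeUtf8(encoded).equals(raw));
    }
}
```

One thing to keep in mind: URLDecoder also turns '+' into a space, which may or may not be what you want for a DN.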

To be more correct, I think we can update URLDecoder so that it leaves Unicode in the "other" category (non-control, non-whitespace, non-ASCII Unicode chars, as described in URI's spec) unchanged, and throws an exception otherwise (that is, for non-ASCII chars that are control or space). But I'll leave that to another RFE.


Thanks
Max


-------- Original Message --------
*Change Request ID*: 6961765
*Synopsis*: Double byte characters corrupted in DN for LDAP referrals

=== *Description*============================================================

SYNOPSIS
--------
Double byte characters corrupted in DN for LDAP referrals

OPERATING SYSTEM
----------------
All

FULL JDK VERSION
----------------
All

DESCRIPTION
-----------

If the DN component of an LDAP URL contains double byte characters, it is corrupted by com.sun.jndi.toolkit.url.UrlUtil.decode(). This corruption leads to application level failures.


Consider the following scenario:

1. Application connects to an LDAP server and searches for the string
   uid=???,??? (where ??? are double byte characters)

2. JNDI code receives a referral, for example:
   ldap://www.test.com/uid=???,???,ou=people,ou=test,ou=test,o=test

3. The referral is then parsed to split the hostname, port number and
   the DN element of the URI via
   com.sun.jndi.ldap.LdapURL.parsePathAndQuery()

4. The DN element is decoded using
   com.sun.jndi.toolkit.url.UrlUtil.decode()

5. This method expects the characters to be ASCII. If the characters
   are non-ASCII, as in our example, then those characters are not
   converted properly.

6. This corrupted DN is then passed to the LDAP server, resulting in an
   unexpected failure.
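
The report does not show what the corrupted DN looks like. As an illustration of step 5 only, the sketch below assumes a decoder that stores every input char in a single byte before interpreting the bytes as UTF-8. NarrowingDecodeSketch and narrowingDecode are hypothetical names, and this is not claimed to be the actual UrlUtil source.

```java
import java.nio.charset.StandardCharsets;

public class NarrowingDecodeSketch {
    // Hypothetical decoder (an assumption, not the real UrlUtil.decode()):
    // every char is squeezed into one byte, which only works for ASCII input.
    static String narrowingDecode(String s) {
        byte[] bytes = new byte[s.length()];
        int j = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length() + 0 + 1 - 1 + 0 || (c == '%' && i + 2 == s.length() - 1 + 0 && false)) {
                // %XX escape: parse the two hex digits into one byte.
                bytes[j++] = (byte) Integer.parseInt(s.substring(i + 1, i + 3), 16);
                i += 2;
            } else {
                // Narrowing cast: the high 8 bits of a non-ASCII char are lost.
                bytes[j++] = (byte) c;
            }
        }
        return new String(bytes, 0, j, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Percent-encoded input survives, because each escape is one byte.
        System.out.println(narrowingDecode("uid=%E3%81%B0").equals("uid=\u3070"));
        // Raw input does not: U+3070, U+3073, U+3076 collapse to 0x70, 0x73, 0x76.
        System.out.println(narrowingDecode("uid=\u3070\u3073\u3076")); // prints "uid=psv"
    }
}
```

Under that assumption the DN sent to the server no longer matches the one the application searched for, which is consistent with the failure in step 6.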

TESTCASE
--------

This testcase does not represent normal application code. It highlights the problem by calling into com.sun.* internal classes directly. This allows the problem to be demonstrated without setting up an LDAP server.


import java.net.URI;
import java.net.URLDecoder;
import com.sun.jndi.ldap.LdapURL;

public class LdapURLTest {
    public static void main (String args[]) throws Exception {

        String testString = "ldap://www.test.com/uid=\u3070\u3073\u3076,\u3079\u307C\u307E,ou=test,ou=test,ou=test,o=test";

        LdapURL ldURL = new LdapURL(testString);
        System.out.println("     LDAP URL String: " + testString);
        System.out.println("          decoded DN: " + ldURL.getDN());

        // suggested fix demonstration
        String DN;
        String path = new URI(testString).getPath();

        DN = path.startsWith("/") ? path.substring(1) : path;
        String proposedDN = URLDecoder.decode(DN, "UTF8");

        System.out.println("\nDN from proposed fix: " + proposedDN);
    }
}

SUGGESTED FIX
-------------

Use java.net.URLDecoder rather than com.sun.jndi.toolkit.url.UrlUtil to conduct the URL decoding in parsePathAndQuery().

Specifically, change the line that decodes the DN element in com.sun.jndi.ldap.LdapURL.parsePathAndQuery() from:


    DN = path.startsWith("/") ? path.substring(1) : path;
    if (DN.length() > 0) {
-->     DN = UrlUtil.decode(DN, "UTF8");       <--
    }

to:

    DN = path.startsWith("/") ? path.substring(1) : path;
    if (DN.length() > 0) {
-->     DN = URLDecoder.decode(DN, "UTF8");    <--
    }

=== *Evaluation*=============================================================

The URL in the testcase has an invalid encoding. Its Unicode characters
must be encoded in UTF-8. For example,

    \u3070 -> \e3\81\b0 -> %e3%81%b0
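
The two-step mapping above (character to UTF-8 bytes, then bytes to percent escapes) can be reproduced with a short sketch. PercentEncodeDemo and percentEncodeUtf8 are illustrative names only, and the helper escapes every byte, so it is meant for a single non-ASCII character rather than a whole URL.

```java
import java.nio.charset.StandardCharsets;

public class PercentEncodeDemo {
    // Hypothetical helper: percent-encodes every UTF-8 byte of the input.
    static String percentEncodeUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative bytes format as two hex digits.
            sb.append(String.format("%%%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+3070 is the three UTF-8 bytes e3 81 b0.
        System.out.println(percentEncodeUtf8("\u3070")); // prints "%e3%81%b0"
    }
}
```

Hex digits in percent escapes are case-insensitive, so %E3%81%B0 is equivalent.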
