RE: ApacheDS non US-ASCII DN manipulation?

Emmanuel Lecharny Tue, 30 Jan 2007 16:04:48 -0800

Pierre-Alain RIVIERE wrote :

Hi everyone,

Hi Pierre-Alain,

I'm working on a project where I have embedded an Apache DS for testing purposes.


That's good news :)

In my unit test, I'm trying to launch a request like this one:

    "(&(member=cn=John Doe,ou=Peoples,ou=Paris,ou=Offices,dc=ippon,dc=fr)(objectClass=ipponGroup))"


Seems to be perfectly correct at first sight. And at second sight, I confirm 
this is a correct filter (and a correct DN).

Unfortunately this request fails with the following stack trace:

    Caused by: java.lang.NullPointerException
        at org.apache.directory.server.core.schema.DnNormalizer.normalize(DnNormalizer.java:64)


Hmmm. NPEs are never a good thing.

With the debugger I found that the method concerned - DnNormalizer#normalize(Object) - does not perform the LdapDN dn initialization. Here is the code:

<snip/>

Yeah, I think that we should throw an exception if the dn is not a String, a 
Name or an LdapDN. But not an NPE...
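
A minimal sketch of what such a type check could look like. This is not the actual DnNormalizer code (which is snipped above): the class and method names are hypothetical, and the JDK's javax.naming.ldap.LdapName is used as a stand-in for ApacheDS's LdapDN. The choice of InvalidNameException is also an assumption.

```java
import javax.naming.InvalidNameException;
import javax.naming.Name;
import javax.naming.NamingException;
import javax.naming.ldap.LdapName;

// Hypothetical illustration: reject unsupported DN types with an explicit
// exception instead of letting a null reference cause an NPE later on.
final class DnTypeCheckSketch {

    // Accepts a String or a Name; anything else (a byte[], null, ...) is rejected.
    static Name toName(Object value) throws NamingException {
        if (value instanceof String) {
            // LdapName parses the string form of a DN
            return new LdapName((String) value);
        }
        if (value instanceof Name) {
            return (Name) value;
        }
        String type = (value == null) ? "null" : value.getClass().getSimpleName();
        throw new InvalidNameException("Unsupported DN type: " + type);
    }

    public static void main(String[] args) throws NamingException {
        System.out.println(toName("cn=John Doe,ou=Peoples,dc=ippon,dc=fr"));
        try {
            toName(new byte[] { 99, 110 });
        } catch (InvalidNameException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, passing a byte[] fails immediately with a message naming the offending type.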

In my case dn is null because the value passed as parameter is a byte[].


Ahhh. It is not a good idea to pass a byte[]...

Indeed, it seems that a DN containing non US-ASCII characters - my DNs may be composed of all the characters used in (French) names and surnames - is represented by ApacheDS as a byte[], which can be used to construct a new String representation.

Well, this is not true. In fact, if you are using an embedded ADS, then you 
should pass Human Readable data - like DNs - as Strings,
and all other values - like JPegPhoto, for instance - as byte[].

Passing a byte[] is not an option, because then we have no clue about which 
encoding the user has used to transform the String to a byte[]. For instance, 
let's assume you have a String with French chars (like 'é'). If your local 
encoding is UTF-8, then the transformation will generate a different byte array 
than if your local encoding is ISO-8859-1. But as you can't tell the server 
which encoding you have used, there is no way it can assume that you have used 
UTF-8 or something else (even if you used "UTF-8", as expected).
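
As a minimal illustration of that ambiguity, the following Java sketch encodes 'é' with both charsets. The class and method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Shows that the same String yields different byte arrays depending on the
// charset, so a bare byte[] does not tell the receiver how to decode it.
final class EncodingDemo {

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eAcute = String.valueOf((char) 0xE9); // the character 'é'
        System.out.println(Arrays.toString(utf8(eAcute)));   // [-61, -87]
        System.out.println(Arrays.toString(latin1(eAcute))); // [-23]
    }
}
```

Two bytes in one case, one byte in the other: nothing in the array itself says which charset produced it.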

So, basically, you should _always_ pass the filter as a String. If you use a 
Java string, then be careful about special chars. Don't forget that a Java 
source file is stored on your system using some specific encoding. It's better 
to use '\uxxxx' escapes for special chars in your string; this way it is 
guaranteed that your string will be correctly transmitted.
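
For example, a filter with accented characters could be written with unicode escapes as sketched below. The member name is invented for the illustration; only the filter shape comes from the original question.

```java
// Builds a search filter as a plain Java String. The accented characters are
// written as unicode escapes, so the result does not depend on the encoding
// used to store or compile this source file.
final class FilterEscapeDemo {

    static String buildFilter() {
        // Hypothetical member DN containing two e-acute characters (U+00E9)
        String memberDn = "cn=Andr\u00e9 L\u00e9ger,ou=Peoples,ou=Paris,ou=Offices,dc=ippon,dc=fr";
        return "(&(member=" + memberDn + ")(objectClass=ipponGroup))";
    }

    public static void main(String[] args) {
        System.out.println(buildFilter());
    }
}
```

The resulting String is what you would hand to the search call; no byte[] conversion is involved on your side.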

Does ApacheDS fail to handle UTF-8 DNs, or should I not use UTF-8 DNs? In the second case, does the LDAP protocol explicitly proscribe non US-ASCII DNs?


Apache DS handles LdapString correctly. UTF-8 encoded strings (byte[]) are 
just used to transmit data from a client to the server and from the server to 
the client. The first thing the server does is to transform those byte[] to 
Strings (for attributeTypes which are Human Readable).

So, basically, never pass a DN as UTF-8 encoded bytes; pass it as a String.

The LDAP protocol does not handle anything but bytes. The rules for switching 
from Strings to byte[] for DNs are given in RFC 2253 
(http://www.faqs.org/rfcs/rfc2253.html).
Strings transmitted through LDAP protocol messages are first encoded as UTF-8 
(which makes them byte[], btw) and decoded before being handled.
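
Roughly, the round trip done at the protocol layer looks like the sketch below. It only illustrates the String <-> UTF-8 conversion; the names are hypothetical and this is not the ApacheDS codec.

```java
import java.nio.charset.StandardCharsets;

// Illustrates the conversion done at the protocol boundary: a String is
// encoded to UTF-8 bytes for transmission and decoded back on arrival.
final class Utf8RoundTripDemo {

    static byte[] encodeForWire(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decodeFromWire(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical DN with one e-acute (U+00E9)
        String dn = "cn=Ren\u00e9,ou=Peoples,dc=ippon,dc=fr";
        byte[] wire = encodeForWire(dn);
        // The e-acute takes two bytes in UTF-8, so the array is one longer
        System.out.println(dn.length() + " chars, " + wire.length + " bytes");
        System.out.println(decodeFromWire(wire).equals(dn));
    }
}
```

Both ends agree on UTF-8 here because the protocol fixes it; a byte[] handed directly to the embedded API carries no such guarantee.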

If you don't use the protocol layer, then there is no need to pass byte[] to 
the API.

I hope I was clear enough to be of help. Anyway, just consider that this is not an easy matter (I have spent days understanding how to implement it in the server...)


Maybe the API should also be changed to avoid such usage. I must admit that 
throwing an NPE is, well, not good at all ;)

Hope it helps,

Emmanuel Lécharny.
