Re: Dn, Rdn and Ava inconstancies

Emmanuel Lécharny Mon, 20 Feb 2012 13:59:34 -0800

Le 2/20/12 7:36 PM, Alex Karasulu a écrit :

On Mon, Feb 20, 2012 at 6:33 PM, Emmanuel Lécharny<[email protected]>wrote:

Hi guys,

those last days, we had to fight with some issues with the way we handle
DNs and its components :
- creating entries with a RDN containing two times the same AT is not
allowed by the spec
- searching for an entry which RDN is cn=Doe+gn=John does not work when
searching for gn=John+cn=Doe
- renaming an entry on itself when we want to upercase a RDN is not
possible when it should.

Digging a bit into the code, I found that many cases weren't handled
correctly, and the the API is not consistant. We also have issues with
escaped characters.

For instance, if we consider the Ava class, there are some methods that
need to be renamed :
o getUpName() should be renamed to getName() as Dn.getName() and
Rdn.getName() are used
o getUpType() should be renamed to getType() to be consistant with the
previous rename
o getUpValue() should also be renamed to getValue() for the very same
reason.

Now, when it comes to what the methods produce, here is a table showing
the expected values :

If the AVA is not schema aware :

    getNormName()    "ou=exemple \+ rdn\C3\A4\ "
    getNormType()    “ou”
    getNormValue()    "exemple + rdnä "
    getUpName()    "OU = Exemple \+ Rdn\C3\A4\ "
    getUpType()    “OU“
    getUpValue()    "Exemple + Rdnä "
    normalize()    "ou=exemple \+ rdn\C3\A4\ "
    toString()    "OU = Exemple \+ Rdn\C3\A4\ "

and if the AVA is schema aware :

    getNormName() “2.5.4.11=example \+ rdn\C3\A4\ ”
    getNormType() “2.4.5.11”
    getNormValue() “exemple + rdnä ”
    getUpName() "OU = Exemple \+ Rdn\C3\A4\ "
    getUpType() “OU“
    getUpValue() "Exemple + Rdnä "
    normalize() “2.5.4.11=example \+ rdn\C3\A4\ ”
    toString() "OU = Exemple \+ Rdn\C3\A4\ "

Currently, this is not what we get :

    Ava.getNormName() returns 'ou=Exemple \\\+ Rdn\\C3\\A4\\\ '
    Ava.getUpValue() returns 'Exemple \+ Rdn\C3\A4\ '
    Ava.normalize() returns 'ou=Exemple \\\+ Rdn\\C3\\A4\\\ '

The normalize() method seems useless.


For RDN, we also have some method renaming to anticipate :
o getUpType() should be renamed getType()
o getUpValue() should be renamed getValue()
o getValue(String) should be removed, we can grab the value using the
getAva( String ) instead

Same, for the expected values and the values we get :

    getName()    "OU = Exemple \+ Rdn\C3\A4\ +cn=  TEST"
    getNormName()    "ou=exemple \+ rdn\C3\A4\ "
    getNormType()    "ou"
    getNormValue()    "exemple + rdnä "
    getUpType()    “OU“
    getUpValue()    "Exemple \+ Rdn\C3\A4\ "
    getValue(String)    "Exemple \+ Rdn\C3\A4\ " and “TEST”
    toString()    "OU = Exemple \+ Rdn\C3\A4\ +cn=  TEST"

and if the RDN is schema aware :

    getName()    "OU = Exemple \+ Rdn\C3\A4\ +cn=  TEST"
    getNormName()    “2.5.4.11=example \+ rdn\C3\A4\ ”
    getNormType()    "2.5.4.3"
    getNormValue()    “exemple + rdnä “
    getUpType()    “OU“
    getUpValue()    "Exemple \+ Rdn\C3\A4\ “
    getValue(String)    "Exemple \+ Rdn\C3\A4\ " and “TEST”
    toString()    "OU = Exemple \+ Rdn\C3\A4\ +cn=  TEST"

This is what we get :

Rdn.getNormName() returns 'ou=Exemple \+ Rdnä\ +cn=TEST'
Rdn.getNormValue() returns 'Exemple + Rdnä '
Rdn.getUpValue() returns ' Exemple \+ Rdn\C3\A4\ '
Rdn.getValue( 'ou' ) returns 'Exemple + Rdnä '
Rdn.getValue( 'test' ) returns ''

Etc...

I have not yet coded the tests for the schema aware AVA and RDN, but be
sure we will get more inconsistencies. I still have to write down the same
analysis for Dn, but this is the same story.


We really need to fix those inconsistencies otherwise we will have endless
issues. This is not the first time we are dealing with them, bt so far, we
never had to face them for real, and we just tried our best to shoot the
errors when they appear. I think it's time to play medieval on the code !

This makes a lot of sense. As things matured in the project we started
seeing more and more of the corner cases that we need to account for.  As
you say, we did this incrementally as we encountered various situations.

Over time this strains how this area of the library was designed. Naturally
you cannot account for everything and over time various choices see
obsolete as you patch and patch and patch the code.

Now after seeing so much of the corner cases and how the design may not be
supporting it cleanly, and efficiently, then sure re-architect it know that
we have the tests, history and the knowledge.

Thanks a lot, Alex !!!

I have spent *so much* time in the past on those classes that I somepoint I felt a bit depressed that I wasn't able to design itcorrectly... Yes, it was like when a baby start to walk : you can't runbefore being able to walk...

So, the main issue is the way AVA handles values. As soon as we *know*what we should expect when we create an AVA, then suddenly it becomesway easier. Basically, an AVA contains one type and one value. Thisvalue can be a String or a byte[], depending on the type. Saddly, if theAVA is not schema aware, we can't tell if the value is binary or String.

The AVA are not intended to be used in insolation in a RDN, they arealso representing a part of the entry (keep in mind that a RDN AVA*must* be present in the entry). That means AVAs are dual objects (partof the RDN but also part of the Entry).

That being said, the AVA's value is always used internally as an UTF-8encoded String if it's a String, or as a byte[] if it's not a String. Sofar, so good, as we store the value in Value<?> objects.

It starts to be a bit complex when we know that we must keep the valueas it was provided by the user, but the server requires that the valueis normalized before being able to deal with it (it's not mandatory thatwe keep a normalized form of the value, but it's critical from aperformance POV).

That means we will keep two instances of the value : the user providedone and the normalized one.

Wait ! The Value<?> class already does that for us ! Great... So we aresafe, aren't we ?

Not even close... The tricky part is that we must use AVAs in a RDN, andRDN are often exposed as Strings, where some characters must be escaped.For instance, '+', spaces at the end, any non ASCII chars, etc need tobe escaped.

That implies we need a escape( String ) method to do this work when wewant to produce a String from an AVA instance. This method alreadyexists, thanks god !

Now, come the Schame into the game. A Schema aware AVA will be able todeterminate if the value is binary or not, and will modify the internedvalue by applying a normalization method (accordingly to theAttributeType). This will impact the String representation of the AVAwhen used in a RDN (not in the entry), when the user wants to manipulatethe RDN as a String.


Basically, we will have two forms for an AVA :
- a User Provided form (the standard form)

- a Normalized form which will differ depending on the fact that the AVAis schema aware, or not.

Here is a small table representing the different forms for each of themethods exposing an AVA content :


Considering that we want to create the following AVA :
"OU" with the value "Exemple + Rdn\u00E4 "

(note that the value has a space at the end, contains a '+' and a 'ä'character - which is representing using the \u00E4 unicode sequence in aJava string)


1) The Ava.getNormName() method will return :
 - No Schema    : "ou=Exemple \\+ Rdn\\C3\\A4\\ "
 - With Schema : "2.5.4.11=exemple \\+ rdn\\C3\\A4"

As we can see, the getNormName() return a String so the value must beescaped. As we dont know what kind of normalizer to use when we don'thave a schema, in the first case we still have 'ou' instead of'2.5.4.11', and the upper case chars remain upper cased. The 'ä'character is translated to its UTF-8 encoding and escaped counterpart,ie C3 A4. Last, not least, the end space is kept when we don't have aschema, but removed due to the normalization when we have a schema.

2) The Ava.getNormType() returns "ou" if we don't have a schema and"2.5.4.11" with a schema


3) The Ava.getNormValue().getString() will return :
 - No Schema   : "Exemple + Rdnä "
 - With Schema : "exemple + rdnä"

Same as (1), the normalization has removed the ending space and lowercased everything. The big difference here is that we don't escape anycharacter, as we are dealing with the Value itself, not with an elementof the RDN. The escaped characters are only useful when used in a DN ora RDN, when the AVA is seen as a combinaison of an AttributeType and aValue.

4) The Ava.getUpName() will return "OU=Exemple \\+ Rdn\\C3\\A4\\ " inboth case. It's the original value, escaped (because it's a String,likely to be used as a part of a DN or RDN).


5) The Ava.getUpType() returns "OU" in both cases

6) The Ava.getUpValue().toString() returns "Exemple + Rdnä " in bothcases. The ending space is kept, as it's the way the user typed it in.No escaped chars, it's a Value, not an AVA.


7) The Ava.normalize() returns :
 - No Schema   : "ou=Exemple \\+ Rdn\\C3\\A4\\ "
 - With Schema : "2.5.4.11=exemple \\+ rdn\\C3\\A4"

Here again, as the AVA has been normalized, the ending space has beenremoved in the second case. The attribute has been replaced by its OID,and when we have no schema, it has been lower cased. All the specialschars are escaped.

8) Last, not least, the toString() method returns the original value,escaped : "OU=Exemple \\+ Rdn\\C3\\A4\\ "

Note : we could decide that the toString() method should return thevalue the user provided, instead of escaping characters. That would makea lot of sense, IMO.

AFAICT, shared is building fine with the changes I applied to get theresults I exposed.


I still have to move forward to get RDN and DN be consistent.

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

Re: Dn, Rdn and Ava inconstancies

Reply via email to