Le 2/20/12 7:36 PM, Alex Karasulu a écrit :
On Mon, Feb 20, 2012 at 6:33 PM, Emmanuel Lécharny<[email protected]>wrote:
Hi guys,
those last days, we had to fight with some issues with the way we handle
DNs and its components :
- creating entries with a RDN containing two times the same AT is not
allowed by the spec
- searching for an entry which RDN is cn=Doe+gn=John does not work when
searching for gn=John+cn=Doe
- renaming an entry on itself when we want to upercase a RDN is not
possible when it should.
Digging a bit into the code, I found that many cases weren't handled
correctly, and the the API is not consistant. We also have issues with
escaped characters.
For instance, if we consider the Ava class, there are some methods that
need to be renamed :
o getUpName() should be renamed to getName() as Dn.getName() and
Rdn.getName() are used
o getUpType() should be renamed to getType() to be consistant with the
previous rename
o getUpValue() should also be renamed to getValue() for the very same
reason.
Now, when it comes to what the methods produce, here is a table showing
the expected values :
If the AVA is not schema aware :
getNormName() "ou=exemple \+ rdn\C3\A4\ "
getNormType() “ou”
getNormValue() "exemple + rdnä "
getUpName() "OU = Exemple \+ Rdn\C3\A4\ "
getUpType() “OU“
getUpValue() "Exemple + Rdnä "
normalize() "ou=exemple \+ rdn\C3\A4\ "
toString() "OU = Exemple \+ Rdn\C3\A4\ "
and if the AVA is schema aware :
getNormName() “2.5.4.11=example \+ rdn\C3\A4\ ”
getNormType() “2.4.5.11”
getNormValue() “exemple + rdnä ”
getUpName() "OU = Exemple \+ Rdn\C3\A4\ "
getUpType() “OU“
getUpValue() "Exemple + Rdnä "
normalize() “2.5.4.11=example \+ rdn\C3\A4\ ”
toString() "OU = Exemple \+ Rdn\C3\A4\ "
Currently, this is not what we get :
Ava.getNormName() returns 'ou=Exemple \\\+ Rdn\\C3\\A4\\\ '
Ava.getUpValue() returns 'Exemple \+ Rdn\C3\A4\ '
Ava.normalize() returns 'ou=Exemple \\\+ Rdn\\C3\\A4\\\ '
The normalize() method seems useless.
For RDN, we also have some method renaming to anticipate :
o getUpType() should be renamed getType()
o getUpValue() should be renamed getValue()
o getValue(String) should be removed, we can grab the value using the
getAva( String ) instead
Same, for the expected values and the values we get :
getName() "OU = Exemple \+ Rdn\C3\A4\ +cn= TEST"
getNormName() "ou=exemple \+ rdn\C3\A4\ "
getNormType() "ou"
getNormValue() "exemple + rdnä "
getUpType() “OU“
getUpValue() "Exemple \+ Rdn\C3\A4\ "
getValue(String) "Exemple \+ Rdn\C3\A4\ " and “TEST”
toString() "OU = Exemple \+ Rdn\C3\A4\ +cn= TEST"
and if the RDN is schema aware :
getName() "OU = Exemple \+ Rdn\C3\A4\ +cn= TEST"
getNormName() “2.5.4.11=example \+ rdn\C3\A4\ ”
getNormType() "2.5.4.3"
getNormValue() “exemple + rdnä “
getUpType() “OU“
getUpValue() "Exemple \+ Rdn\C3\A4\ “
getValue(String) "Exemple \+ Rdn\C3\A4\ " and “TEST”
toString() "OU = Exemple \+ Rdn\C3\A4\ +cn= TEST"
This is what we get :
Rdn.getNormName() returns 'ou=Exemple \+ Rdnä\ +cn=TEST'
Rdn.getNormValue() returns 'Exemple + Rdnä '
Rdn.getUpValue() returns ' Exemple \+ Rdn\C3\A4\ '
Rdn.getValue( 'ou' ) returns 'Exemple + Rdnä '
Rdn.getValue( 'test' ) returns ''
Etc...
I have not yet coded the tests for the schema aware AVA and RDN, but be
sure we will get more inconsistencies. I still have to write down the same
analysis for Dn, but this is the same story.
We really need to fix those inconsistencies otherwise we will have endless
issues. This is not the first time we are dealing with them, bt so far, we
never had to face them for real, and we just tried our best to shoot the
errors when they appear. I think it's time to play medieval on the code !
This makes a lot of sense. As things matured in the project we started
seeing more and more of the corner cases that we need to account for. As
you say, we did this incrementally as we encountered various situations.
Over time this strains how this area of the library was designed. Naturally
you cannot account for everything and over time various choices see
obsolete as you patch and patch and patch the code.
Now after seeing so much of the corner cases and how the design may not be
supporting it cleanly, and efficiently, then sure re-architect it know that
we have the tests, history and the knowledge.
Thanks a lot, Alex !!!
I have spent *so much* time in the past on those classes that I some
point I felt a bit depressed that I wasn't able to design it
correctly... Yes, it was like when a baby start to walk : you can't run
before being able to walk...
So, the main issue is the way AVA handles values. As soon as we *know*
what we should expect when we create an AVA, then suddenly it becomes
way easier. Basically, an AVA contains one type and one value. This
value can be a String or a byte[], depending on the type. Saddly, if the
AVA is not schema aware, we can't tell if the value is binary or String.
The AVA are not intended to be used in insolation in a RDN, they are
also representing a part of the entry (keep in mind that a RDN AVA
*must* be present in the entry). That means AVAs are dual objects (part
of the RDN but also part of the Entry).
That being said, the AVA's value is always used internally as an UTF-8
encoded String if it's a String, or as a byte[] if it's not a String. So
far, so good, as we store the value in Value<?> objects.
It starts to be a bit complex when we know that we must keep the value
as it was provided by the user, but the server requires that the value
is normalized before being able to deal with it (it's not mandatory that
we keep a normalized form of the value, but it's critical from a
performance POV).
That means we will keep two instances of the value : the user provided
one and the normalized one.
Wait ! The Value<?> class already does that for us ! Great... So we are
safe, aren't we ?
Not even close... The tricky part is that we must use AVAs in a RDN, and
RDN are often exposed as Strings, where some characters must be escaped.
For instance, '+', spaces at the end, any non ASCII chars, etc need to
be escaped.
That implies we need a escape( String ) method to do this work when we
want to produce a String from an AVA instance. This method already
exists, thanks god !
Now, come the Schame into the game. A Schema aware AVA will be able to
determinate if the value is binary or not, and will modify the interned
value by applying a normalization method (accordingly to the
AttributeType). This will impact the String representation of the AVA
when used in a RDN (not in the entry), when the user wants to manipulate
the RDN as a String.
Basically, we will have two forms for an AVA :
- a User Provided form (the standard form)
- a Normalized form which will differ depending on the fact that the AVA
is schema aware, or not.
Here is a small table representing the different forms for each of the
methods exposing an AVA content :
Considering that we want to create the following AVA :
"OU" with the value "Exemple + Rdn\u00E4 "
(note that the value has a space at the end, contains a '+' and a 'ä'
character - which is representing using the \u00E4 unicode sequence in a
Java string)
1) The Ava.getNormName() method will return :
- No Schema : "ou=Exemple \\+ Rdn\\C3\\A4\\ "
- With Schema : "2.5.4.11=exemple \\+ rdn\\C3\\A4"
As we can see, the getNormName() return a String so the value must be
escaped. As we dont know what kind of normalizer to use when we don't
have a schema, in the first case we still have 'ou' instead of
'2.5.4.11', and the upper case chars remain upper cased. The 'ä'
character is translated to its UTF-8 encoding and escaped counterpart,
ie C3 A4. Last, not least, the end space is kept when we don't have a
schema, but removed due to the normalization when we have a schema.
2) The Ava.getNormType() returns "ou" if we don't have a schema and
"2.5.4.11" with a schema
3) The Ava.getNormValue().getString() will return :
- No Schema : "Exemple + Rdnä "
- With Schema : "exemple + rdnä"
Same as (1), the normalization has removed the ending space and lower
cased everything. The big difference here is that we don't escape any
character, as we are dealing with the Value itself, not with an element
of the RDN. The escaped characters are only useful when used in a DN or
a RDN, when the AVA is seen as a combinaison of an AttributeType and a
Value.
4) The Ava.getUpName() will return "OU=Exemple \\+ Rdn\\C3\\A4\\ " in
both case. It's the original value, escaped (because it's a String,
likely to be used as a part of a DN or RDN).
5) The Ava.getUpType() returns "OU" in both cases
6) The Ava.getUpValue().toString() returns "Exemple + Rdnä " in both
cases. The ending space is kept, as it's the way the user typed it in.
No escaped chars, it's a Value, not an AVA.
7) The Ava.normalize() returns :
- No Schema : "ou=Exemple \\+ Rdn\\C3\\A4\\ "
- With Schema : "2.5.4.11=exemple \\+ rdn\\C3\\A4"
Here again, as the AVA has been normalized, the ending space has been
removed in the second case. The attribute has been replaced by its OID,
and when we have no schema, it has been lower cased. All the special
schars are escaped.
8) Last, not least, the toString() method returns the original value,
escaped : "OU=Exemple \\+ Rdn\\C3\\A4\\ "
Note : we could decide that the toString() method should return the
value the user provided, instead of escaping characters. That would make
a lot of sense, IMO.
AFAICT, shared is building fine with the changes I applied to get the
results I exposed.
I still have to move forward to get RDN and DN be consistent.
--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com