Hi guys,

I have been thinking for years about using byte[] inside the server for values. I have tried more than once to get rid of the Strings, with no success so far: we are too dependent on Strings to get rid of them (for instance, the PrepareString method works on String, not on byte[], and the same goes for the various comparators, normalizers and syntax checkers).
Bottom line, we have to keep the values as Strings. But is this true for every value? In fact, we always store the received attribute's values in two different formats:
- a normalized String (if it's an HR Attribute), which gets normalized, yada yada
- a UP String, which is the value as it has been provided by the user, and which is left untouched.

Now, consider an add operation, followed by a search operation, from the point of view of a specific attribute (say, the 'description' AT):

User add:
---------
description:String ---> API ---> conversion to UTF-8 ---> Server

Server AddHandler:
------------------
description:byte[] ---> decoder ---> conversion to String ---> creation of the normValue ---> storage on disk ---> conversion of upValue and normValue to byte[]

User search:
------------
send searchRequest
... wait for response

Server SearchHandler:
---------------------
fetch the entry => deserialize the Up and Norm values of the description AT (ie, byte[] to String conversion)
entry processing through the interceptors
write the SearchResultEntry ---> conversion of the description AT UpValue to byte[] (we don't care about the normValue at this point)

User search:
------------
... convert the description Up value to String

As we can see, in both operations we are doing too much work: there is no need to convert the UpValue to a String, as we will do a byte[] -> String -> byte[] conversion of this UP value in the search. For the Add, it's slightly better (or less bad): we can avoid a String -> byte[] conversion when storing the value.

Making the UpValue a byte[] will save us a lot of wasted CPU, and probably a bit of space on disk, as a String requires 2 bytes per char to be serialized.

This is something we have to work on before 2.0, as the underlying database will be impacted: we will no longer serialize the UpValue as a String but as a byte[].

Thoughts?

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
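
To make the byte[] -> String -> byte[] round trip discussed above concrete, here is a minimal Java sketch. It is not ApacheDS code: the class UpValueRoundTrip and its methods are hypothetical names chosen for illustration, and the 2-bytes-per-char figure is simply the assumption stated in the mail, not a measurement of the actual on-disk format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only, not part of the server code base.
class UpValueRoundTrip {

    // Current path: the bytes received from the wire are decoded to a String,
    // then encoded again when the value is written back out.
    // For well-formed UTF-8 input the result equals the input.
    static byte[] viaString(byte[] wire) {
        String upValue = new String(wire, StandardCharsets.UTF_8);
        return upValue.getBytes(StandardCharsets.UTF_8);
    }

    // Proposed path: keep the user-provided bytes untouched
    // (a defensive copy, no decoding or encoding at all).
    static byte[] viaBytes(byte[] wire) {
        return Arrays.copyOf(wire, wire.length);
    }

    // Size of a value if every char is written as 2 bytes
    // (the assumption made in the mail).
    static int twoBytesPerCharSize(String value) {
        return 2 * value.length();
    }

    public static void main(String[] args) {
        String description = "A test description";
        byte[] wire = description.getBytes(StandardCharsets.UTF_8);

        System.out.println("same bytes either way: "
            + Arrays.equals(viaString(wire), viaBytes(wire)));
        System.out.println("UTF-8 size:            " + wire.length);
        System.out.println("2 bytes per char size: " + twoBytesPerCharSize(description));
    }
}
```

Both paths yield the same bytes for valid UTF-8 input, so the decode and encode steps in the first one are pure overhead for the UP value. The normalized value of an HR attribute would still need the String conversion.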
