On Thu, Aug 22, 2013 at 2:56 PM, Emmanuel Lécharny <[email protected]> wrote:
> Hi guys,
>
> I have been thinking for years about using byte[] inside the server for
> values. I have tried more than once to get rid of the String, with no
> success so far : we are too dependent on Strings to get rid of them
> (for instance, the PrepareString method works on String, not on byte[],
> and the same goes for the various comparators, normalizers and syntax
> checkers).
>
> Bottom line, we have to keep the values as Strings.
>
> But is this true for every value ?
>
> In fact, we always store the received attribute's values in two
> different formats :
> - a normalized String (if it's a HR Attribute) which gets normalized
> yada yada
> - a UP String, which is the value as it has been provided by the user,
> and which is left untouched.
>
> Now, consider an add operation, followed by a search operation, from a
> specific attribute point of view (say, the 'description' AT)
>
> User add :
> ----------
> description:String ---> API ---> conversion to UTF-8 ---> Server
>
> Server AddHandler :
> -------------------
> description:byte[] ---> decoder ---> conversion to String ---> creation
> of the normValue ---> storage on disk ---> conversion of upValue and
> NormValue to byte[]
>
>
> User search :
> -------------
> send searchRequest
> ...
> wait for response
>
> Server SearchHandler :
> ----------------------
> fetch the entry => deserialize the Up and Norm value of the description
> AT (ie, byte[] to String conversion)
> entry processing through the interceptors
> write the SearchResultEntry ---> conversion of the description AT
> UpValue to byte[] (we don't care about the normValue at this point)
>
> User search :
> -------------
> ...
> convert the description Up value to String
>
>
> As we can see, in both operations, we are overdoing it : there is no
> need to convert the UpValue to a String, as we will do a byte[] ->
> String -> byte[] conversion of this UP value in the search. For the
> Add, it's slightly better (or less bad) : we can avoid a String -->
> byte[] conversion when storing the value.
>
>
> Making the UpValue a byte[] will save us a lot of wasted CPU, and
> probably a bit of space on disk, as a String requires 2 bytes per char
> to be serialized.
>

Even the byte[] will be of the same size, so I don't see where the gain
is. Am I missing something?

> This is something we have to work on before 2.0, as the underlying
> database will be impacted, as we will not serialize the UpValue as a
> String but as a byte[].
>
> Thoughts ?
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>

--
Kiran Ayyagari
http://keydap.com
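
To make the size question concrete, here is a minimal Java sketch of the two things being compared in this thread: the byte[] -> String -> byte[] round trip on the UP value, and the size of a value stored as UTF-8 bytes versus one stored at 2 bytes per char. The class and method names are hypothetical, and `DataOutputStream.writeChars` is only a stand-in for a 2-bytes-per-char encoding; the thread does not show how the server actually serializes Strings.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only, not server code.
class UpValueSketch {
    /** Size of the value when kept as the UTF-8 bytes received on the wire. */
    static int utf8Size(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Size of the value when every char is written as 2 bytes (assumed String encoding). */
    static int twoBytesPerCharSize(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeChars(value); // writes each char as a 2-byte value
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    /** The byte[] -> String -> byte[] round trip applied to the UP value. */
    static byte[] roundTrip(byte[] wireBytes) {
        String decoded = new String(wireBytes, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String description = "A test entry";
        byte[] wire = description.getBytes(StandardCharsets.UTF_8);

        System.out.println("UTF-8 bytes:      " + utf8Size(description));
        System.out.println("2 bytes per char: " + twoBytesPerCharSize(description));
        System.out.println("round trip unchanged: "
                + Arrays.equals(wire, roundTrip(wire)));
    }
}
```

Under these assumptions, an ASCII value takes half the space as UTF-8 bytes compared to a 2-bytes-per-char encoding. If the String is instead serialized as UTF-8, the sizes are the same and the only saving is the CPU spent on the round trip, which returns the original bytes unchanged for valid UTF-8 input.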
