Mike asked "Mamta, are you close to an implementation, maybe you could post a patch so that I could work off of that while discussion continues?" I hope to be able to post something in a day's time if everything goes fine.
Mamta

On 4/16/07, Mike Matrigali <[EMAIL PROTECTED]> wrote:
Below as quoted by Mamta are my views on this. I was hoping that compares
involving collation chars would not require twice the number of objects
being created. Below describes how store uses InstanceGetter currently to
optimize allocation of objects. I was hoping to preserve this performance
for current non-collation datatypes and also to avoid needing to provide
any additional collation information after the initial
dvf.instanceGetterFromIdentifiers call.

Just so I know where we are, Dan, do you have a problem with the proposed
interfaces, i.e. are they in the right place and taking the right
arguments? If they are, maybe we could incrementally implement the
interfaces so that I could continue the store side while the
implementation discussion continues. I would be ok with an initial
interface change that only supported current collation, so that I could
at least verify the store changes.

Mamta, are you close to an implementation? Maybe you could post a patch
so that I could work off of that while discussion continues.

Mamta Satoor wrote:
> Hi Dan,
>
> Here are my attempts to answer your questions.
>
> "Why use InstanceGetter here?" Because Store wants to obtain the
> InstanceGetter once and call getInstance on it multiple times. This is
> for efficiency reasons. This is what is currently done, but through
> interfaces on Monitor rather than DVF. Mike, maybe you can share your
> thoughts too on why Store does this.
>
> "It doesn't have to return another DVD, it can return itself if it is of
> the correct type, thus no additional overhead for UCS_BASIC collation.
> Thus this switch would happen once for the first collation, not every
> collation, and of course not happen at all if no collation is involved."
> I agree, but with the InstanceGetter approach, it doesn't even have to
> happen once because we will be generating the right DVD in the first
> place.
>
> "Could you show an example of how the store will be calling the code you
> are describing? Maybe that would help me out."
> Store would call something like the following (this is copied from what
> Mike wrote in this same thread, dated April 12th, 2nd mail from Mike,
> point 3). Again, Mike, if you have more to add from the Store point of
> view, please do so.
>
> Store will call the following once:
>     InstanceGetter = dvf.instanceGetterFromIdentifiers(format id,
>         collation id)
>
> Store will call the following many times:
>     dvd = InstanceGetter.getNewInstance()
>
> The reason for doing it this way is explained by Mike below:
>
> "3) optimized allocation, caching some of the work. This is used
> where one query may generate large number of rows - for instance
> hash table scan and sorter calls. Here the idea is to do some
> part of the work once leaving an InstanceGetter which then can
> repeatedly give back new objects in the most optimized way:
>
> again at this point dvd can be used to correctly compare against other
> dvd's in possible collate specific ways."
>
> thanks,
> Mamta
>
> On 4/14/07, *Daniel John Debrunner* <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>
> Mamta Satoor wrote:
> > Hi Dan,
> >
> > The problem we are trying to solve is to provide a way for Store to
> > call a method (say it's called
> > getInstanceGetterForFormatIDandCollationType) on DVF with format id &
> > collation type and get an InstanceGetter for that combination.
>
> Why use InstanceGetter here?
>
> > Like Mike mentioned in his earlier mail (in this same thread, dated
> > April 12th, 2nd mail from Mike) with point 3), Store will call this
> > method once and call getInstance on that InstanceGetter multiple times
> > to get the right DVD. If we don't change the InstanceGetter as I
> > suggested, then that would mean that we will be creating 2 DVD objects
> > for every character DVD through Store code. The worst part is we will
> > be doing this unnecessary creation of 2 DVDs even for databases which
> > want default collation.
> > The 2 DVD creations I am talking about are: first, through
> > InstanceGetter, we will get, say, SQLChar. Then, at the time of actual
> > collation comparison, it will have to call something like
> > StringDataValue.getCollationValue(int collationType) to get another
> > DVD to make sure that the collation is being performed with the right
> > DVD.
>
> It doesn't have to return another DVD, it can return itself if it is of
> the correct type, thus no additional overhead for UCS_BASIC collation.
> Thus this switch would happen once for the first collation, not every
> collation, and of course not happen at all if no collation is involved.
>
> > What I am suggesting does not make InstanceGetter complicated. It is
> > a pretty simple implementation. All I am proposing is to have a special
> > InstanceGetter class for collation sensitive DVDs. This new
> > InstanceGetter class will have a RuleBasedCollator (which will be set
> > the first time this InstanceGetter is created for the given database
> > through the DVF) and it will have a collation type (this collation type
> > will always be set to whatever collation type
> > getInstanceGetterForFormatIDandCollationType was called with. This
> > collation type will determine which kind of DVD to generate, i.e. one
> > with default collation or one with territory based collation). You
> > mentioned in your mail that "I got a little lost in the details".
> > Please let me know where it was unclear and I can try to explain it
> > better.
>
> Could you show an example of how the store will be calling the code you
> are describing? Maybe that would help me out.
>
> > As for your question about "does it take account of the fact that the
> > registered format ids are system wide and there can be databases with
> > different default collations in the same system?"
> > My understanding is that there is one DVF per database and these
> > InstanceGetters will be saved on DVF and hence I do not foresee any
> > problems in having multiple databases with different collations in
> > the same Derby system.
>
> Dan.
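
The calling pattern discussed in this thread (look up an InstanceGetter once per format id and collation id, then call getNewInstance() many times) might be sketched roughly as follows. This is only an illustration under stated assumptions, not Derby's actual code: the classes BasicChar, CollatedChar and ValueFactory, the constant values, and the use of java.text.Collator.getInstance(Locale) to obtain the collator are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical, simplified stand-ins for the types discussed in the thread.
// None of these classes or constants are Derby's real implementation.
class InstanceGetterSketch {
    static final int CHAR_FORMAT_ID = 1;             // made-up format id
    static final int COLLATION_UCS_BASIC = 0;        // made-up collation ids
    static final int COLLATION_TERRITORY_BASED = 1;

    /** Minimal stand-in for a character DVD with default (UCS_BASIC) ordering. */
    static class BasicChar {
        private String value;
        void setValue(String v) { value = v; }
        String getString() { return value; }
        /** Compares by UTF-16 code unit order. */
        int compare(BasicChar other) {
            return getString().compareTo(other.getString());
        }
    }

    /** Stand-in for a collation sensitive DVD; ordering comes from a Collator. */
    static class CollatedChar extends BasicChar {
        private final Collator collator;
        CollatedChar(Collator collator) { this.collator = collator; }
        @Override
        int compare(BasicChar other) {
            return collator.compare(getString(), other.getString());
        }
    }

    /** Hands out new objects of one fixed kind, decided when the getter is created. */
    interface InstanceGetter {
        BasicChar getNewInstance();
    }

    /** Stand-in for a per-database factory that owns the database's collator. */
    static class ValueFactory {
        private final Collator collator;

        ValueFactory(Locale territory) {
            this.collator = Collator.getInstance(territory);
        }

        /** Called once; the collation decision is made here, not per object. */
        InstanceGetter instanceGetterFromIdentifiers(int formatId, int collationId) {
            if (formatId != CHAR_FORMAT_ID) {
                throw new IllegalArgumentException("unknown format id: " + formatId);
            }
            switch (collationId) {
                case COLLATION_UCS_BASIC:
                    return BasicChar::new;
                case COLLATION_TERRITORY_BASED:
                    return () -> new CollatedChar(collator);
                default:
                    throw new IllegalArgumentException("unknown collation id: " + collationId);
            }
        }
    }

    public static void main(String[] args) {
        ValueFactory dvf = new ValueFactory(Locale.ENGLISH);

        // Looked up once ...
        InstanceGetter getter =
            dvf.instanceGetterFromIdentifiers(CHAR_FORMAT_ID, COLLATION_TERRITORY_BASED);

        // ... then used many times, one allocation per value.
        BasicChar a = getter.getNewInstance();
        a.setValue("a");
        BasicChar b = getter.getNewInstance();
        b.setValue("B");
        System.out.println("collated a < B: " + (a.compare(b) < 0));
    }
}
```

In this sketch each call to getNewInstance() allocates exactly one object, and the UCS_BASIC getter never touches the collator, which is the property the thread is trying to preserve for non-collation datatypes.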
