On Tue, Aug 16, 2011 at 4:38 PM, Emmanuel Lécharny <[email protected]> wrote:
> On 8/16/11 3:27 PM, Alex Karasulu wrote:
>
>> On Tue, Aug 16, 2011 at 10:53 AM, Emmanuel Lécharny <[email protected]> wrote:
>>
>> On 8/15/11 5:59 PM, Stefan Seelmann wrote:
>>>
>>> Now I have to update the parts that are a bit special, let me explain:
>>>> In HBase partition I didn't use one-level and sub-level indices, but
>>>> use the RDN index table instead. I also extended the search engine in
>>>> that way that one-level and sub-level cursors get the search filter in
>>>> order to perform filtering within the store instead of returning all
>>>> candidates and evaluate them.
>>>>
>>>> Some thoughts about this one-level/sub-level index.
>>>
>>> Using the Rdn index makes perfect sense: we have the Rdn -> parent
>>> relation plus the parent -> children relation in this index, so there
>>> is no need to have a one level index (all the children are already
>>> listed in the RDN index for a specific entry). I'm a bit more concerned
>>> about the sub-level processing: we have to recurse on all the children
>>> to get all the candidates. That's fine, we can easily implement that
>>> (and you already did), but what concerns me is that we don't have the
>>> count of all the entries, we will have to compute them. This count is
>>> necessary in the search engine to select the index we will use to walk
>>> the entries.
>>>
>>> One solution would be to store two more elements in the ParentIdAndRdn
>>> data structure: the number of children directly below the RDN, and the
>>> number of children and descendants. That would probably solve the issue
>>> I'm mentioning. Of course, that also means we will have to update all
>>> the RDN hierarchy from top to bottom (but affecting only the RDN part
>>> of the entry DN) each time we add/move/delete an entry. Note that we
>>> already do that for the oneLevel and Sublevel index.
>>>
>>>
>>> Good idea Emmanuel.
>>
>
> Note that I just rephrased Stefan's idea here. It's not mine initially.
>
>
>> This would be a neat solution to handling the sub level count problem.
>> Let's experiment with this and see if it does in fact lead to a speedup,
>> which I think it should, but it's good to just see. I wish we had a nice
>> lab for this.
>>
> HBase work done by Stefan is already an excellent lab :)
>
>

Yes, it is a very good exercise, which shows the interfaces and design are
holding up pretty well if these relatively minor issues are all we have to
worry about. But what I meant was a lab where machines are ready to run
tests on our nightly builds, not just the experience of writing this
partition :-). It would be neat to see the progression with these changes
over time.

--
Best Regards,
-- Alex
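
To make the counter idea from the quoted discussion concrete (two extra values kept next to each RDN entry: the number of direct children, and the number of children plus descendants), here is a rough Java sketch. It is only an illustration under assumptions: the class RdnCountSketch, the Node type and the add/delete helpers are hypothetical names, not the real ParentIdAndRdn class or any ApacheDS API, and it uses an in-memory tree rather than an index table. It also walks from the changed entry up to the root, which is one possible way of touching every ancestor.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; this is not the ApacheDS ParentIdAndRdn class. */
class RdnCountSketch {

    /** One RDN entry carrying the two proposed counters. */
    static final class Node {
        final String rdn;
        final Node parent;
        final Map<String, Node> children = new HashMap<>();
        int nbChildren;    // entries directly below this RDN (one-level count)
        int nbDescendants; // all entries below this RDN (sub-level count)

        Node(String rdn, Node parent) {
            this.rdn = rdn;
            this.parent = parent;
        }
    }

    /** Adds a leaf under parent and updates the counters of every ancestor. */
    static Node add(Node parent, String rdn) {
        Node child = new Node(rdn, parent);
        parent.children.put(rdn, child);
        parent.nbChildren++;
        for (Node n = parent; n != null; n = n.parent) {
            n.nbDescendants++;
        }
        return child;
    }

    /** Deletes a leaf entry and updates the counters of every ancestor. */
    static void delete(Node leaf) {
        if (leaf.parent == null) {
            throw new IllegalArgumentException("cannot delete the root");
        }
        if (!leaf.children.isEmpty()) {
            throw new IllegalStateException("only leaf entries can be deleted");
        }
        Node parent = leaf.parent;
        parent.children.remove(leaf.rdn);
        parent.nbChildren--;
        for (Node n = parent; n != null; n = n.parent) {
            n.nbDescendants--;
        }
    }
}
```

With counters like these, a search engine could read the one-level or sub-level candidate count for a base entry directly instead of recursing over the children to compute it. The price is one counter update per ancestor on each add or delete; a move could be handled as a delete followed by an add in this sketch.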
