Re: HBase partition integration in trunks ?

Emmanuel Lécharny Tue, 16 Aug 2011 04:27:45 -0700

On 8/15/11 5:59 PM, Stefan Seelmann wrote:

> Now I have to update the parts that are a bit special, let me explain:
> In HBase partition I didn't use one-level and sub-level indices, but
> use the RDN index table instead. I also extended the search engine in
> that way that one-level and sub-level cursors get the search filter in
> order to perform filtering within the store instead of returning all
> candidates and evaluate them.

Some thoughts about this one-level/sub-level index.

Using the Rdn index makes perfect sense : we have the Rdn -> parent relation plus the parent -> children relation in this index, so there is no need to have a one-level index (all the children are already listed in the RDN index for a specific entry). I'm a bit more concerned about the sub-level processing : we have to recurse on all the children to get all the candidates. That's fine, we can easily implement that (and you already did), but what concerns me is that we don't have the count of all the entries, so we will have to compute it. This count is necessary in the search engine to select the index we will use to walk the entries.
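To make the concern concrete, here is a minimal sketch of the two relations and of the walk needed to get the sub-level candidates. It is an in-memory stand-in, not the ApacheDS or HBase partition code: the class `RdnIndexSketch`, its methods, and the use of plain `long` ids are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the RDN index (not the real API).
class RdnIndexSketch {
    // child id -> parent id (the Rdn -> parent relation)
    private final Map<Long, Long> parentOf = new HashMap<>();
    // parent id -> child ids (the parent -> children relation)
    private final Map<Long, List<Long>> childrenOf = new HashMap<>();

    void add(long parentId, long id) {
        parentOf.put(id, parentId);
        childrenOf.computeIfAbsent(parentId, k -> new ArrayList<>()).add(id);
    }

    // One-level candidates: the direct children, read straight from the index.
    List<Long> oneLevel(long id) {
        return childrenOf.getOrDefault(id, Collections.emptyList());
    }

    // Sub-level candidates: every descendant, found by walking the children.
    List<Long> subLevel(long id) {
        List<Long> result = new ArrayList<>();
        Deque<Long> pending = new ArrayDeque<>(oneLevel(id));
        while (!pending.isEmpty()) {
            long current = pending.pop();
            result.add(current);
            pending.addAll(oneLevel(current));
        }
        return result;
    }

    // Without a stored counter, the count costs a full walk of the subtree.
    int subLevelCount(long id) {
        return subLevel(id).size();
    }
}
```

The one-level lookup is a single read, but `subLevelCount` has to visit the whole subtree, which is the cost the search engine would pay just to pick an index.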

One solution would be to store two more elements in the ParentIdAndRdn data structure : the number of children directly below the RDN, and the number of children and descendants. That would probably solve the issue I'm mentioning. Of course, that also means we will have to update the whole RDN hierarchy from top to bottom (but affecting only the RDN part of the entry DN) each time we add/move/delete an entry. Note that we already do that for the one-level and sub-level indexes.
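A rough sketch of that idea, under stated assumptions: `CountedRdnNode` is a hypothetical class loosely modeled on ParentIdAndRdn with the two extra counters, and `CountedRdnIndex` is an in-memory map standing in for the index table. The field names, the root id `0`, and the choice to walk from the changed entry up to the root (rather than top to bottom) are illustrative, not the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical node, loosely modeled on ParentIdAndRdn plus two counters.
class CountedRdnNode {
    final long parentId;
    final String rdn;
    int nbChildren;     // entries directly below this RDN
    int nbDescendants;  // children plus all deeper descendants

    CountedRdnNode(long parentId, String rdn) {
        this.parentId = parentId;
        this.rdn = rdn;
    }
}

// Hypothetical in-memory index keeping the counters up to date.
class CountedRdnIndex {
    static final long ROOT_ID = 0L;
    private final Map<Long, CountedRdnNode> nodes = new HashMap<>();

    CountedRdnIndex() {
        // The root has no parent; -1 is never a valid id in this sketch.
        nodes.put(ROOT_ID, new CountedRdnNode(-1L, ""));
    }

    void add(long id, long parentId, String rdn) {
        CountedRdnNode parent = nodes.get(parentId);
        if (parent == null || nodes.containsKey(id)) {
            throw new IllegalArgumentException("unknown parent or duplicate id");
        }
        nodes.put(id, new CountedRdnNode(parentId, rdn));
        parent.nbChildren++;
        adjustAncestors(parentId, 1);
    }

    // Only leaf entries can be deleted, as in LDAP.
    void delete(long id) {
        CountedRdnNode node = nodes.get(id);
        if (id == ROOT_ID || node == null || node.nbChildren != 0) {
            throw new IllegalArgumentException("not a deletable leaf: " + id);
        }
        nodes.remove(id);
        nodes.get(node.parentId).nbChildren--;
        adjustAncestors(node.parentId, -1);
    }

    // Apply the delta to every ancestor, from startId up to the root.
    private void adjustAncestors(long startId, int delta) {
        CountedRdnNode node = nodes.get(startId);
        while (node != null) {
            node.nbDescendants += delta;
            node = nodes.get(node.parentId);
        }
    }

    int childCount(long id) { return nodes.get(id).nbChildren; }

    int descendantCount(long id) { return nodes.get(id).nbDescendants; }
}
```

With the counters stored, both counts become a single lookup, at the price of one update per ancestor on each add or delete. A move could be handled the same way by applying a delta of one plus the moved entry's `nbDescendants` to the old and new ancestor chains.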

All in all, I do think this is feasible, and you have probably already implemented such logic in the HBase partition.

Can you tell me if what I wrote above makes sense for HBase but also for the whole system ?

If we could get rid of the one-level/sub-level indexes, we would speed up the add/move/delete operations greatly (as we would spare two index updates), saving probably 25% of the time needed to update the backend (we would have just 5 indexes to update instead of 7). It might also speed up the search marginally, as we won't have to do a lookup in the one-level or sub-level index to build the scope filter.



--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
