Hi,

I'm wondering what your design of the key for your index looks like?

My own initial implementation of an inverted index is to create a
separate table for each distinct column.

Example (key - Col : Col : Col)

A - Id1 : Id2 : Id3
B - Id4 : Id5 : Id6
C - Id7 : Id8 : Id9

This is convenient; however, it can lead to problems, as the number
of columns (each holding a referenced key) can grow extremely large.
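
A minimal sketch of this wide layout, with a plain Python dict standing in
for the index table. The names wide_index, wide_add and wide_lookup are
made up for illustration and are not part of any HBase API:

```python
from collections import defaultdict

# One index row per distinct value; every referenced key becomes one more
# column in that row. A dict of sets stands in for the index table.
wide_index = defaultdict(set)

def wide_add(value, row_id):
    """Record that the data row `row_id` contains `value`."""
    wide_index[value].add(row_id)

def wide_lookup(value):
    """Return all referenced keys for `value` from a single index row."""
    return sorted(wide_index.get(value, ()))

for value, row_id in [("A", "Id1"), ("A", "Id2"), ("A", "Id3"), ("B", "Id4")]:
    wide_add(value, row_id)
```

A lookup touches exactly one row, but that row grows by one column for
every data row sharing the value, which is the growth problem mentioned
above.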

As far as I understand, the contributed Indexer in HBase maintains
its indexes in this way (I only had a quick look at it):

AId1
AId2
AId3
BId4
BId5
BId6
CId7
CId8
CId9
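
A minimal sketch of this tall layout, assuming a sorted list of composite
row keys in place of the index table and a prefix scan for lookups. The
separator byte is my own assumption (the keys above are shown without
one); it keeps the value "A" from matching keys that belong to "AB". The
names tall_index, tall_add and tall_lookup are hypothetical:

```python
import bisect

SEP = "\x00"  # assumed separator; must not occur inside indexed values

# Sorted composite row keys (value + SEP + referenced key), one per posting.
tall_index = []

def tall_add(value, row_id):
    """Insert one index row, keeping the key list sorted and duplicate-free."""
    key = value + SEP + row_id
    pos = bisect.bisect_left(tall_index, key)
    if pos == len(tall_index) or tall_index[pos] != key:
        tall_index.insert(pos, key)

def tall_lookup(value):
    """Prefix scan: collect referenced keys from all rows starting with value + SEP."""
    prefix = value + SEP
    start = bisect.bisect_left(tall_index, prefix)
    result = []
    for key in tall_index[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so the matching range has ended
        result.append(key[len(prefix):])
    return result
```

Here each posting is its own row, so no single row grows without bound;
the cost is that a lookup becomes a range scan instead of a single row
read.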

I would really like to hear some opinions on this.

/SJ


which then contains the value as the key and the related columns
contain the keys to the occurrences. The drawback is that the colu

On Fri, Jul 23, 2010 at 1:44 PM, Luke Forehand
<[email protected]> wrote:
> Jamie Cockrill <jamie.cockr...@...> writes:
>
>>
>> Luke,
>>
>> Apologies no, I've been rather sidelined by another issue at the
>> moment. It's always the same, you get to playing with something
>> interesting and you get pulled off to fight fires somewhere else. Once
>> I get back on the case I'll have a look, however someone did
>> previously mention another library built by the guys building Lily
>> that seems to aim to achieve the same goal. Reposted again here:
>>
>> http://lilycms.org/lily/roadmap/sketchbook/hbaseindexes.html
>>
>> I've not had time to look at it in detail, but it might be a good
>> starting point to get something up and going quickly if that's what
>> you're after.
>>
>> Ta,
>>
>> Jamie
>>
>> On 23 July 2010 16:58, Luke Forehand
>> <luke.foreh...@...> wrote:
>> > Jamie Cockrill <jamie.cockr...@...> writes:
>> >
>
> Jamie,
>
> Thanks for the lead, I'm taking a look at the hbaseindex src now.  I'm now
> leaning toward writing and maintaining my own secondary index rather than use
> the contrib IndexedTable stuff.  With IndexedTable I don't have enough control
> over the index row key construction, and that is important depending on how I
> want to scan/filter the indexed table.  Also, writing the index table from an
> existing huge table with IndexedTableAdmin takes too long and would be better
> suited as a Map Reduce job.  These are just a few observations I've made
> after a somewhat cursory glance at the code.
>
> -Luke
>
>
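
The two points in the quoted message (controlling the index row key, and
building the index from an existing table as a batch job) could be
sketched roughly as below. This is plain Python run locally, not the
IndexedTable or Hadoop API; index_key, map_row, build_index, the
separator and the sample column names are all assumptions made for
illustration:

```python
def index_key(value, row_key, sep="\x00"):
    """Build the index row key. Owning this function is what gives control
    over how the index table can later be scanned and filtered."""
    return value + sep + row_key

def map_row(row_key, columns, indexed_column):
    """Map step: emit an index row key if the data row has the indexed column."""
    value = columns.get(indexed_column)
    if value is not None:
        yield index_key(value, row_key)

def build_index(table, indexed_column):
    """Local stand-in for the job: run the map step over every data row
    and return the emitted index row keys in sorted order."""
    keys = []
    for row_key, columns in table.items():
        keys.extend(map_row(row_key, columns, indexed_column))
    return sorted(keys)

# Example data table: row key -> {column: value}
table = {
    "Id1": {"color": "A"},
    "Id2": {"color": "B"},
    "Id3": {"size": "L"},
    "Id4": {"color": "A"},
}
color_index = build_index(table, "color")
```

Because each data row is mapped independently, the map step can be spread
across many workers, which is why a batch job suits a huge existing table
better than building the index row by row from a single client.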
