I added a pointer to the message below to our book as 'intro to secondary
indexing in hbase'.
St.Ack

On Fri, Mar 25, 2011 at 8:39 AM, Buttler, David <[email protected]> wrote:
> Do you know what it means to make secondary indexing a feature?  There are 
> two reasonable outcomes:
> 1) adding ACID semantics (and thus killing scalability)
> 2) allowing the secondary index to be out of date (leading to every naïve 
> user claiming that there is a serious bug that must be fixed).
>
> Secondary indexes are basically another way of storing (part of) the data.  
> E.g. another table, sorted on the field(s) that you want to search on.  In 
> order to ensure consistency between the primary table and the secondary table 
> (index), you have to guarantee that whenever you mutate the primary table, 
> the secondary table is mutated in the same atomic transaction.  Since HBase 
> only has row-level locks, this can't be guaranteed across tables.
>
> The situation is not hopeless, because in many cases you don't need to have 
> perfectly consistent data and can afford to wait for cleanup tasks.  For some 
> applications, you can ensure that the index is updated close enough to the 
> table update (using external transactions, or something similar) that users 
> would never notice.  One way to implement an eventually consistent secondary 
> index would be to mimic the way cluster replication is done.
>
> However, what I have described is difficult to do generically -- and there 
> are engineering tradeoffs that need to be made.  If you absolutely need a 
> transactional and consistent secondary index, I would suggest using Oracle, 
> MySQL, or another relational database, where this was designed in as a 
> primary feature.  Just don't complain that they are too slow or don't scale 
> as well as HBase.
>
> </rant>
>
> Sorry for the rant.  If you want to have a secondary index here is what you 
> need to do:
> Modify your application so that every time you write to the primary table, 
> you also write to a secondary table, keyed off of the values you want to 
> search on.  If you can't guarantee that the values form a secondary key (i.e. 
> are unique across your entire table), you can make your key a compound key 
> (see, for example, how "tsuna" designed OpenTSDB) with your primary key as a 
> component.
>
> Then, when you need to query, you can do range queries over the secondary 
> table to retrieve the matching primary-table keys, and then fetch the full 
> data rows from the primary table.
>
> Dave
>
> -----Original Message-----
> From: Wei Shung Chung [mailto:[email protected]]
> Sent: Friday, March 25, 2011 12:04 AM
> To: [email protected]
> Subject: Re: Stargate+hbase
>
> I need to use secondary indexing too; hopefully this important feature
> will be made available soon :)
>
> Sent from my iPhone
>
> On Mar 25, 2011, at 12:48 AM, Stack <[email protected]> wrote:
>
>> There is no native support for secondary indices in HBase (currently).
>> You will have to manage it yourself.
>> St.Ack
>>
>> On Thu, Mar 24, 2011 at 10:47 PM, sreejith P. K. <[email protected]> wrote:
>>> I have tried secondary indexing, but it seems I am missing something. Could
>>> you please explain how it is possible using secondary indexing?
>>>
>>>
>>> I have tried a layout like this:
>>>
>>> row1         ColumnFamily1:kwd1
>>>              ColumnFamily1:kwd2
>>>              ColumnFamily1:kwd3
>>>              ColumnFamily1:kwd2
>>>
>>> row2         ColumnFamily1:kwd1
>>>              ColumnFamily1:kwd2
>>>              ColumnFamily1:kwd4
>>>              ColumnFamily1:kwd5
>>>
>>> I need to get all rows which contain both kwd1 and kwd2.
>>>
>>> Please help.
>>> Thanks
>>>
>>>
>>> On Thu, Mar 24, 2011 at 9:57 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>>
>>>> What you are asking for is a secondary index, and it doesn't exist at
>>>> the moment in HBase (let alone REST). Googling a bit for "hbase
>>>> secondary indexing" will show you how people usually do it.
>>>>
>>>> J-D
>>>>
>>>> On Thu, Mar 24, 2011 at 6:18 AM, sreejith P. K. <[email protected]> wrote:
>>>>> Is it possible, using the Stargate interface to HBase, to fetch all
>>>>> rows where more than one column family:<qualifier> must be present?
>>>>>
>>>>> For example: select rows which contain keyword:a and keyword:b?
>>>>>
>>>>> Thanks
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Sreejith PK
>>> Nesote Technologies (P) Ltd
>>>
>
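
For anyone following the thread, here is a minimal sketch of the application-managed index Dave describes above, applied to the "rows with kwd1 and kwd2" question. It is not HBase client code: SortedTable is a hypothetical in-memory stand-in for a table whose rows are sorted by key, and write_row, rows_with, rows_with_all and the NUL separator are names and choices assumed purely for illustration. It also assumes keywords never contain the separator byte.

```python
import bisect

class SortedTable:
    """Hypothetical in-memory stand-in for a table sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> value

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)

    def scan(self, start, stop):
        """Yield (key, value) for start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]

SEP = "\x00"  # assumed separator between keyword and primary key

primary = SortedTable()  # row key -> set of keywords
index = SortedTable()    # compound key "keyword + SEP + row key" -> row key

def write_row(row_key, keywords):
    """Write to the primary table, then to the index table.

    The two writes are NOT atomic; a failure in between leaves the
    index out of date, which is the tradeoff discussed in the thread.
    """
    primary.put(row_key, set(keywords))
    for kwd in keywords:
        # The primary key is part of the index key, so the same keyword
        # can point at many rows without the index keys colliding.
        index.put(kwd + SEP + row_key, row_key)

def rows_with(kwd):
    """Range-scan the index for one keyword; return primary row keys."""
    start = kwd + SEP
    stop = kwd + "\x01"  # first key after every "kwd + SEP + ..." key
    return {row_key for _, row_key in index.scan(start, stop)}

def rows_with_all(*kwds):
    """Intersect the per-keyword results, then fetch the full rows."""
    sets = [rows_with(k) for k in kwds]
    matches = set.intersection(*sets) if sets else set()
    return {row_key: primary.get(row_key) for row_key in sorted(matches)}

# Sample data, loosely following the layout posted earlier in the thread.
write_row("row1", ["kwd1", "kwd2", "kwd3"])
write_row("row2", ["kwd1", "kwd2", "kwd4", "kwd5"])
write_row("row3", ["kwd2", "kwd5"])
```

Calling rows_with_all("kwd1", "kwd2") does one range scan per keyword and intersects the results in the client. With a real cluster the same shape would presumably use two tables and prefix scans, and the cleanup of stale index entries would still be up to the application.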
