[Hadoop Wiki] Update of "Hbase/SecondaryIndexing" by jgray

Apache Wiki Mon, 28 Feb 2011 10:43:44 -0800


The "Hbase/SecondaryIndexing" page has been changed by jgray.
http://wiki.apache.org/hadoop/Hbase/SecondaryIndexing

--------------------------------------------------

New page:
= HBase Secondary Indexing =

This is a design document covering different approaches to secondary indexing in 
HBase.

== Eventually Consistent Secondary Indexes using Coprocessors ==

The basic idea is to use an additional (secondary) table for each index on the 
main (primary) table.  A coprocessor bound to a column family would define a 
given secondary index on that family (or on specific column(s) within it).

When a Put comes in to the primary table, the following would happen:

1. Generate WALEdit for primary table
2. Generate a new, special kind of WALEdit for secondary table update
3. 
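
The first two steps above can be modelled with a small in-memory sketch. This is 
a toy illustration only and does not use the HBase or coprocessor API. The class 
name `IndexedTable`, the index row key layout (indexed value, a NUL separator, 
then the primary row key) and the `drain` step that applies logged index edits 
later are all assumptions made for the example; the page does not yet say how 
or when the secondary edit is applied.

```python
from collections import deque

SEP = "\x00"  # assumed separator between indexed value and primary row key


class IndexedTable:
    """Toy in-memory model of a primary table with one secondary index."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.primary = {}       # primary row key -> {column: value}
        self.index = {}         # index row key -> primary row key
        self.wal = []           # append-only log of every edit
        self.pending = deque()  # index edits that are logged but not applied

    def put(self, row, column, value):
        # Step 1: log the edit for the primary table.
        self.wal.append(("primary", row, column, value))
        if column == self.indexed_column:
            # Step 2: log a separate, special edit for the secondary table.
            edit = ("index", value + SEP + row, row)
            self.wal.append(edit)
            self.pending.append(edit)
        # The primary table is updated immediately.
        self.primary.setdefault(row, {})[column] = value

    def drain(self):
        """Apply pending index edits (hypothetical step, not from the page)."""
        applied = 0
        while self.pending:
            _, index_key, row = self.pending.popleft()
            self.index[index_key] = row
            applied += 1
        return applied

    def lookup(self, value):
        """Return primary row keys whose indexed column equals value."""
        prefix = value + SEP
        return sorted(r for k, r in self.index.items() if k.startswith(prefix))
```

Until `drain` runs, `lookup` does not see a new row even though the primary 
table already holds it, which is the eventually consistent behaviour the section 
title refers to. The sketch does not remove stale index entries when an indexed 
value is overwritten, and it assumes indexed values never contain the separator.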



Open questions:

* How to deal with creation of secondary tables
* 


Future work:

* Declaration of indexes via API or shell syntax rather than programmatically 
with a coprocessor-per-index
* Creation of indexes on existing tables (build the index from the current 
data, then keep it up to date)
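
As a rough illustration of the last item, building an index over existing data 
amounts to one pass over the primary table. The sketch below is a plain Python 
model, not HBase code; the function name `build_index` and the index row key 
layout (indexed value, a NUL separator, then the primary row key) are 
assumptions made for the example.

```python
def build_index(primary, indexed_column, sep="\x00"):
    """Build an index mapping from a snapshot of the primary table.

    primary: dict of primary row key -> {column: value}
    Returns a dict of index row key -> primary row key.
    """
    index = {}
    for row, columns in primary.items():
        # Rows that lack the indexed column get no index entry.
        if indexed_column in columns:
            index[columns[indexed_column] + sep + row] = row
    return index
```

This covers only the initial build. Keeping the index up to date while writes 
continue to arrive during the build is the harder part and is not addressed 
here.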


== Secondary Indexes using Optimistic Concurrency Control ==

These are implemented by Transactional HBase / IndexedTable.

Currently this lives here:  
https://github.com/hbase-trx/hbase-transactional-tableindexed


== In-memory Secondary Indexes for Indexed Scans ==

This was implemented once but I'm not sure where it lives anymore.
