The "Hbase/SecondaryIndexing" page has been changed by jgray.
http://wiki.apache.org/hadoop/Hbase/SecondaryIndexing

--------------------------------------------------

New page:
= HBase Secondary Indexing =

This is a design document on different approaches to secondary indexing in HBase.

== Eventually Consistent Secondary Indexes using Coprocessors ==

The basic idea is to use an additional (secondary) table for each index on the main (primary) table. A coprocessor bound to a family would be used to define a given secondary index on that family (or on specific columns within it).

When a Put comes in to the primary table, the following would happen:

 1. Generate a WALEdit for the primary table
 2. Generate a new, special kind of WALEdit for the secondary table update
 3.

Open questions:
 * How to deal with creation of secondary tables
 * Future work:
   * Declaration of indexes via API or shell syntax rather than programmatically with a coprocessor-per-index
   * Creation of indexes on existing tables (build of indexes based on current data and kept up to date)

== Secondary Indexes using Optimistic Concurrency Control ==

These are implemented by Transactional HBase / IndexedTable. Currently this lives here: https://github.com/hbase-trx/hbase-transactional-tableindexed

== In-memory Secondary Indexes for Indexed Scans ==

This was implemented once, but I'm not sure where it lives anymore.
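
As a rough illustration of the Put flow in the eventually consistent coprocessor approach, here is a toy in-memory model in plain Java. It is only a sketch: it uses no HBase classes, and every name in it (`SecondaryIndexSketch`, `IndexEdit`, `drainIndexEdits`, `lookup`) is hypothetical. The page lists only steps 1 and 2, so applying the queued edits in a separate, later pass is an assumption made here to show the window during which the index lags the primary table.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these names are HBase APIs.
class SecondaryIndexSketch {

    // Hypothetical stand-in for the "special kind of WALEdit" for the secondary table.
    static final class IndexEdit {
        final String primaryRow;
        final String oldValue; // null if the row had no previous value
        final String newValue;

        IndexEdit(String primaryRow, String oldValue, String newValue) {
            this.primaryRow = primaryRow;
            this.oldValue = oldValue;
            this.newValue = newValue;
        }
    }

    // Primary table: row key -> value of the indexed column.
    private final Map<String, String> primary = new HashMap<>();
    // Secondary table: indexed value -> primary row keys.
    private final Map<String, Set<String>> secondary = new HashMap<>();
    // Index edits that were logged but not yet applied (assumed mechanism).
    private final Queue<IndexEdit> pendingIndexEdits = new ArrayDeque<>();

    // Steps 1 and 2 from the list: write the primary row, then log an index edit.
    void put(String row, String value) {
        String old = primary.put(row, value);
        pendingIndexEdits.add(new IndexEdit(row, old, value));
    }

    // Assumed later pass: apply queued edits to the secondary table.
    // Returns the number of edits applied.
    int drainIndexEdits() {
        int applied = 0;
        IndexEdit edit;
        while ((edit = pendingIndexEdits.poll()) != null) {
            if (edit.oldValue != null) {
                Set<String> rows = secondary.get(edit.oldValue);
                if (rows != null) {
                    rows.remove(edit.primaryRow);
                    if (rows.isEmpty()) {
                        secondary.remove(edit.oldValue);
                    }
                }
            }
            secondary.computeIfAbsent(edit.newValue, k -> new TreeSet<>())
                     .add(edit.primaryRow);
            applied++;
        }
        return applied;
    }

    // Index lookup: primary row keys currently indexed under the given value.
    Set<String> lookup(String value) {
        Set<String> rows = secondary.get(value);
        return rows == null ? Collections.<String>emptySet()
                            : Collections.unmodifiableSet(rows);
    }

    public static void main(String[] args) {
        SecondaryIndexSketch idx = new SecondaryIndexSketch();
        idx.put("row1", "blue");
        System.out.println("before drain: " + idx.lookup("blue"));
        idx.drainIndexEdits();
        System.out.println("after drain:  " + idx.lookup("blue"));
    }
}
```

In this model a reader of the index can miss a row between `put` and `drainIndexEdits`, which is the kind of staleness "eventually consistent" implies. How a real coprocessor would persist and replay the secondary edit is left open by the page.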
