Check out transactional hbase under contrib. It includes an indexed hbase, a generalized means of maintaining secondary indexes. It uses transactional hbase to keep the primary table and the index table consistent; i.e. if the insert into either the primary or the index fails, the whole insert is "rolled back".
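To make the "rolled back" idea concrete, here is a minimal sketch in plain Java. It is not the contrib API: the two in-memory maps stand in for the primary and index tables, and the class name, the put method, and the failIndexWrite switch are all hypothetical names made up for illustration. It only shows the shape of the behaviour, where a failed index write undoes the primary write so the two never disagree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
class IndexedPutSketch {
    // Stand-ins for the primary table (row -> value) and the index table.
    final Map<String, String> primary = new HashMap<>();
    final Map<String, String> index = new HashMap<>();

    // Hook used to simulate a failed write to the index table.
    boolean failIndexWrite = false;

    // Index entries are keyed by "value:row" so equal values in different rows do not collide.
    static String indexKey(String value, String row) {
        return value + ":" + row;
    }

    // Writes the primary row, then the index entry.
    // If the index write fails, the primary row is restored to its previous state.
    boolean put(String row, String value) {
        String previous = primary.put(row, value);
        try {
            writeIndex(value, row);
            // Drop the stale index entry once the new one is in place.
            if (previous != null && !previous.equals(value)) {
                index.remove(indexKey(previous, row));
            }
            return true;
        } catch (RuntimeException e) {
            // "Roll back": undo the primary write.
            if (previous == null) {
                primary.remove(row);
            } else {
                primary.put(row, previous);
            }
            return false;
        }
    }

    void writeIndex(String value, String row) {
        if (failIndexWrite) {
            throw new IllegalStateException("simulated index failure");
        }
        index.put(indexKey(value, row), row);
    }

    public static void main(String[] args) {
        IndexedPutSketch tables = new IndexedPutSketch();
        System.out.println("first put ok: " + tables.put("row1", "blue"));
        tables.failIndexWrite = true;
        System.out.println("second put ok: " + tables.put("row1", "red"));
        System.out.println("row1 is still: " + tables.primary.get("row1"));
    }
}
```

A real cluster also has to cope with a client dying between the two writes, which this single-process sketch cannot show; that is the part the transactional layer is there to handle.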
St.Ack

On Fri, Sep 11, 2009 at 7:09 AM, Matt Corgan <[email protected]> wrote:
> Does anyone have any tips or strategies for keeping an index in sync with
> its data? I'd of course update the index immediately after the data, but
> over time there will inevitably be inconsistencies. Do people just run
> periodic clean-up jobs?
>
> On a related note, how important is batching updates from a performance
> standpoint? In MySQL it is significant, but the write path in HBase seems
> so fast that it may not matter much except for network latency. Would you
> recommend updating 1000 data rows, then applying the 1000 index updates,
> or interleaving the updates row-by-row?
>
> Congrats on the new release! Looks awesome.
> Matt
>
