On Sat, Feb 12, 2011 at 10:31 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> Right, the concepts aren't that hard (write ahead log etc), however to
> keep the data transactionally consistent with another datastore across
> servers [I believe] is a little more difficult?


I assume you don't really need ACID transactions, but only the guarantee
that when you update an HBase row, its index will eventually be updated too
(possibly with a little "RT" delay)?

[As you probably know, ] the basic solution to do this across systems is a
write-ahead log that lives outside of both systems, i.e. the sequence to
perform an update would be:
 (1) write update to the WAL
 (2) perform update on HBase
 (3) perform update on Lucene

If it fails anywhere in between, one can always replay from the WAL. If you
add a write-ahead log just to e.g. Katta, that alone won't give you
consistency across the systems, since the process could still fail between
doing the update to HBase and writing to the Katta WAL.
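
A minimal sketch of what that sequence looks like in code (the WAL, store and
index interfaces below are hypothetical stand-ins, not any real API; the point
is only the ordering of the three steps and the idempotent replay):

import java.io.IOException;

// Hypothetical interfaces standing in for the external WAL, the HBase table
// and the Lucene/Katta index.
interface ExternalWal {
    long append(byte[] rowKey, byte[] payload) throws IOException;  // durable intent
    void markDone(long sequenceId) throws IOException;              // entry may be trimmed
}

interface RowStore {
    void put(byte[] rowKey, byte[] payload) throws IOException;     // the HBase update
}

interface SearchIndex {
    void index(byte[] rowKey, byte[] payload) throws IOException;   // the Lucene update
}

public class WalCoordinatedUpdater {
    private final ExternalWal wal;
    private final RowStore hbase;
    private final SearchIndex lucene;

    public WalCoordinatedUpdater(ExternalWal wal, RowStore hbase, SearchIndex lucene) {
        this.wal = wal;
        this.hbase = hbase;
        this.lucene = lucene;
    }

    public void update(byte[] rowKey, byte[] payload) throws IOException {
        long seq = wal.append(rowKey, payload);  // (1) write the update to the WAL
        hbase.put(rowKey, payload);              // (2) perform the update on HBase
        lucene.index(rowKey, payload);           // (3) perform the update on Lucene
        wal.markDone(seq);                       // only now may the WAL entry be trimmed
    }

    // On restart, every WAL entry that was never marked done is replayed, so
    // both put() and index() have to be idempotent: re-applying an update that
    // already went through must be harmless.
}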

We do have something like this in Lily (http://lilyproject.org, check the
'rowlog' thing), though it is somewhat different from the above: to the "WAL"
we only write the ID of the row, since we consider the update to the HBase row
to be the main action and everything that follows just secondary side-effects
(i.e. there's no rollback).
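
In rowlog style the earlier sketch shrinks to something like the following
(again hypothetical interfaces, not the actual Lily rowlog API): only the row
ID goes on the log, and the indexing side-effect re-reads the row and indexes
whatever its current content happens to be.

import java.io.IOException;

// Hypothetical stand-ins; the log carries only row IDs, never payloads.
interface RowIdLog {
    void enqueue(byte[] rowId) throws IOException;    // "this row needs indexing"
    byte[] dequeue() throws IOException;              // next pending row ID, or null
    void ack(byte[] rowId) throws IOException;        // side-effects done for this row
}

interface RowStore {
    void put(byte[] rowId, byte[] payload) throws IOException;
    byte[] get(byte[] rowId) throws IOException;      // current row content
}

interface SearchIndex {
    void index(byte[] rowId, byte[] content) throws IOException;
}

public class RowLogStyleUpdater {
    private final RowIdLog log;
    private final RowStore hbase;
    private final SearchIndex lucene;

    public RowLogStyleUpdater(RowIdLog log, RowStore hbase, SearchIndex lucene) {
        this.log = log;
        this.hbase = hbase;
        this.lucene = lucene;
    }

    // The HBase update is the main action; the log entry only says "row X changed".
    public void update(byte[] rowId, byte[] payload) throws IOException {
        log.enqueue(rowId);         // note the intent first, with just the ID
        hbase.put(rowId, payload);  // the row itself is the source of truth, no rollback
    }

    // Secondary side-effect, run asynchronously: index the row as it is *now*.
    public void processNext() throws IOException {
        byte[] rowId = log.dequeue();
        if (rowId == null) return;
        lucene.index(rowId, hbase.get(rowId));
        log.ack(rowId);
    }
}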

Slightly similar ideas can be found in Google's Percolator paper.


>  Also with RT there
> needs to be a primary data store somewhere outside of Lucene,
> otherwise we'd be storing the same data twice, eg, in HBase and
> Lucene, that's inefficient.  I'm guessing it'll be easier to keep
> Lucene indexes in parallel with HBase regions across servers, and then
> use the Coprocessor architecture etc, to keep them in sync, on the
> same server.  When a region is split, we'd need to also split the
> Lucene index, this'd be the only 'new' technology that'd need to be
> created on the Lucene side.
>

That would definitely be interesting, but I guess that for it to work with
good performance the ordering of the HBase row keys would have to match the
ordering of the Lucene doc IDs (so that posting lists could be split in the
middle rather than being rebuilt entirely), and I don't see how that could be
the case.
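
For what it's worth, the brute-force way to split an index along a region
boundary, without any assumption about doc ID ordering, is to copy the parent
index and delete the documents that fall outside the child's key range. A
rough sketch (assuming a Lucene version that has TermRangeQuery.newStringRange
and an indexed "rowkey" string field holding the HBase row key, both
assumptions on my part); the forceMerge at the end rewrites the postings,
which is exactly the cost that matching doc ID order would avoid:

import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class RegionIndexSplitter {

    // Copies the parent region's index into childPath, then deletes every
    // document whose row key is outside [startKey, endKey) -- the key range
    // of the new child region.
    public static void splitInto(Directory parentIndex, String childPath,
                                 String startKey, String endKey) throws Exception {
        try (Directory childDir = FSDirectory.open(Paths.get(childPath));
             IndexWriter writer = new IndexWriter(childDir,
                     new IndexWriterConfig(new StandardAnalyzer()))) {

            writer.addIndexes(parentIndex);  // copy all segments from the parent

            // Delete rowkey < startKey and rowkey >= endKey.
            writer.deleteDocuments(
                    TermRangeQuery.newStringRange("rowkey", null, startKey, false, false),
                    TermRangeQuery.newStringRange("rowkey", endKey, null, true, false));

            // Expunge the deletes; this rewrites the remaining postings.
            writer.forceMerge(1);
        }
    }
}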

Another issue is that the scalability needs for search may be different. An
HBase region is only ever active on one region server (there are no active
replicas), while for search you often need replicas to scale, since a search
typically hits all partitions.

-- 
Bruno Dumon
Outerthought
http://outerthought.org/
