Re: Does HBase support single-row transaction?

Bryan Duxbury Tue, 27 May 2008 12:58:11 -0700

It seems like if you wanted to do some manner of multi-row transactional put, the only real way to manage it is with deletes. That is, if the first put succeeds but the second fails, you can "invert" the first put into a bunch of deletes.
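
A minimal sketch of that "invert into deletes" idea, assuming a hypothetical in-memory table rather than the real HBase client API (the class, the method names, and the "bad" row-key failure trigger are all invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a table; this is not the HBase client API. */
class CompensatingPut {
    // row key -> (column -> value)
    static final Map<String, Map<String, String>> table = new TreeMap<>();

    /** Put one row. Row keys starting with "bad" simulate a failed put. */
    static void put(String row, Map<String, String> cols) {
        if (row.startsWith("bad")) {
            throw new IllegalStateException("put failed for row " + row);
        }
        table.computeIfAbsent(row, k -> new TreeMap<>()).putAll(cols);
    }

    /** Delete a single cell, dropping the row once it has no cells left. */
    static void delete(String row, String column) {
        Map<String, String> cols = table.get(row);
        if (cols != null) {
            cols.remove(column);
            if (cols.isEmpty()) {
                table.remove(row);
            }
        }
    }

    /**
     * Apply the puts in iteration order. If one fails, delete every cell
     * written by the earlier puts. Returns true only if all puts applied.
     */
    static boolean multiRowPut(Map<String, Map<String, String>> puts) {
        List<String[]> applied = new ArrayList<>(); // {row, column} pairs written so far
        try {
            for (Map.Entry<String, Map<String, String>> e : puts.entrySet()) {
                put(e.getKey(), e.getValue());
                for (String col : e.getValue().keySet()) {
                    applied.add(new String[] {e.getKey(), col});
                }
            }
            return true;
        } catch (IllegalStateException ex) {
            // Compensate: invert the earlier puts into deletes.
            for (String[] cell : applied) {
                delete(cell[0], cell[1]);
            }
            return false;
        }
    }
}
```

Note that deletes only restore the prior state when the cells did not exist before; an overwritten value is lost unless older versions are kept. Other readers can also observe the first put before it is undone, so this gives no isolation.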

Trying to make the regions themselves maintain the transactional state seems like a terrible idea. You'd have to not allow a region to get migrated to another server if it's serving a transaction. This would introduce a lot of potential performance problems, I think.

Can you help me understand why atomic transactions are needed? Can't the atomicity problems be sort of resolved by the whole row versioning thing? Other databases that do transactions and rollbacks use versioning to accomplish that, I think.


-Bryan

On May 27, 2008, at 12:29 PM, Clint Morgan wrote:

Zookeeper makes good sense for distributed locking to get isolation.
But we still need transaction start, commit, and rollback to get
atomicity. I think this properly belongs in hbase.

So suppose I want to read two rows, and then update them as an
isolated, atomic action:

try {
  getZookeeperLock(table);
  tranId = table.beginTransaction();
  row1 = table.get(); // Normal get, but isolated due to distributed lock
  row2 = table.get();
  BatchUpdate b1 = new BatchUpdate(row1);
  b1.put(...);
  table.addUpdate(tranId, b1);
  BatchUpdate b2 = new BatchUpdate(row2);
  b2.put(...);
  table.addUpdate(tranId, b2);
  table.commit(tranId);
} catch(Exception e) {
  table.rollback(tranId);
} finally {
  releaseZookeeperLock(table);
}

So then on the hbase side we hold on to the batchUpdates until the
table.commit is called. Then we roll through and apply the updates.

I'm sure rollback()/commit() is tricky to implement, as the updates
could be on different region servers, so we need a failure on one to
trigger a rollback on others. We could use timestamp/old versions to
implement rollback on batchUpdates we have already applied.
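
A rough sketch of the hold-until-commit idea with rollback from old values, assuming a single in-memory map in place of regions spread over several servers. The class name, the String-valued rows, and the failOnRow failure switch are hypothetical and only serve the illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model; not an HBase API. */
class BufferedTransactions {
    private final Map<String, String> rows = new HashMap<>();          // applied state
    private final Map<Long, List<String[]>> pending = new HashMap<>(); // tranId -> {row, value}
    private long nextId = 1;
    String failOnRow = null; // set to a row key to simulate a failing server

    long beginTransaction() {
        long id = nextId++;
        pending.put(id, new ArrayList<>());
        return id;
    }

    /** Buffer an update; nothing is visible until commit. */
    void addUpdate(long tranId, String row, String value) {
        List<String[]> updates = pending.get(tranId);
        if (updates == null) {
            throw new IllegalArgumentException("unknown transaction " + tranId);
        }
        updates.add(new String[] {row, value});
    }

    /** Discard the buffered updates of a transaction that has not been committed. */
    void rollback(long tranId) {
        pending.remove(tranId);
    }

    /** Apply buffered updates in order; on failure restore the old values. */
    boolean commit(long tranId) {
        List<String[]> updates = pending.remove(tranId);
        if (updates == null) {
            throw new IllegalArgumentException("unknown transaction " + tranId);
        }
        Map<String, String> old = new HashMap<>(); // previous value per touched row (null = absent)
        for (String[] u : updates) {
            if (u[0].equals(failOnRow)) {
                // Undo the updates already applied, using the saved old values.
                for (Map.Entry<String, String> e : old.entrySet()) {
                    if (e.getValue() == null) {
                        rows.remove(e.getKey());
                    } else {
                        rows.put(e.getKey(), e.getValue());
                    }
                }
                return false;
            }
            if (!old.containsKey(u[0])) {
                old.put(u[0], rows.get(u[0]));
            }
            rows.put(u[0], u[1]);
        }
        return true;
    }

    String get(String row) {
        return rows.get(row);
    }
}
```

The sketch sidesteps the hard part: with real region servers, the saved old values and the failure signal would have to be coordinated across machines.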

Alternatively, this may all be implemented above hbase. The client
keeps track of updates, and tries to roll back using timestamps.
Problem here is if the client dies midway through we have half the
transaction committed and lose atomicity/consistency.

We will eventually want/need atomic transactions on hbase, so I'll
look into this further. Any input would be appreciated. Would be
interesting to know how/what google provides...

cheers,
-clint

On Sun, May 11, 2008 at 7:48 AM, Bryan Duxbury <[EMAIL PROTECTED]> wrote:
Currently, it's not on our list of things to do. There are a number of reasons why it would be better to use Zookeeper here than to try and build it into HBase.
That said, I think you could get everything you need if you tried Zookeeper, using that to acquire locks on the row you need a transaction on. It's supposedly very high performance and supports your use case precisely.
-Bryan

On May 10, 2008, at 11:52 PM, Zhou Wei wrote:
Bryan Duxbury wrote:
startUpdate is deprecated in TRUNK. Also, it doesn't do what you are thinking it does. Committing a BatchUpdate is atomic across the whole row, however. There is currently no way to make a get and a commit transactional, though there is an issue open for write-if-not-modified-since support. If this is something you need we can talk about how it might be supported.
Thanks for answering my questions.
So currently HBase is not suitable for transactional web applications.
A simple counting transaction cannot work correctly under concurrent access:
transaction{
get(x);
x++;
write(x);
}
In my opinion, "write-if-not-modified-since" support may not be the best
way to implement single-row transactions.
If the write cannot be performed, the application has to try again and
again, or just return an error and leave the user to choose to retry or abort.
Probably locking, waiting and scheduling at region server might be
preferable in this case.
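
To make the trade-off concrete, here is a minimal sketch of the counting transaction done with a write-if-not-modified-since style check and a retry loop. It assumes a hypothetical single-cell model built on java.util.concurrent; VersionedCounter and its methods are invented for illustration and are not HBase calls:

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical single-cell model; not an HBase API. */
class VersionedCounter {
    /** Immutable (value, version) pair standing in for one cell. */
    static final class Cell {
        final long value;
        final long version;

        Cell(long value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final AtomicReference<Cell> cell = new AtomicReference<>(new Cell(0, 0));

    Cell get() {
        return cell.get();
    }

    /** The write succeeds only if the cell is still the one the caller read. */
    boolean writeIfNotModifiedSince(Cell seen, long newValue) {
        return cell.compareAndSet(seen, new Cell(newValue, seen.version + 1));
    }

    /** get(x); x++; write(x); retried until no concurrent writer interfered. */
    long increment() {
        while (true) {
            Cell seen = get();
            if (writeIfNotModifiedSince(seen, seen.value + 1)) {
                return seen.value + 1;
            }
        }
    }
}
```

The retry loop is exactly the "try again and again" cost described above: under heavy contention on one row, many attempts are wasted, whereas a lock at the region server would make writers wait instead.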
Is the single-row transaction feature currently in the roadmap of HBase?
Zhou
-Bryan

On May 7, 2008, at 7:48 PM, Zhou Wei wrote:
Hi
Does HBase support single-row transaction as described in Bigtable
paper?

"Bigtable supports single-row transactions, which can be
used to perform atomic read-modify-write sequences on
data stored under a single row key." --Bigtable paper

If so, how can I define a transaction in HBase?
Does it look like this:

lid=startUpdate
get(lid)
..
put(lid)
...
commit(lid)

Are these transactions isolated from each other?
If not, is there a way to achieve that?

Thanks

Zhou
