I see what you're saying. I need to think on this. Stack, care to weigh in?

-Bryan

On May 27, 2008, at 1:56 PM, Clint Morgan wrote:

Responses inline:

2008/5/27 Bryan Duxbury <[EMAIL PROTECTED]>:
It seems like if you wanted to do some manner of multi-row transactional put, the only real way to manage it is with deletes. That is, if the first put succeeds but the second fails, you can "invert" the first put into a bunch of deletes.

Yes, this is what I had in mind with the timestamps/multiple
versions. To roll back, you delete everything you wrote, which gets us
back to the previous version. Alternatively, you could save the
original values before they are overwritten.
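
A minimal sketch of the rollback-by-delete idea, assuming a toy in-memory table that keeps every timestamped version of a cell. The names VersionedTable, delete_version, and rollback are made up for illustration and are not HBase APIs.

```python
from collections import defaultdict


class VersionedTable:
    """Toy stand-in for a table that keeps multiple timestamped versions per cell."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = defaultdict(dict)

    def put(self, row, column, value, ts):
        self._cells[(row, column)][ts] = value

    def delete_version(self, row, column, ts):
        # Remove only the version written at this timestamp, if any.
        self._cells[(row, column)].pop(ts, None)

    def get(self, row, column):
        # Return the value with the newest timestamp, or None if the cell is empty.
        versions = self._cells.get((row, column))
        if not versions:
            return None
        return versions[max(versions)]


def rollback(table, written, ts):
    """Invert the puts of one transaction by deleting the versions it wrote."""
    for row, column in written:
        table.delete_version(row, column, ts)


table = VersionedTable()
table.put("acct-a", "balance", 100, ts=1)
table.put("acct-b", "balance", 50, ts=1)

# Transaction at ts=2: the first put succeeds, the second is assumed to fail.
table.put("acct-a", "balance", 70, ts=2)
rollback(table, [("acct-a", "balance")], ts=2)
```

After the rollback, a read of acct-a sees the ts=1 version again. The sketch relies on the older version still being retained when the rollback runs.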

Trying to make the regions themselves maintain the transactional state seems like a terrible idea. You'd have to not allow a region to get migrated to another server if it's serving a transaction. This would introduce a lot of potential performance problems, I think.

I'm envisioning transactions being relatively short-lived: 100 ms to a
few seconds. I don't see this getting in the way of, e.g., region
migration any more than scanners do. But maybe I'm missing something.

So the transactional state for a region is (roughly) a transaction
lease, and a collection of the corresponding BatchUpdates.
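
Roughly, that per-region state could look like the sketch below. This is only an illustration: the BatchUpdate here is a hypothetical stand-in dataclass, not the real HBase class, and TransactionState, RegionTransactionTable, and the lease_seconds parameter are assumed names.

```python
from dataclasses import dataclass, field


@dataclass
class BatchUpdate:
    """Hypothetical stand-in: the pending changes for one row."""
    row: str
    values: dict  # column name -> new value


@dataclass
class TransactionState:
    """What a region would hold per open transaction: a lease and its updates."""
    lease_deadline: float
    updates: list = field(default_factory=list)


class RegionTransactionTable:
    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.by_id = {}  # transaction id -> TransactionState

    def begin(self, txn_id, now):
        # The caller passes the current time so the sketch stays deterministic.
        self.by_id[txn_id] = TransactionState(lease_deadline=now + self.lease_seconds)

    def add_update(self, txn_id, update):
        self.by_id[txn_id].updates.append(update)


txns = RegionTransactionTable(lease_seconds=2.0)
txns.begin(7, now=0.0)
txns.add_update(7, BatchUpdate("acct-a", {"balance": 70}))
txns.add_update(7, BatchUpdate("acct-b", {"balance": 80}))
state = txns.by_id[7]
```

With transactions lasting at most a few seconds, this map stays small, which is the basis for comparing its cost to that of open scanners.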

Can you help me understand why atomic transactions are needed? Can't the atomicity problems be sort of resolved by the whole row versioning thing?

Simply, we need to ensure that all updates happen together. Otherwise,
the data is in an inconsistent state. Take the standard example of
debiting one account and crediting another. If only one of these rows
gets updated, then the resulting table is corrupted and will not make
sense to the application (money has been created or destroyed).

So that is why one needs atomicity: the application-level semantics demand it.
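
To make the debit/credit example concrete, here is a small sketch with a plain dict standing in for the table. The function name and the fail_after_debit flag are hypothetical and exist only to simulate a failure between the two row updates.

```python
def transfer_non_atomic(balances, src, dst, amount, fail_after_debit=False):
    """Debit one row, then credit another, with no atomicity guarantee."""
    balances[src] -= amount
    if fail_after_debit:
        raise RuntimeError("simulated failure between the two row updates")
    balances[dst] += amount


balances = {"acct-a": 100, "acct-b": 50}
total_before = sum(balances.values())

try:
    transfer_non_atomic(balances, "acct-a", "acct-b", 30, fail_after_debit=True)
except RuntimeError:
    pass  # nothing undoes the debit, so the table is left half-updated

total_after = sum(balances.values())
```

The total drops from 150 to 120: the debit landed but the credit did not, so 30 has been destroyed.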

When we encounter an exception midway through the transaction, we can
recover the old state of the modified row(s) by reverting to the
previous version. So the question is who recognizes this and does the
rollback? I'd like hbase to do it because it seems like a logical
place to put the behavior. So if the client crashes halfway through
the transaction, then when its transaction lease expires, hbase will
revert the relevant BatchUpdates. And the integrity of our table is
preserved!
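
A sketch of how that server-side rollback on lease expiry might work, under some loud assumptions: Region is a toy in-memory class, not HBase code. The transaction id doubles as the version timestamp (so ids must increase over time), and the current time is passed in explicitly.

```python
class Region:
    """Toy region: versioned rows plus bookkeeping for open transactions."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.cells = {}  # row -> {version: value}
        self.open = {}   # txn_id -> (lease deadline, rows written so far)

    def begin(self, txn_id, now):
        self.open[txn_id] = (now + self.lease_seconds, [])

    def put(self, txn_id, row, value):
        # Assumption: the transaction id is used as the version timestamp.
        self.cells.setdefault(row, {})[txn_id] = value
        self.open[txn_id][1].append(row)

    def commit(self, txn_id):
        # Committing just drops the bookkeeping; the written versions stay.
        del self.open[txn_id]

    def expire_leases(self, now):
        """Revert every transaction whose lease has run out; return their ids."""
        expired = [t for t, (deadline, _) in self.open.items() if now >= deadline]
        for txn_id in expired:
            _, rows = self.open.pop(txn_id)
            for row in rows:
                self.cells[row].pop(txn_id, None)  # previous version shows again
        return expired

    def get(self, row):
        versions = self.cells.get(row, {})
        return versions[max(versions)] if versions else None


region = Region(lease_seconds=2.0)
region.begin(1, now=0.0)
region.put(1, "acct-a", 100)
region.put(1, "acct-b", 50)
region.commit(1)

region.begin(2, now=10.0)
region.put(2, "acct-a", 70)  # client crashes here, before crediting acct-b
expired = region.expire_leases(now=13.0)
```

The client never has to notice the failure: once the lease deadline passes, the region drops the versions written by transaction 2 and both accounts read as they did before it started.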

Other databases that do transactions and rollbacks use versioning to
accomplish that, I think.

I don't know much about this. But however other (R)DBMSs implement it,
it is provided as a primitive rather than implemented on top of
underlying versioning functionality (by users). This way the database
will maintain the consistency rather than the user having to recognize
problems and revert the state itself.

-clint
