On Nov 5, 2011, at 05:37, lars hofhansl wrote:

Cool stuff Daniel,

Hi Lars,

Thanks for the good points.


Was looking through the code a bit. Seems like you make a best effort to push
as much of the filtering of KVs of uncommitted transactions to HBase and then
do some filtering on the client; not a bad approach. (I hope I didn't
misunderstand the approach; I only looked through the code for half an hour
or so.)

More precisely, the uncommitted KVs are stored in HBase, but it is the
client's job to filter them using the commit information that it has received
from the status oracle. Under the snapshot isolation guarantee, all versions
inserted with a timestamp larger than the transaction's start timestamp must
be ignored, which is done by setting the time range on the client's get
request sent to HBase. Since the uncommitted changes of aborted transactions
are eventually removed from HBase, the client rarely needs to fetch more than
one version to reach a KV that was committed before the transaction started
(the first property of snapshot isolation).
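
To make the read path described above concrete, here is a minimal sketch in
Python. It is an illustration under stated assumptions, not Omid's code or the
HBase API: the function name snapshot_read, the in-memory list of versions,
and the commit_table dictionary (writer start timestamp mapped to commit
timestamp) are all hypothetical stand-ins. It also assumes each version is
stamped with the start timestamp of the transaction that wrote it.

```python
from typing import Dict, List, Optional, Tuple


def snapshot_read(
    versions: List[Tuple[int, str]],
    commit_table: Dict[int, int],
    start_ts: int,
) -> Optional[str]:
    """Return the value visible to a transaction that started at start_ts.

    versions: (write_ts, value) pairs for one cell (hypothetical layout).
    commit_table: writer start timestamp -> commit timestamp, containing
        committed transactions only (hypothetical stand-in for the commit
        information received from the status oracle).
    """
    # Step 1: emulate the time range on the get request by ignoring every
    # version written at or after the reader's start timestamp.
    candidates = [(ts, value) for ts, value in versions if ts < start_ts]

    # Step 2: client-side filtering. Walk from newest to oldest and skip
    # versions whose writer is uncommitted or committed after we started.
    for write_ts, value in sorted(candidates, key=lambda p: p[0], reverse=True):
        commit_ts = commit_table.get(write_ts)
        if commit_ts is not None and commit_ts < start_ts:
            return value
    return None
```

In the common case the newest candidate is already committed, so the loop
returns on its first iteration, which matches the remark that the client
rarely needs more than one version.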



One thing I was wondering: Why bookkeeper? Why not store the WAL itself in
HBase? That way you might not even need a separate server.

Did you see HBaseSI (http://www.cs.uwaterloo.ca/~c15zhang/HBaseSI.pdf)? They
also do MVCC on top of unaltered HBase/schema, although from reading that
paper I get the impression that it would not scale to scans touching many
rows (which is where your client-side filtering comes in).

Thanks for the link. We had seen the other paper by the same authors
(Grid2010), which shares the same bottlenecks as the recent work.
As you correctly pointed out, the question is about performance. The
evaluation section of this paper shows a scalability bottleneck of 400 TPS.
Our approach, however, provides snapshot isolation with negligible overhead
on region servers, and can scale up to tens of thousands of write
transactions per second. If you are interested, a summary of the techniques
that we used to achieve this performance is published in the SOSP'11 poster
section:
http://sigops.org/sosp/sosp11/posters/summaries/sosp11-final12.pdf


-- Lars


----- Original Message -----
From: Daniel Gómez Ferro <[email protected]>
To: "[email protected]" <[email protected]>;
"[email protected]" <[email protected]>
Cc: Maysam Yabandeh <[email protected]>; Flavio Junqueira
<[email protected]>; Benjamin Reed <[email protected]>; Ivan Kelly
<[email protected]>
Sent: Friday, November 4, 2011 4:24 AM
Subject: Omid: Transactional Support for HBase

(I apologize for resending but I forgot to add the user list.)

Hi all,

It is my pleasure to announce the open source release of Omid, a project whose 
goal is to add lock-free transactional support on top of HBase. The current 
release includes CrSO, a client-replicated status oracle that detects
write-write conflicts to provide Snapshot Isolation. CrSO has the following
appealing properties:

1) It requires no modification to the HBase code or the table schema.
2) The overhead on HBase DataNodes is negligible (incurred only after an
abort).
3) It scales up to 50,000 write transactions per second (TPS) and a thousand
client connections.
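
As a rough illustration of what detecting write-write conflicts at a central
oracle can look like, here is a minimal Python sketch. It shows one common
way to enforce snapshot isolation (abort a transaction if any row it wrote
was committed by another transaction after it started) and is not taken from
the Omid source. The class StatusOracle, its methods, and the in-memory
dictionaries are hypothetical names chosen for this example.

```python
from typing import Dict, Iterable, Optional


class StatusOracle:
    """Toy, single-process oracle (hypothetical; not Omid's implementation)."""

    def __init__(self) -> None:
        self._clock = 0                         # shared timestamp counter
        self._last_commit: Dict[str, int] = {}  # row -> last commit timestamp
        self.commit_table: Dict[int, int] = {}  # start ts -> commit ts

    def begin(self) -> int:
        """Hand out a start timestamp for a new transaction."""
        self._clock += 1
        return self._clock

    def try_commit(self, start_ts: int, write_set: Iterable[str]) -> Optional[int]:
        """Return a commit timestamp, or None if the transaction must abort."""
        rows = list(write_set)
        # Write-write conflict: some row was committed after this
        # transaction took its snapshot.
        for row in rows:
            if self._last_commit.get(row, 0) > start_ts:
                return None
        self._clock += 1
        commit_ts = self._clock
        for row in rows:
            self._last_commit[row] = commit_ts
        self.commit_table[start_ts] = commit_ts
        return commit_ts
```

No locks are taken on the data itself in this sketch: writers proceed
optimistically and the conflict check happens only at commit time.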

We have set up a GitHub project: https://github.com/dgomezferro/omid

More information is available at the wiki: 
https://github.com/dgomezferro/omid/wiki

If you are interested, installation and running instructions are available in
the README: https://github.com/dgomezferro/omid/blob/master/README.md

Please do not hesitate to contact us if you have any questions.

Best Regards,
Daniel Gómez Ferro

