Another question: I assume this will not work out of the box with deletes? Deletes always cover all key values in the past (from their timestamp backwards), so once a delete marker is placed there is no way to get back any of the puts it affects.
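To make the delete-marker concern concrete, here is a minimal toy model in Python. It is not the HBase API: `ToyColumn`, its methods, and the `keep_deleted_cells` flag are hypothetical names for illustration. The flag only approximates the idea of retaining deleted cells per column family, under the assumption that a time-range read ending before a marker's timestamp can then ignore that marker.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumn:
    """Toy version history of one cell (illustrative only, not HBase code)."""
    keep_deleted_cells: bool = False                      # hypothetical switch
    puts: dict = field(default_factory=dict)              # timestamp -> value
    delete_markers: list = field(default_factory=list)    # marker timestamps

    def put(self, ts, value):
        self.puts[ts] = value

    def delete(self, ts):
        # A delete marker covers every put with timestamp <= ts.
        self.delete_markers.append(ts)

    def get(self, max_ts):
        """Return the newest visible value with timestamp < max_ts, or None."""
        # By default every marker applies to the read. When deleted cells are
        # kept, only markers inside the time range [0, max_ts) apply.
        markers = [m for m in self.delete_markers
                   if not self.keep_deleted_cells or m < max_ts]
        cutoff = max(markers, default=-1)  # assumes non-negative timestamps
        visible = [ts for ts in self.puts if cutoff < ts < max_ts]
        return self.puts[max(visible)] if visible else None


plain = ToyColumn()
plain.put(10, "a")
plain.delete(20)

kept = ToyColumn(keep_deleted_cells=True)
kept.put(10, "a")
kept.delete(20)
```

In this sketch `plain.get(15)` finds nothing, because the marker at 20 hides the put at 10 even though the read ends before the delete. `kept.get(15)` still returns the put, while `kept.get(25)` sees the delete.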
HBase trunk has HBASE-4536 to allow time-range scans to work with deleted rows (but it needs to be enabled per column family; I still think it should be the default, but anyway).

-- Lars

________________________________
From: Flavio Junqueira <[email protected]>
To: Daniel Gómez Ferro <[email protected]>
Cc: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>; "[email protected]" <[email protected]>; Maysam Yabandeh <[email protected]>; Benjamin Reed <[email protected]>; Ivan Kelly <[email protected]>
Sent: Sunday, November 6, 2011 7:14 AM
Subject: Re: Omid: Transactional Support for HBase

A quick note on Omid for those following on GitHub: the repository we will be working with is the fork under the Yahoo! account: https://github.com/yahoo/omid/

-Flavio

On Nov 5, 2011, at 9:36 PM, Daniel Gómez Ferro wrote:

>On Nov 5, 2011, at 05:37, lars hofhansl wrote:
>
>>Cool stuff Daniel,
>
>Hi Lars,
>
>Thanks for the good points.
>
>>Was looking through the code a bit. Seems like you make a best effort to push
>>as much of the filtering of KVs of uncommitted transactions to HBase and then
>>do some filtering on the client; not a bad approach. (I hope I didn't
>>misunderstand the approach, only looked through the code for 1/2 hour or so).
>
>Putting it more accurately, the uncommitted KVs are stored in HBase, but it is
>the client's job to filter them using the commit information that it has
>received from the status oracle. According to the snapshot isolation guarantee,
>all versions inserted with a timestamp larger than the transaction's start
>timestamp must be ignored, which is done by setting the time range on the
>client's get request sent to HBase. Since the uncommitted changes of aborted
>transactions are eventually removed from HBase, the client rarely needs to
>fetch more than one version to reach a KV that was committed before the
>transaction started (the first property of snapshot isolation).
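The client-side filtering described above can be sketched as follows. This is a toy illustration in Python, not Omid's code: `snapshot_read`, `versions`, and `commit_ts_of` are hypothetical names, and the sketch assumes each version is stored under its writer's start timestamp, with the commit information from the status oracle available as a plain mapping.

```python
def snapshot_read(versions, commit_ts_of, start_ts):
    """Return the value a transaction starting at start_ts should see.

    versions:     writer start timestamp -> value (all stored versions)
    commit_ts_of: writer start timestamp -> commit timestamp, present only
                  for committed writers (assumed to come from the oracle)
    """
    # Step 1: the time range on the get request drops every version written
    # with a timestamp at or after the reader's start timestamp.
    in_range = sorted((ts for ts in versions if ts < start_ts), reverse=True)

    # Step 2: the client walks the remaining versions newest first and skips
    # those whose writer had not committed before the reader started.
    for ts in in_range:
        commit_ts = commit_ts_of.get(ts)
        if commit_ts is not None and commit_ts < start_ts:
            return versions[ts]
    return None


versions = {5: "old", 8: "pending", 12: "future"}
commits = {5: 6, 12: 13}  # the writer that started at 8 has not committed
value = snapshot_read(versions, commits, start_ts=10)
```

Here the reader skips the version at 12 through the time range and the version at 8 through the commit check, then returns the version written at 5. When few uncommitted versions linger, the loop usually stops at the first candidate.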
>
>>One thing I was wondering: Why bookkeeper? Why not store the WAL itself in
>>HBase? That way you might not even need a separate server.
>>
>>Did you see: HBaseSI (http://www.cs.uwaterloo.ca/~c15zhang/HBaseSI.pdf), they
>>also do MVCC on top of unaltered HBase/schema, although from reading that
>>paper I get the impression that it would not scale to scans touching many rows
>>(which is where your client side filtering comes in).
>
>Thanks for the link. We had seen the other paper by the same authors
>(Grid2010), which shares the same bottlenecks as the recent work.
>As you correctly pointed out, the question is about performance. You can see
>the scalability bottleneck of 400 TPS in the evaluation section of this paper.
>Our approach, however, provides snapshot isolation with negligible overhead
>on region servers, and could scale up to tens of thousands of write
>transactions per second. If you are interested, a summary of the techniques
>that we used to achieve this performance is published at SOSP'11, poster section.
>http://sigops.org/sosp/sosp11/posters/summaries/sosp11-final12.pdf
>
>>-- Lars
>>
>>----- Original Message -----
>>From: Daniel Gómez Ferro <[email protected]>
>>To: "[email protected]" <[email protected]>; "[email protected]" <[email protected]>
>>Cc: Maysam Yabandeh <[email protected]>; Flavio Junqueira <[email protected]>; Benjamin Reed <[email protected]>; Ivan Kelly <[email protected]>
>>Sent: Friday, November 4, 2011 4:24 AM
>>Subject: Omid: Transactional Support for HBase
>>
>>(I apologize for resending but I forgot to add the user list.)
>>
>>Hi all,
>>
>>It is my pleasure to announce the open source release of Omid, a project
>>whose goal is to add lock-free transactional support on top of HBase. The
>>current release includes CrSO, a client-replicated status oracle that detects
>>write-write conflicts to provide Snapshot Isolation.
>>CrSO has the following appealing properties:
>>
>>1) It does not need any modification to the HBase code or the table schema.
>>2) The overhead on HBase DataNodes is negligible (only after an abort).
>>3) It scales up to 50,000 write transactions per second (TPS) and a thousand
>>client connections.
>>
>>We have set up a GitHub project: https://github.com/dgomezferro/omid
>>
>>More information is available at the wiki:
>>https://github.com/dgomezferro/omid/wiki
>>
>>If you are interested, installation and running instructions are available in
>>the README: https://github.com/dgomezferro/omid/blob/master/README.md
>>
>>Please do not hesitate to contact us if you have any questions.
>>
>>Best Regards,
>>Daniel Gómez Ferro

flavio junqueira
research scientist
[email protected]
direct +34 93-183-8828
avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300 fax (408) 349 3301
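For readers new to the idea, the write-write conflict check that a status oracle performs under snapshot isolation can be sketched with a toy model. This is the textbook first-committer-wins rule, not Omid's implementation; `ToyStatusOracle` and its methods are hypothetical names, and a single in-memory counter stands in for the timestamp source.

```python
class ToyStatusOracle:
    """Toy first-committer-wins conflict detector (illustrative only)."""

    def __init__(self):
        self._clock = 0          # single counter for start and commit timestamps
        self._last_commit = {}   # row -> commit timestamp of its latest writer

    def _next_ts(self):
        self._clock += 1
        return self._clock

    def begin(self):
        """Hand out a start timestamp for a new transaction."""
        return self._next_ts()

    def try_commit(self, start_ts, write_set):
        """Return a commit timestamp, or None if the transaction must abort."""
        # Conflict: some row in the write set was committed by another
        # transaction after this one took its snapshot.
        for row in write_set:
            if self._last_commit.get(row, 0) > start_ts:
                return None
        commit_ts = self._next_ts()
        for row in write_set:
            self._last_commit[row] = commit_ts
        return commit_ts


oracle = ToyStatusOracle()
t1 = oracle.begin()
t2 = oracle.begin()
first = oracle.try_commit(t1, {"row-a"})    # commits
second = oracle.try_commit(t2, {"row-a"})   # overlaps with t1 on row-a
```

In this sketch `first` receives a commit timestamp and `second` is `None`, because the second transaction started before the first one committed and both wrote `row-a`. No locks are taken at any point; the losing transaction simply aborts.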
