Re: Omid: Transactional Support for HBase

Daniel Gómez Ferro Tue, 08 Nov 2011 02:38:35 -0800

On Nov 8, 2011, at 10:48 , Daniel Gómez Ferro wrote:

> Hi Jignesh
> 
> On Nov 7, 2011, at 21:44 , Jignesh Patel wrote:
> 
>> Looks like this transaction is limited to one row. Is that correct?
>> 
> 
> No, it's not. Transactions can span multiple rows.
> 
>> Another thing: I don't have ZooKeeper installed, as I am running in
>> pseudo-distributed mode. The document doesn't say anything about
>> integrating in pseudo-distributed mode.
>> 
> 
> Currently Omid requires both ZooKeeper and BookKeeper to operate, but we 
> provide some scripts to launch them locally if you just want to try it. I've 
> just pushed a change so you don't need to install anything manually: just 
> download or check out Omid, run 'mvn package', and follow the instructions to 
> run the benchmark locally.


Please remember that the repository we are using now is 
https://github.com/yahoo/omid/ 

> 
> If people still find it cumbersome or difficult to run ZK/BK, we could 
> provide an option to disable the replication to the WAL.
> 
> Daniel
> 
>> -Jignesh
>> 
>> 2011/11/7 Daniel Gómez Ferro <[email protected]>:
>>> 
>>> On Nov 6, 2011, at 21:53 , lars hofhansl wrote:
>>> 
>>>> Another question: I assume this will not work out of the box with deletes?
>>> 
>>> Hi,
>>> 
>>> Our current approach does support deletes (i.e., user-requested deletes). 
>>> Right now we use empty values as delete marks: when the user calls 
>>> TransactionalTable.delete() we insert empty values at the specified 
>>> timestamp. At filtering time, we keep track of these delete marks and 
>>> discard the ones that are uncommitted or fall outside our time range of 
>>> interest. When a transaction aborts, the cleanup procedure deletes only the 
>>> specific values inserted by that transaction (as opposed to all versions). 
>>> This way we don't insert delete tombstones that mask previous values.
>>> 
>>> The drawbacks of this approach are that (i) we give a special meaning to 
>>> empty values, and (ii) to delete a whole column family (as opposed to a 
>>> single column) we have to perform a get beforehand to obtain the column 
>>> qualifiers.
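
To make the delete-mark idea concrete, here is a minimal sketch in plain Python. It is not Omid code: Cell, read, cleanup_aborted and the commit_ts table are hypothetical in-memory stand-ins, and the only thing taken from the description above is that an empty value acts as the delete mark and that abort cleanup removes a single version.

```python
from typing import Dict, Optional

DELETE_MARK = b""  # assumption from the text: an empty value means "deleted"


class Cell:
    """Hypothetical in-memory stand-in for one HBase cell with many versions."""

    def __init__(self) -> None:
        self.versions: Dict[int, bytes] = {}  # write timestamp -> value

    def put(self, ts: int, value: bytes) -> None:
        self.versions[ts] = value

    def delete(self, ts: int) -> None:
        # A user delete is stored as one more version, not as a tombstone.
        self.versions[ts] = DELETE_MARK

    def cleanup_aborted(self, ts: int) -> None:
        # Abort cleanup removes only the version written at that timestamp,
        # so older versions stay readable.
        self.versions.pop(ts, None)


def read(cell: Cell, commit_ts: Dict[int, int], start_ts: int) -> Optional[bytes]:
    """Newest committed value visible at start_ts, or None if absent/deleted."""
    for ts in sorted(cell.versions, reverse=True):
        committed_at = commit_ts.get(ts)
        if ts > start_ts or committed_at is None or committed_at >= start_ts:
            continue  # outside the snapshot, or writer not committed: skip
        value = cell.versions[ts]
        return None if value == DELETE_MARK else value
    return None
```

In this sketch an uncommitted delete mark is simply skipped by read, and a reader whose snapshot predates a committed delete still sees the older value, which is the behaviour a real tombstone would prevent.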
>>> 
>>>> 
>>>> Deletes always cover all key values in the past (from their timestamps on 
>>>> backwards), so once a delete marker is placed there is no way to get back 
>>>> any of the puts it affects.
>>>> 
>>>> HBase trunk has HBASE-4536 to allow time-range scans to work with deleted 
>>>> rows (but needs to be enabled for a column family - I still think it 
>>>> should be the default, but anyway).
>>>> 
>>> 
>>> I think this feature would be very useful, and it enables a cleaner 
>>> implementation. It would be great if the flag were enabled by default, 
>>> since we want users to change their setup as little as possible, but it's 
>>> not a big deal.
>>> 
>>>> -- Lars
>>>> 
>>>> ________________________________
>>>> From: Flavio Junqueira <[email protected]>
>>>> To: Daniel Gómez Ferro <[email protected]>
>>>> Cc: "[email protected]" <[email protected]>; lars hofhansl 
>>>> <[email protected]>; "[email protected]" <[email protected]>; 
>>>> Maysam Yabandeh <[email protected]>; Benjamin Reed 
>>>> <[email protected]>; Ivan Kelly <[email protected]>
>>>> Sent: Sunday, November 6, 2011 7:14 AM
>>>> Subject: Re: Omid: Transactional Support for HBase
>>>> 
>>>> 
>>>> A quick note on Omid for the ones following on github: the repository we 
>>>> will be working with is the fork under the Yahoo! account:
>>>> 
>>>> 
>>>> https://github.com/yahoo/omid/
>>>> 
>>>> -Flavio
>>>> 
>>>> 
>>>> On Nov 5, 2011, at 9:36 PM, Daniel Gómez Ferro wrote:
>>>> 
>>>> 
>>>>> 
>>>>> On Nov 5, 2011, at 05:37 , lars hofhansl wrote:
>>>>> 
>>>>>> Cool stuff Daniel,
>>>>>> 
>>>>> 
>>>>> Hi Lars,
>>>>> 
>>>>> Thanks for the good points.
>>>>> 
>>>>> 
>>>>> 
>>>>>> Was looking through the code a bit. Seems like you make a best effort to 
>>>>>> push as much of the filtering of KVs of uncommitted transactions as 
>>>>>> possible to HBase and then do some filtering on the client. Not a bad 
>>>>>> approach. (I hope I didn't misunderstand the approach; I only looked 
>>>>>> through the code for half an hour or so.)
>>>>>> 
>>>>> 
>>>>> To put it more accurately, the uncommitted KVs are stored in HBase, but 
>>>>> it is the client's job to filter them using the commit information that 
>>>>> it has received from the status oracle. According to the snapshot 
>>>>> isolation guarantee, all versions inserted with a timestamp larger than 
>>>>> the transaction's start timestamp must be ignored, which is done by 
>>>>> setting the time range on the client's get request sent to HBase. Since 
>>>>> the uncommitted changes of aborted transactions are eventually removed 
>>>>> from HBase, the client rarely needs to fetch more than one version to 
>>>>> reach a KV that was committed before the transaction started (the first 
>>>>> property of snapshot isolation).
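
As a rough illustration of the read path just described, the following Python sketch filters versions in two steps. It is only a sketch under stated assumptions: snapshot_read, the versions list and the commit_ts table are hypothetical names, each version is assumed to be written at its writer's start timestamp, and the reader's own uncommitted writes are ignored for simplicity.

```python
from typing import Dict, List, Optional, Tuple

# (write timestamp, value); the write timestamp is assumed to be the
# start timestamp of the transaction that wrote the version.
Version = Tuple[int, bytes]


def snapshot_read(versions: List[Version],
                  commit_ts: Dict[int, int],
                  start_ts: int) -> Optional[bytes]:
    """Return the newest value visible to a transaction started at start_ts."""
    # Step 1: what the time range on the get request achieves, i.e. drop
    # every version written with a timestamp larger than start_ts.
    candidates = [v for v in versions if v[0] <= start_ts]

    # Step 2: client-side filtering with the commit information from the
    # status oracle. Walk from newest to oldest and return the first version
    # whose writer committed before this transaction started.
    for write_ts, value in sorted(candidates, key=lambda v: v[0], reverse=True):
        committed_at = commit_ts.get(write_ts)
        if committed_at is not None and committed_at < start_ts:
            return value
    return None
```

With versions written at timestamps 10, 20 and 30, where only the first two writers have committed, a reader skips the uncommitted version at 30 and usually stops at the very next one, which matches the remark that more than one extra fetch is rarely needed.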
>>>>> 
>>>>> 
>>>>>> 
>>>>>> One thing I was wondering: Why BookKeeper? Why not store the WAL itself 
>>>>>> in HBase? That way you might not even need a separate server.
>>>>>> 
>>>>>> Did you see HBaseSI (http://www.cs.uwaterloo.ca/~c15zhang/HBaseSI.pdf)? 
>>>>>> They also do MVCC on top of unaltered HBase/schema, although from 
>>>>>> reading that paper I get the impression that it would not scale to scans 
>>>>>> touching many rows (which is where your client-side filtering comes in).
>>>>>> 
>>>>> 
>>>>> 
>>>>> Thanks for the link. We had seen the other paper by the same authors 
>>>>> (Grid2010), which shares the same bottlenecks as the recent work.
>>>>> As you correctly pointed out, the question is about performance. You 
>>>>> can see the scalability bottleneck of 400 TPS in the evaluation section 
>>>>> of this paper. Our approach, however, provides snapshot isolation with 
>>>>> negligible overhead on region servers, and could scale up to tens of 
>>>>> thousands of write transactions per second. If you are interested, a 
>>>>> summary of the techniques that we used to achieve this performance is 
>>>>> published in the poster section of SOSP'11:
>>>>> http://sigops.org/sosp/sosp11/posters/summaries/sosp11-final12.pdf
>>>>> 
>>>>> 
>>>>>> -- Lars
>>>>>> 
>>>>>> 
>>>>>> ----- Original Message -----
>>>>>> From: Daniel Gómez Ferro <[email protected]>
>>>>>> To: "[email protected]" <[email protected]>; 
>>>>>> "[email protected]" <[email protected]>
>>>>>> Cc: Maysam Yabandeh <[email protected]>; Flavio Junqueira 
>>>>>> <[email protected]>; Benjamin Reed <[email protected]>; Ivan Kelly 
>>>>>> <[email protected]>
>>>>>> Sent: Friday, November 4, 2011 4:24 AM
>>>>>> Subject: Omid: Transactional Support for HBase
>>>>>> 
>>>>>> (I apologize for resending but I forgot to add the user list.)
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> It is my pleasure to announce the open source release of Omid, a project 
>>>>>> whose goal is to add lock-free transactional support on top of HBase. 
>>>>>> The current release includes CrSO, a client-replicated status oracle 
>>>>>> that detects the write-write conflicts to provide Snapshot Isolation. 
>>>>>> CrSO has the following appealing properties:
>>>>>> 
>>>>>> 1) It requires no modification to the HBase code or to the table 
>>>>>> schema.
>>>>>> 2) The overhead on HBase DataNodes is negligible (only after an abort).
>>>>>> 3) It scales up to 50,000 write transactions per second (TPS) and a 
>>>>>> thousand client connections.
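
For readers unfamiliar with write-write conflict detection under Snapshot Isolation, here is a minimal Python sketch of the general technique. It is not the CrSO implementation: the StatusOracle class, its begin and try_commit methods, and the single in-memory counter are all assumptions made for illustration.

```python
import itertools
from typing import Dict, Iterable, Optional


class StatusOracle:
    """Hypothetical, single-process stand-in for a status oracle."""

    def __init__(self) -> None:
        self._clock = itertools.count(1)          # source of unique timestamps
        self._last_commit: Dict[str, int] = {}    # row key -> last commit ts

    def begin(self) -> int:
        """Hand out a start timestamp for a new transaction."""
        return next(self._clock)

    def try_commit(self, start_ts: int, write_set: Iterable[str]) -> Optional[int]:
        """Return a commit timestamp, or None if the transaction must abort."""
        rows = list(write_set)
        # Write-write conflict: some row we wrote was committed by another
        # transaction after our snapshot was taken.
        if any(self._last_commit.get(row, 0) > start_ts for row in rows):
            return None
        commit_ts = next(self._clock)
        for row in rows:
            self._last_commit[row] = commit_ts
        return commit_ts
```

Two concurrent transactions that write disjoint rows both commit in this sketch, while the second of two concurrent writers of the same row gets None back and has to abort, with no locks taken at any point.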
>>>>>> 
>>>>>> We have set up a GitHub project: https://github.com/dgomezferro/omid
>>>>>> 
>>>>>> More information is available at the wiki: 
>>>>>> https://github.com/dgomezferro/omid/wiki
>>>>>> 
>>>>>> If you are interested, installation and running instructions are 
>>>>>> available on the README: 
>>>>>> https://github.com/dgomezferro/omid/blob/master/README.md
>>>>>> 
>>>>>> Please do not hesitate to contact us if you have any questions.
>>>>>> 
>>>>>> Best Regards,
>>>>>> Daniel Gómez Ferro
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> flavio
>>>> junqueira
>>>> 
>>>> research scientist
>>>> 
>>>> [email protected]
>>>> direct +34 93-183-8828
>>>> 
>>>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>>>> phone (408) 349 3300    fax (408) 349 3301
>>> 
>>> 
>
