Re: HBase reads, isolation levels and RegionScanner internal locking

lars hofhansl Sat, 13 Sep 2014 22:11:38 -0700

Thanks Michael, for your "RDBMS school[ing]". (Did I mention I used to work at 
various RDBMS companies before I came to HBase?)


Vladimir, to answer your question:
- HBase *always* locks a row for writes. Other writes to the same row will 
queue behind this lock.
- READ_[UN]COMMITTED here only refers to whether one can see the results of 
prior in-flight MVCC transactions. It does not affect the need for the per-row 
write lock.
- The MVCC transactions in HBase are strictly serialized (which allows for a 
really simple and elegant implementation that is valid as long as each 
individual transaction is short).
- READ_UNCOMMITTED will allow a client to see a partially updated row. It has 
no performance benefit as such; you can just see the results of other 
transactions earlier.
- HBase also has various region-level internal (JVM-level) read and write locks 
that never outlive an RPC request, such as HRegion.lock and HRegion.updatesLock.

I assume you are referring to the latter, region-level locking...?

All updates (put, append, increment, delete, etc.) take a *read* lock on the 
updatesLock to guard against concurrent flushes (which take out a write lock). 
You want this one.
Whenever a region operation is started we take out a read lock on HRegion.lock 
to guard against concurrent bulk file operations on that region. This might be 
a lock we can remove with some refactoring.
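
The updatesLock pattern described above can be sketched with a plain JDK 
ReentrantReadWriteLock. This is a toy model under stated assumptions, not the 
HRegion code: the class name UpdatesLockSketch, the string "edits", and the 
in-memory queue standing in for the memstore are all hypothetical. It only 
shows why updates take the shared (read) side while a flush takes the 
exclusive (write) side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model of the updatesLock pattern; names and structure are hypothetical. */
class UpdatesLockSketch {
    private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();

    // Stand-in for the memstore. The reference is only swapped under the write lock.
    private Queue<String> memstore = new ConcurrentLinkedQueue<>();

    /** Update path: the shared lock lets many updates proceed at once but excludes a flush. */
    void put(String edit) {
        updatesLock.readLock().lock();
        try {
            memstore.add(edit);
        } finally {
            updatesLock.readLock().unlock();
        }
    }

    /** Flush path: the exclusive lock is held only while the buffer is swapped out. */
    List<String> flush() {
        Queue<String> snapshot;
        updatesLock.writeLock().lock();
        try {
            snapshot = memstore;
            memstore = new ConcurrentLinkedQueue<>();
        } finally {
            updatesLock.writeLock().unlock();
        }
        // No update can still be writing to the snapshot here, so copying it is safe.
        return new ArrayList<>(snapshot);
    }
}
```

Because updates share the read lock, they do not block each other here; only 
the short swap inside flush() makes them wait.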

HBase never locks a row for read. (It does take out some internal locks for the 
duration of an RPC for internal management, but a row itself is never locked 
for read. And certainly not across RPC requests.)
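
The visibility rule from the bullets above (READ_COMMITTED readers see only 
completed MVCC transactions, READ_UNCOMMITTED readers also see in-flight ones) 
can be modeled in a few lines. This is a single-threaded toy, not HBase code: 
the class MvccSketch, its methods, and the boolean flag are hypothetical, and 
it assumes writes complete in the order they began, matching the "strictly 
serialized" point. It illustrates why a reader needs a read point rather than 
a row lock.

```java
import java.util.ArrayList;
import java.util.List;

/** Single-threaded toy model of read-point visibility; not HBase code. */
class MvccSketch {
    /** One cell version, tagged with the number of the write that produced it. */
    private static final class Cell {
        final String column;
        final String value;
        final long seq;

        Cell(String column, String value, long seq) {
            this.column = column;
            this.value = value;
            this.seq = seq;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private long nextSeq = 1;    // next write number to hand out
    private long readPoint = 0;  // highest write number that has completed

    /** Begin a write transaction by reserving a write number. */
    long begin() {
        return nextSeq++;
    }

    /** Add one cell as part of an in-flight write. */
    void add(long seq, String column, String value) {
        cells.add(new Cell(column, value, seq));
    }

    /** Complete a write. Assumes writes complete in the order they began. */
    void complete(long seq) {
        readPoint = Math.max(readPoint, seq);
    }

    /** Newest visible value of a column, or null if none is visible. */
    String read(String column, boolean readUncommitted) {
        long limit = readUncommitted ? Long.MAX_VALUE : readPoint;
        String result = null;
        long newest = -1;
        for (Cell c : cells) {
            if (c.column.equals(column) && c.seq <= limit && c.seq > newest) {
                newest = c.seq;
                result = c.value;
            }
        }
        return result;
    }
}
```

If a second write has updated column "a" but not yet column "b", a 
read-uncommitted read in this model returns the new "a" next to the old "b", 
which is the partially updated row mentioned above. A read-committed read keeps 
returning the old values for both until complete() is called.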

Does that make sense?

-- Lars


----- Original Message -----
From: Michael Segel <[email protected]>
To: [email protected]
Cc: 
Sent: Friday, September 12, 2014 10:17 AM
Subject: Re: HBase reads, isolation levels and RegionScanner internal locking

Vlad, 

I understand. 
However, several of the HBase committers aren’t really schooled in RDBMS design.

And again, the older (going back to 0.23) use of the term RLL isn’t relational 
RLL, and when you start to talk about isolation you’re getting into the RDBMS 
RLL.

So you really need to define what you mean when you say RLL. I don’t want to 
assume one thing when you meant another. 

Just like talking about salts.  ;-) 





On Sep 12, 2014, at 5:53 PM, Vladimir Rodionov <[email protected]> wrote:

> Michael, this is HBase developers mailing list.
> 
> -Vladimir
> 
> 
> 
> 
> On Fri, Sep 12, 2014 at 12:08 AM, Michael Segel <[email protected]>
> wrote:
> 
>> Silly question…
>> 
>> HBase uses the term RLL (row level locking) to make the writes to a row
>> atomic.
>> 
>> When you start to get into isolation, RLL takes on a different meaning.
>> 
>> So now you have to better define what you mean by locking. Are you
>> talking about HBase RLL,
>> or are you talking about Transactional RLL (RDBMS RLL)?
>> 
>> 
>> On Sep 11, 2014, at 11:58 PM, Vladimir Rodionov <[email protected]>
>> wrote:
>> 
>>> Hi, all
>>> 
>>> We have two isolation levels in Query now (they used to be in Scan). See:
>>> https://issues.apache.org/jira/browse/HBASE-11936
>>> 
>>> I moved the isolation levels API from Scan upward to the Query class. The
>>> reason: this API was not available for Get operations. The rationale? To
>>> improve performance of gets and multi-gets over the same region.
>>> 
>>> As many of you are aware, RegionScannerImpl is heavily synchronized on the
>>> region's internal lock. Now some questions:
>>> 
>>> 1. Is it safe to bypass this locking (in next() call) in READ_UNCOMMITTED
>>> mode?
>>> We will do all necessary checks, of course, before calling nextRaw().
>>> 2. What was the reason for this locking in the first place for reads in
>>> READ_COMMITTED mode, other than the obvious one (no dirty reads allowed)?
>>> Can someone tell me what else bad can happen?
>>> 
>>> -Vladimir
>> 
>>
