Re: HBase reads, isolation levels and RegionScanner internal locking

Vladimir Rodionov Fri, 12 Sep 2014 13:22:34 -0700

All row mutation operations in HBase are atomic: puts, deletes, increments
and appends. Atomicity in HBase has exactly the same meaning as in an RDBMS:
an operation either completes as a whole or does not happen at all. HBase
gives an additional guarantee that all reads are ROW-atomic as well: one
will never read a partial result of an atomic mutation (on a ROW).
READ_UNCOMMITTED will weaken this read atomicity guarantee, making it
possible to read partial results of a row mutation, so we discard row read
atomicity but still have cell atomicity (hopefully?).

This is my understanding of what READ_UNCOMMITTED means in HBase.
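
To make the distinction concrete, here is a minimal toy model in plain Java.
It is only a sketch of the idea, not HBase code: the classes ReadPointSketch,
Row and Cell, the sequence numbers, and the "read point" filter are all
hypothetical names I am assuming for illustration. Each cell carries the
sequence number of the mutation that wrote it; a READ_COMMITTED read only
sees cells from completed mutations, while a READ_UNCOMMITTED read sees every
cell written so far.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these classes are hypothetical and are NOT HBase code.
public class ReadPointSketch {

    enum Isolation { READ_COMMITTED, READ_UNCOMMITTED }

    /** One cell version, tagged with the sequence number of the mutation that wrote it. */
    static final class Cell {
        final String column;
        final String value;
        final long writeSeq;

        Cell(String column, String value, long writeSeq) {
            this.column = column;
            this.value = value;
            this.writeSeq = writeSeq;
        }
    }

    /** A single row: cells are appended in write order, newer cells shadow older ones. */
    static final class Row {
        private final List<Cell> cells = new ArrayList<>();
        private long committedSeq = 0; // highest mutation that has fully completed

        void write(String column, String value, long seq) {
            cells.add(new Cell(column, value, seq));
        }

        void commit(long seq) {
            committedSeq = Math.max(committedSeq, seq);
        }

        /** Returns the column -> value view of the row at the given isolation level. */
        Map<String, String> read(Isolation level) {
            // READ_COMMITTED only sees completed mutations; READ_UNCOMMITTED sees every cell.
            long readPoint = (level == Isolation.READ_UNCOMMITTED) ? Long.MAX_VALUE : committedSeq;
            Map<String, String> view = new TreeMap<>();
            for (Cell c : cells) {
                if (c.writeSeq <= readPoint) {
                    view.put(c.column, c.value);
                }
            }
            return view;
        }
    }

    /** Mutation 1 (a=1, b=1) is committed; mutation 2 (a=2, b=2) has written only column a. */
    static Row halfAppliedRow() {
        Row row = new Row();
        row.write("a", "1", 1);
        row.write("b", "1", 1);
        row.commit(1);
        row.write("a", "2", 2); // mutation 2 is in flight: b not yet written, no commit
        return row;
    }

    public static void main(String[] args) {
        Row row = halfAppliedRow();
        System.out.println("READ_COMMITTED:   " + row.read(Isolation.READ_COMMITTED));
        System.out.println("READ_UNCOMMITTED: " + row.read(Isolation.READ_UNCOMMITTED));
    }
}
```

In this toy, the READ_COMMITTED view is a=1, b=1 (the row as of the last
completed mutation), while the READ_UNCOMMITTED view is a=2, b=1: a mix of
two mutations, but every individual cell is still a whole value.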

PS

I created:
https://issues.apache.org/jira/browse/HBASE-11965

-Vladimir


On Fri, Sep 12, 2014 at 12:28 PM, Stack <[email protected]> wrote:

> On Fri, Sep 12, 2014 at 11:25 AM, Vladimir Rodionov <
> [email protected]>
> wrote:
>
> > Michael,
> >
> > This is not row-level locking - it is a region-wide lock. This is a major
> > reason for the following performance problems:
> >
> >
> Pardon my misreading as row-scoping (I'd just come off reading Michael
> Segel's note).
>
>
>
> > 1) Multi gets are bad if inside the same region
> > 2) Multiple scanners over the same region are bad
> > 3) Scans during compaction are bad.
> >
>
>
>
> > I need some input from HBase folks here:
> >
> > 1) Is READ_UNCOMMITTED safe if lock-free?
> >
>
>
> And rely on MVCC only? That'd be cool Vladimir.  When you say
> READ_UNCOMMITTED, are you row or cell-scoped?  Are you thinking that you'd
> make it so the row lock was also optional?
>
>
>
> > 2) Confirmation that region-wide lock is for read consistency only.
> >
> >
> >
> The region lock, IIRC, was added so we can close the region cleanly.  All
> gets/puts/etc. take the lock and hold it while operating to prevent the
> region being closed out from under them.  It was put in place long ago and
> not revisited since.
>
> Hope this helps.
>
>
> Related, there is Liang Xie's effort over in HDFS:
> https://issues.apache.org/jira/browse/HDFS-6735
> St.Ack.
>
>
>
> > On Fri, Sep 12, 2014 at 11:04 AM, Stack <[email protected]> wrote:
> >
> > > On Thu, Sep 11, 2014 at 3:58 PM, Vladimir Rodionov <
> > [email protected]
> > > >
> > > wrote:
> > >
> > > > Hi, all
> > > >
> > > > > We have two isolation levels in Query now (they used to be in Scan). See:
> > > > https://issues.apache.org/jira/browse/HBASE-11936
> > > >
> > > > > I moved the isolation levels API from Scan up to the Query class. The
> > > > > reason: this API was not available for Get operations. The rationale?
> > > > > Improve performance of gets and multi-gets over the same region.
> > > >
> > > > > As many of you are aware, RegionScannerImpl is heavily synchronized on
> > > > > the region's internal lock.  Now some questions:
> > > >
> > > > > 1. Is it safe to bypass this locking (in the next() call) in
> > > > > READ_UNCOMMITTED mode?
> > >
> > > We will do all necessary checks, of course, before calling nextRaw().
> > > > > 2. What was the reason for this locking in the first place for reads in
> > > > > READ_COMMITTED mode? Apart from the obvious (no dirty reads allowed),
> > > > > can someone tell me what else bad can happen?
> > > >
> > >
> > >
> > > > There is only the obvious (that I know of) Vladimir.  We've been so
> > > > fixated on ensuring consistent view on a row, we've not done the work
> > > > to allow other read types. I'm not sure what would happen if you were
> > > > to skirt row lock.  Try hacking on TestAtomicOperation to undo lock and
> > > > see what happens?
> > >
> > > St.Ack
> > >
> >
>
