HDFS also chooses to degrade availability in the face of partitions.

On Thu, Aug 9, 2012 at 11:08 AM, Lin Ma <[email protected]> wrote:
> Amandeep, thanks for your comments, and I will definitely read the paper
> you suggested.
>
> For Hadoop itself, what do you think its CAP features are? Which one of
> the CAP properties is sacrificed?
>
> regards,
> Lin
>
> On Thu, Aug 9, 2012 at 1:34 PM, Amandeep Khurana <[email protected]> wrote:
>
>> Firstly, I recommend you read the GFS and Bigtable papers. That'll give
>> you a good understanding of the architecture. Ad-hoc questions on the
>> mailing list won't.
>>
>> I'll try to answer some of your questions briefly. Think of HBase as a
>> database layer over an underlying filesystem (the same way MySQL is over
>> ext2/3/4 etc.). The filesystem for HBase in this case is HDFS. HDFS
>> replicates data for redundancy and fault tolerance. HBase has region
>> servers that serve the regions. Regions form tables. Region servers
>> persist their data on HDFS. Now, every region is served by one and only
>> one region server. So, HBase is not replicating anything. Replication is
>> handled at the storage layer. If a region server goes down, all its
>> regions need to be served by some other region server. During this
>> period of region reassignment, clients experience degraded availability
>> if they try to interact with any of those regions.
>>
>> Coming back to CAP. HBase chooses to degrade availability in the face of
>> partitions. "Partition" is a very general term here and does not
>> necessarily mean network partitions. Any node falling off the HBase
>> cluster can be considered a partition. So, when failures happen, HBase
>> degrades availability but does not give up consistency. Consistency in
>> this context is sort of the equivalent of atomicity in ACID. In the
>> context of HBase, any data that is written to HBase will be visible to
>> all clients. There is no concept of multiple different versions that
>> clients need to reconcile between. When you read, you always get the
>> same version of the row you are reading.
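The single-owner behavior Amandeep describes can be sketched as a toy model: each region has exactly one serving region server, so a server failure makes its regions unavailable (not lost) until the master reassigns them. This is an illustrative sketch only; the class and method names are invented and this is not the HBase API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of HBase region assignment: exactly ONE server serves each
// region (no serving-layer replication), so a dead server's regions are
// unavailable until reassigned. Data itself lives in HDFS, replicated
// at the storage layer, so nothing is lost.
class RegionAssignment {
    private final Map<String, String> regionToServer = new HashMap<>();

    void assign(String region, String server) {
        regionToServer.put(region, server);
    }

    // A region is available iff some live server currently serves it.
    boolean isAvailable(String region) {
        return regionToServer.containsKey(region);
    }

    // Server failure: its regions lose their one and only server.
    void serverDied(String deadServer) {
        regionToServer.values().removeIf(s -> s.equals(deadServer));
    }

    public static void main(String[] args) {
        RegionAssignment cluster = new RegionAssignment();
        cluster.assign("region-1", "rs1");
        cluster.assign("region-2", "rs2");

        cluster.serverDied("rs1");
        // Degraded availability: region-1 has no server until reassigned.
        System.out.println(cluster.isAvailable("region-1")); // false
        System.out.println(cluster.isAvailable("region-2")); // true

        // The master reassigns the region; availability is restored.
        cluster.assign("region-1", "rs2");
        System.out.println(cluster.isAvailable("region-1")); // true
    }
}
```

The window between `serverDied` and the re-`assign` is exactly the "degraded availability" period the thread discusses.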
>> In other words, HBase is strongly consistent.
>>
>> Hope that clears things up a bit.
>>
>> On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma <[email protected]> wrote:
>>
>> > Thank you Lars.
>> >
>> > Is the same data stored in duplicated copies across region servers? If
>> > so, when the primary server for a region dies, the client just needs
>> > to read from the secondary server for the same region. Why is there a
>> > window when data is unavailable?
>> >
>> > BTW: please feel free to correct me on any wrong knowledge about
>> > HBase.
>> >
>> > regards,
>> > Lin
>> >
>> > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <[email protected]>
>> > wrote:
>> >
>> > > After a write completes, the next read (regardless of the location
>> > > it is issued from) will see the latest value. This is because at any
>> > > given time exactly one RegionServer is responsible for a specific
>> > > key (through assignment of key ranges to regions and regions to
>> > > RegionServers).
>> > >
>> > > As Mohit said, the trade-off is that data is unavailable if a
>> > > RegionServer dies, until another RegionServer picks up its regions
>> > > (and by extension the key range).
>> > >
>> > > -- Lars
>> > >
>> > > ----- Original Message -----
>> > > From: Lin Ma <[email protected]>
>> > > To: [email protected]
>> > > Cc:
>> > > Sent: Wednesday, August 8, 2012 8:47 AM
>> > > Subject: Re: consistency, availability and partition pattern of
>> > > HBase
>> > >
>> > > And consistency is not sacrificed? I.e., all distributed clients'
>> > > updates will result in a sequential, real-time update order? Once an
>> > > update is done by one client, all other clients can see the result
>> > > immediately?
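Lars's point, that exactly one RegionServer owns any given key at any moment, can be sketched as key routing over sorted region start keys: since reader and writer are routed to the same single owner, a read after a completed write must observe it. Again an invented illustration, not HBase code.

```java
import java.util.TreeMap;

// Toy model of HBase key routing: the key space is split into ranges
// (regions) by start key, and each range is owned by exactly one server.
// All reads and writes for a key go to that one owner, which is why a
// read issued after a completed write sees the latest value.
class KeyRouter {
    // Maps each region's start key to the server owning that range.
    private final TreeMap<String, String> startKeyToServer = new TreeMap<>();

    void addRegion(String startKey, String server) {
        startKeyToServer.put(startKey, server);
    }

    // The owner is the region whose start key is the greatest one <= key.
    String serverFor(String key) {
        return startKeyToServer.floorEntry(key).getValue();
    }

    public static void main(String[] args) {
        KeyRouter router = new KeyRouter();
        router.addRegion("", "rs1");  // keys in [ "", "m" ) -> rs1
        router.addRegion("m", "rs2"); // keys in [ "m", ... ) -> rs2

        // Writer and reader of "apple" both reach rs1: one owner, so the
        // read is guaranteed to observe the completed write.
        System.out.println(router.serverFor("apple")); // rs1
        System.out.println(router.serverFor("zebra")); // rs2
    }
}
```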
>> > >
>> > > regards,
>> > > Lin
>> > >
>> > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia
>> > > <[email protected]> wrote:
>> > >
>> > > > I think availability is sacrificed in the sense that if a region
>> > > > server fails, clients will find data inaccessible for the time it
>> > > > takes the regions to come up on some other server; not to be
>> > > > confused with data loss.
>> > > >
>> > > > Sent from my iPad
>> > > >
>> > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <[email protected]> wrote:
>> > > >
>> > > > > Thank you Wei!
>> > > > >
>> > > > > Two more comments,
>> > > > >
>> > > > > 1. What do you think about Hadoop's CAP characteristics?
>> > > > > 2. Regarding your comments: if HBase implements "per key
>> > > > > sequential consistency", what properties are missing for full
>> > > > > consistency? Cross-key update ordering? Could you show me an
>> > > > > example of what you think is missing? Thanks.
>> > > > >
>> > > > > regards,
>> > > > > Lin
>> > > > >
>> > > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <[email protected]> wrote:
>> > > > >
>> > > > >> Hi Lin,
>> > > > >>
>> > > > >> In the CAP theorem:
>> > > > >> Consistency stands for atomic consistency, i.e., each CRUD
>> > > > >> operation occurs sequentially in a global, real-time clock.
>> > > > >> Availability means each server, if not partitioned, can accept
>> > > > >> requests.
>> > > > >> Partition means network partition.
>> > > > >>
>> > > > >> As far as I understand (although I do not see any official
>> > > > >> documentation), HBase achieves "per key sequential
>> > > > >> consistency", i.e., for a specific key, there is an agreed
>> > > > >> sequence for all operations on it. This is weaker than strong
>> > > > >> or sequential consistency, but stronger than "eventual
>> > > > >> consistency".
>> > > > >>
>> > > > >> BTW: CAP was proposed by Prof. Eric Brewer...
>> > > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
>> > > > >>
>> > > > >> Best Regards,
>> > > > >> Wei
>> > > > >>
>> > > > >> Wei Tan
>> > > > >> Research Staff Member
>> > > > >> IBM T. J. Watson Research Center
>> > > > >> 19 Skyline Dr, Hawthorne, NY 10532
>> > > > >> [email protected]; 914-784-6752
>> > > > >>
>> > > > >> From: Lin Ma <[email protected]>
>> > > > >> To: [email protected]
>> > > > >> Date: 08/07/2012 09:30 PM
>> > > > >> Subject: consistency, availability and partition pattern of
>> > > > >> HBase
>> > > > >>
>> > > > >> Hello guys,
>> > > > >>
>> > > > >> According to the notes by Werner, "He presented the CAP
>> > > > >> theorem, which states that of three properties of shared-data
>> > > > >> systems—data consistency, system availability, and tolerance
>> > > > >> to network partition—only two can be achieved at any given
>> > > > >> time." =>
>> > > > >> http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
>> > > > >>
>> > > > >> But it seems HBase achieves all 3 properties at the same time.
>> > > > >> Does that mean HBase breaks Werner's rule? :-)
>> > > > >>
>> > > > >> If not, which one is sacrificed -- consistency (by using
>> > > > >> HDFS), availability (by using Zookeeper), or partition
>> > > > >> tolerance (by using region / column family)? And why?
>> > > > >>
>> > > > >> regards,
>> > > > >> Lin
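Wei's "per key sequential consistency" can be modeled as one agreed, totally ordered operation log per key, with no ordering relation across different keys; that missing cross-key order is what separates it from strong or sequential consistency. An illustrative sketch with invented names, not HBase code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of per-key sequential consistency: every operation on one
// key gets a position in a single agreed sequence for that key, but no
// global sequence orders operations across keys.
class PerKeyLog {
    // One append-only operation log per key; each log is totally ordered.
    private final Map<String, List<String>> logs = new ConcurrentHashMap<>();

    // Appending under the per-key log's lock yields one agreed order for
    // that key, even with concurrent writers.
    void apply(String key, String op) {
        List<String> log = logs.computeIfAbsent(key, k -> new ArrayList<>());
        synchronized (log) {
            log.add(op);
        }
    }

    List<String> history(String key) {
        return logs.getOrDefault(key, Collections.emptyList());
    }

    public static void main(String[] args) {
        PerKeyLog store = new PerKeyLog();
        store.apply("k1", "put v1");
        store.apply("k2", "put x");
        store.apply("k1", "put v2");

        // Per-key order is well defined...
        System.out.println(store.history("k1")); // [put v1, put v2]
        // ...but nothing relates k1's ops to k2's ops, which is why this
        // is weaker than strong/sequential consistency yet stronger than
        // eventual consistency (no divergent versions to reconcile).
        System.out.println(store.history("k2")); // [put x]
    }
}
```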
