HDFS also chooses to degrade availability in the face of partitions.

On Thu, Aug 9, 2012 at 11:08 AM, Lin Ma <[email protected]> wrote:
> Amandeep, thanks for your comments, and I will definitely read the paper
> you suggested.
>
> For Hadoop itself, what do you think its CAP features are? Which one of
> the CAP properties is sacrificed?
>
> regards,
> Lin
>
> On Thu, Aug 9, 2012 at 1:34 PM, Amandeep Khurana <[email protected]> wrote:
>
>> Firstly, I recommend you read the GFS and Bigtable papers. That'll give
>> you a good understanding of the architecture. Ad-hoc questions on the
>> mailing list won't.
>>
>> I'll try to answer some of your questions briefly. Think of HBase as a
>> database layer over an underlying filesystem (the same way MySQL is over
>> ext2/3/4 etc.). The filesystem for HBase in this case is HDFS. HDFS
>> replicates data for redundancy and fault tolerance. HBase has region
>> servers that serve the regions. Regions form tables. Region servers
>> persist their data on HDFS. Now, every region is served by one and only
>> one region server. So, HBase is not replicating anything. Replication is
>> handled at the storage layer. If a region server goes down, all its
>> regions need to be served by some other region server. During this
>> period of region reassignment, clients experience degraded availability
>> if they try to interact with any of those regions.
>>
>> Coming back to CAP. HBase chooses to degrade availability in the face of
>> partitions. "Partition" is a very general term here and does not
>> necessarily mean network partitions. Any node falling off the HBase
>> cluster can be considered a partition. So, when failures happen, HBase
>> degrades availability but does not give up consistency. Consistency in
>> this context is sort of the equivalent of atomicity in ACID. In the
>> context of HBase, any data that is written to HBase will be visible to
>> all clients. There is no concept of multiple different versions that
>> clients need to reconcile between. When you read, you always get the
>> same version of the row you are reading.
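The single-owner behavior Amandeep describes can be sketched as a toy model: each region has exactly one serving region server, so a server failure makes its regions unavailable (not lost) until the master reassigns them. This is an illustrative sketch only; the class and method names are invented and this is not the HBase API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of HBase region assignment: exactly ONE server serves each
// region (no serving-layer replication), so a dead server's regions are
// unavailable until reassigned. Data itself lives in HDFS, replicated
// at the storage layer, so nothing is lost.
class RegionAssignment {
    private final Map<String, String> regionToServer = new HashMap<>();

    void assign(String region, String server) {
        regionToServer.put(region, server);
    }

    // A region is available iff some live server currently serves it.
    boolean isAvailable(String region) {
        return regionToServer.containsKey(region);
    }

    // Server failure: its regions lose their one and only server.
    void serverDied(String deadServer) {
        regionToServer.values().removeIf(s -> s.equals(deadServer));
    }

    public static void main(String[] args) {
        RegionAssignment cluster = new RegionAssignment();
        cluster.assign("region-1", "rs1");
        cluster.assign("region-2", "rs2");

        cluster.serverDied("rs1");
        // Degraded availability: region-1 has no server until reassigned.
        System.out.println(cluster.isAvailable("region-1")); // false
        System.out.println(cluster.isAvailable("region-2")); // true

        // The master reassigns the region; availability is restored.
        cluster.assign("region-1", "rs2");
        System.out.println(cluster.isAvailable("region-1")); // true
    }
}
```

The window between `serverDied` and the re-`assign` is exactly the "degraded availability" period the thread discusses.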
>> In other words, HBase is strongly consistent.
>>
>> Hope that clears things up a bit.
>>
>> On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma <[email protected]> wrote:
>>
>> > Thank you Lars.
>> >
>> > Is the same data stored in duplicated copies across region servers? If
>> > so, when the primary server for a region dies, the client just needs
>> > to read from the secondary server for the same region. Why is there a
>> > window when data is unavailable?
>> >
>> > BTW: please feel free to correct me on any wrong knowledge about
>> > HBase.
>> >
>> > regards,
>> > Lin
>> >
>> > On Thu, Aug 9, 2012 at 9:31 AM, lars hofhansl <[email protected]>
>> > wrote:
>> >
>> > > After a write completes, the next read (regardless of the location
>> > > it is issued from) will see the latest value. This is because at any
>> > > given time exactly one RegionServer is responsible for a specific
>> > > key (through assignment of key ranges to regions and regions to
>> > > RegionServers).
>> > >
>> > > As Mohit said, the trade-off is that data is unavailable if a
>> > > RegionServer dies, until another RegionServer picks up its regions
>> > > (and by extension the key range).
>> > >
>> > > -- Lars
>> > >
>> > > ----- Original Message -----
>> > > From: Lin Ma <[email protected]>
>> > > To: [email protected]
>> > > Cc:
>> > > Sent: Wednesday, August 8, 2012 8:47 AM
>> > > Subject: Re: consistency, availability and partition pattern of
>> > > HBase
>> > >
>> > > And consistency is not sacrificed? I.e., all distributed clients'
>> > > updates will result in a sequential, real-time update order? Once an
>> > > update is done by one client, all other clients can see the result
>> > > immediately?
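Lars's point, that exactly one RegionServer owns any given key at any moment, can be sketched as key routing over sorted region start keys: since reader and writer are routed to the same single owner, a read after a completed write must observe it. Again an invented illustration, not HBase code.

```java
import java.util.TreeMap;

// Toy model of HBase key routing: the key space is split into ranges
// (regions) by start key, and each range is owned by exactly one server.
// All reads and writes for a key go to that one owner, which is why a
// read issued after a completed write sees the latest value.
class KeyRouter {
    // Maps each region's start key to the server owning that range.
    private final TreeMap<String, String> startKeyToServer = new TreeMap<>();

    void addRegion(String startKey, String server) {
        startKeyToServer.put(startKey, server);
    }

    // The owner is the region whose start key is the greatest one <= key.
    String serverFor(String key) {
        return startKeyToServer.floorEntry(key).getValue();
    }

    public static void main(String[] args) {
        KeyRouter router = new KeyRouter();
        router.addRegion("", "rs1");  // keys in [ "", "m" ) -> rs1
        router.addRegion("m", "rs2"); // keys in [ "m", ... ) -> rs2

        // Writer and reader of "apple" both reach rs1: one owner, so the
        // read is guaranteed to observe the completed write.
        System.out.println(router.serverFor("apple")); // rs1
        System.out.println(router.serverFor("zebra")); // rs2
    }
}
```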
>> > >
>> > > regards,
>> > > Lin
>> > >
>> > > On Wed, Aug 8, 2012 at 11:17 PM, Mohit Anchlia
>> > > <[email protected]> wrote:
>> > >
>> > > > I think availability is sacrificed in the sense that if a region
>> > > > server fails, clients will find data inaccessible for the time it
>> > > > takes the regions to come up on some other server; not to be
>> > > > confused with data loss.
>> > > >
>> > > > Sent from my iPad
>> > > >
>> > > > On Aug 7, 2012, at 11:56 PM, Lin Ma <[email protected]> wrote:
>> > > >
>> > > > > Thank you Wei!
>> > > > >
>> > > > > Two more comments,
>> > > > >
>> > > > > 1. What do you think about Hadoop's CAP characteristics?
>> > > > > 2. Regarding your comments: if HBase implements "per key
>> > > > > sequential consistency", what properties are missing for full
>> > > > > consistency? Cross-key update ordering? Could you show me an
>> > > > > example of what you think is missing? Thanks.
>> > > > >
>> > > > > regards,
>> > > > > Lin
>> > > > >
>> > > > > On Wed, Aug 8, 2012 at 12:18 PM, Wei Tan <[email protected]> wrote:
>> > > > >
>> > > > >> Hi Lin,
>> > > > >>
>> > > > >> In the CAP theorem:
>> > > > >> Consistency stands for atomic consistency, i.e., each CRUD
>> > > > >> operation occurs sequentially in a global, real-time clock.
>> > > > >> Availability means each server, if not partitioned, can accept
>> > > > >> requests.
>> > > > >> Partition means network partition.
>> > > > >>
>> > > > >> As far as I understand (although I do not see any official
>> > > > >> documentation), HBase achieves "per key sequential
>> > > > >> consistency", i.e., for a specific key, there is an agreed
>> > > > >> sequence for all operations on it. This is weaker than strong
>> > > > >> or sequential consistency, but stronger than "eventual
>> > > > >> consistency".
>> > > > >>
>> > > > >> BTW: CAP was proposed by Prof. Eric Brewer...
>> > > > >> http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29
>> > > > >>
>> > > > >> Best Regards,
>> > > > >> Wei
>> > > > >>
>> > > > >> Wei Tan
>> > > > >> Research Staff Member
>> > > > >> IBM T. J. Watson Research Center
>> > > > >> 19 Skyline Dr, Hawthorne, NY 10532
>> > > > >> [email protected]; 914-784-6752
>> > > > >>
>> > > > >> From: Lin Ma <[email protected]>
>> > > > >> To: [email protected]
>> > > > >> Date: 08/07/2012 09:30 PM
>> > > > >> Subject: consistency, availability and partition pattern of
>> > > > >> HBase
>> > > > >>
>> > > > >> Hello guys,
>> > > > >>
>> > > > >> According to the notes by Werner, "He presented the CAP
>> > > > >> theorem, which states that of three properties of shared-data
>> > > > >> systems—data consistency, system availability, and tolerance
>> > > > >> to network partition—only two can be achieved at any given
>> > > > >> time." =>
>> > > > >> http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
>> > > > >>
>> > > > >> But it seems HBase achieves all 3 properties at the same time.
>> > > > >> Does that mean HBase breaks Werner's rule? :-)
>> > > > >>
>> > > > >> If not, which one is sacrificed -- consistency (by using
>> > > > >> HDFS), availability (by using Zookeeper), or partition
>> > > > >> tolerance (by using region / column family)? And why?
>> > > > >>
>> > > > >> regards,
>> > > > >> Lin
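Wei's "per key sequential consistency" can be modeled as one agreed, totally ordered operation log per key, with no ordering relation across different keys; that missing cross-key order is what separates it from strong or sequential consistency. An illustrative sketch with invented names, not HBase code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of per-key sequential consistency: every operation on one
// key gets a position in a single agreed sequence for that key, but no
// global sequence orders operations across keys.
class PerKeyLog {
    // One append-only operation log per key; each log is totally ordered.
    private final Map<String, List<String>> logs = new ConcurrentHashMap<>();

    // Appending under the per-key log's lock yields one agreed order for
    // that key, even with concurrent writers.
    void apply(String key, String op) {
        List<String> log = logs.computeIfAbsent(key, k -> new ArrayList<>());
        synchronized (log) {
            log.add(op);
        }
    }

    List<String> history(String key) {
        return logs.getOrDefault(key, Collections.emptyList());
    }

    public static void main(String[] args) {
        PerKeyLog store = new PerKeyLog();
        store.apply("k1", "put v1");
        store.apply("k2", "put x");
        store.apply("k1", "put v2");

        // Per-key order is well defined...
        System.out.println(store.history("k1")); // [put v1, put v2]
        // ...but nothing relates k1's ops to k2's ops, which is why this
        // is weaker than strong/sequential consistency yet stronger than
        // eventual consistency (no divergent versions to reconcile).
        System.out.println(store.history("k2")); // [put x]
    }
}
```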
