On Thu, Nov 26, 2009 at 6:19 PM, Ryan Rawson <ryano...@gmail.com> wrote:
> Probably around the same time as hadoop 0.21, in other words a few
> more months. There may be chances to run RCs before then though.
>
Thanks for the quick reply, Ryan. I am eagerly looking forward to trying the
RCs; since I am planning a deployment around April next year, the timing
works out perfectly!

Thanks,

- Imran

> -ryan
>
> On Thu, Nov 26, 2009 at 3:15 AM, Imran M Yousuf <imyou...@gmail.com> wrote:
>> On Thu, Nov 26, 2009 at 12:05 PM, Jean-Daniel Cryans
>> <jdcry...@apache.org> wrote:
>> <snip />
>>>
>>> Be also aware that we are planning to include master-slave
>>> replication between datacenters in 0.21.
>>>
>>
>> From this discussion and a presentation by Ryan Rawson and Jonathan
>> Gray I am really looking forward to the 0.21 release. Any idea on the
>> timeline?
>>
>> - Imran
>>
>>> J-D
>>>
>>> On Wed, Nov 25, 2009 at 8:45 PM, Murali Krishna. P
>>> <muralikpb...@yahoo.com> wrote:
>>>> Thanks JD for the detailed reply.
>>>>
>>>> Does the underlying Java API currently block in case a region is not
>>>> available? I would like to get an immediate retry indication from the
>>>> Java call in such cases so that I can redirect the request to the
>>>> duplicate table in the other data center. Can this be supported?
>>>>
>>>> Thanks,
>>>> Murali Krishna
>>>>
>>>>
>>>> ________________________________
>>>> From: Andrew Purtell <apurt...@apache.org>
>>>> To: hbase-user@hadoop.apache.org
>>>> Sent: Thu, 26 November, 2009 12:17:30 AM
>>>> Subject: Re: HBase High Availability
>>>>
>>>> First, there is work under way for 0.21 which will shorten the time
>>>> necessary for region redeployment. Part of the delay in 0.20 is
>>>> less-than-ideal performance in that regard by the master.
>>>>
>>>> Beyond that, just as a general operational principle, I recommend
>>>> that you host no more than 200-250 regions per region server. The
>>>> Bigtable paper talks about each tablet server hosting only 100
>>>> regions, with only 200 MB of data each. While that is not cost
>>>> effective for folks who do not build their own hardware in bulk, it
>>>> should cause you to think about why:
>>>> - Limiting the number of regions per tablet server limits time to
>>>> recovery upon node failure -- you can engineer this to be within
>>>> some threshold.
>>>> - Limiting the amount of data per region means that servers with
>>>> reasonable RAM can cache and serve a lot of the data out of memory,
>>>> for sub-disk data access latencies.
>>>>
>>>> So the advice here is to opt for more servers, not fewer; more RAM,
>>>> not less; and smaller disks, not larger.
>>>>
>>>> You should also consider the impact of server failure on HDFS --
>>>> loss of block replicas. For each under-replicated block, HDFS must
>>>> work to make additional copies. This can come at a bad time if loss
>>>> of the blocks in the first place was due to overloading.
>>>> Smaller disks mean fewer lost block replicas. For example, attach
>>>> 4 x 160 GB drives as JBOD (as opposed to 4 x 1 TB or similar).
>>>> Losing one disk means a loss of only 160 GB worth of block replicas
>>>> (as opposed to 1 TB). Loss of a whole server means losing only
>>>> 640 GB worth of block replicas (as opposed to 4 TB).
>>>> You can also consider attaching 6 or 8 or even more modest-sized
>>>> disks per server to increase the I/O parallelism (number of
>>>> spindles) while also constraining the amount of block replica loss
>>>> per disk failure.
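
One way to approximate the "immediate retry indication" Murali asks about
above, with the 0.20 client, is to shrink the client's retry budget and treat
RetriesExhaustedException as the signal to fail over to the duplicate table.
A rough, untested sketch -- the ZooKeeper quorum names and table name here
are placeholders, not anything from this thread:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.RetriesExhaustedException;

  public class FailoverGet {
    public static Result get(byte[] row) throws IOException {
      HBaseConfiguration primary = new HBaseConfiguration();
      primary.set("hbase.zookeeper.quorum", "zk-dc1");   // placeholder quorum
      primary.setInt("hbase.client.retries.number", 2);  // fail fast, don't block long
      primary.setInt("hbase.client.pause", 500);         // ms between retries
      try {
        return new HTable(primary, "mytable").get(new Get(row));
      } catch (RetriesExhaustedException e) {
        // Primary cluster could not serve the region in time;
        // redirect to the duplicate table in the other data center.
        HBaseConfiguration standby = new HBaseConfiguration();
        standby.set("hbase.zookeeper.quorum", "zk-dc2"); // placeholder quorum
        return new HTable(standby, "mytable").get(new Get(row));
      }
    }
  }

Note that "immediate" here really means "after a small, bounded number of
retries": the wait is governed by hbase.client.retries.number and
hbase.client.pause rather than being zero.
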
>>>>
>>>> Even so, blocked reads and writes over some interval during region
>>>> redeployment, due to server failure or load rebalancing, are part of
>>>> the Bigtable architecture, and so of HBase, unless we take
>>>> additional steps such as setting up active-passive region server
>>>> pairs; but that would have complications which affect consistency
>>>> and performance and might not provide enough benefit anyway (there
>>>> is still time needed to detect failure and fail over). This is not
>>>> an unavailability of the Bigtable service. Other regions are not
>>>> affected. This is graceful/proportional service degradation in the
>>>> face of partial failures. There are other alternatives to Bigtable
>>>> which degrade differently given partial failures. Such options can
>>>> give you no waiting on the write path at any time, and possibly no
>>>> waiting on the read path, but you will lose strong consistency as
>>>> the trade-off. So you may get stale answers over some (unbounded,
>>>> iirc) period, but this is the choice you make.
>>>>
>>>> HBase also has options like Stargate or the Thrift connector which
>>>> can block and retry on behalf of your clients so they are never
>>>> blocked for writes. For read path options I could look at having
>>>> Stargate serve (possibly stale) answers out of a cache -- with some
>>>> flag that indicates noncanonical state -- if that would be useful,
>>>> and/or return an immediate "try again" indication, so your clients
>>>> are at least not stalled.
>>>>
>>>> Best regards,
>>>>
>>>> - Andy
>>>>
>>>>
>>>> ________________________________
>>>> From: Murali Krishna. P <muralikpb...@yahoo.com>
>>>> To: hbase-user@hadoop.apache.org
>>>> Sent: Wed, November 25, 2009 1:31:45 AM
>>>> Subject: HBase High Availability
>>>>
>>>> Hi,
>>>> This is regarding the region unavailability when a region server
>>>> goes down. There will be cases where we have thousands of regions
>>>> per RS, and it takes a considerable amount of time to redistribute
>>>> the regions when a node fails. The service will be unavailable
>>>> during that period. I am evaluating HBase for an application where
>>>> we need to guarantee close to 100% availability (the namenode is
>>>> still a SPOF; leave that aside).
>>>>
>>>> One simple idea would be to replicate the regions in memory. Can we
>>>> load the same region in multiple region servers? I am not sure about
>>>> the feasibility yet; there will be issues like consistency across
>>>> these in-memory replicas. I wanted to know whether there are any
>>>> thoughts / work already going on in this area? I saw some related
>>>> discussion here:
>>>> http://osdir.com/ml/hbase-user-hadoop-apache/2009-09/msg00118.html,
>>>> but I am not sure what its current state is.
>>>>
>>>> Does the same need to be done for the master as well, or is that
>>>> already handled with ZK? How fast are master re-election and catalog
>>>> loading currently? Do we always have multiple masters in a
>>>> ready-to-run state?
>>>>
>>>> Thanks,
>>>> Murali Krishna
>>>
>>
>>
>> --
>> Imran M Yousuf
>> Entrepreneur & Software Engineer
>> Smart IT Engineering
>> Dhaka, Bangladesh
>> Email: im...@smartitengineering.com
>> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
>> Mobile: +880-1711402557
>

--
Imran M Yousuf
Entrepreneur & Software Engineer
Smart IT Engineering
Dhaka, Bangladesh
Email: im...@smartitengineering.com
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
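
As an illustration of the read-path idea Andrew mentions above -- serving a
possibly stale answer out of a cache, flagged as noncanonical, rather than
stalling the caller -- here is a rough, hypothetical sketch. This is not
existing Stargate behaviour; the class name and table name are made up for
the example:

  import java.io.IOException;
  import java.util.Map;
  import java.util.concurrent.ConcurrentHashMap;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  public class CachingGateway {
    /** Wraps a Result with a flag saying whether it came from the cache. */
    public static class Answer {
      public final Result result;   // may be null if nothing was ever cached
      public final boolean stale;   // true = served from cache, may be out of date
      Answer(Result r, boolean s) { result = r; stale = s; }
    }

    private final HTable table;
    private final Map<String, Result> cache =
        new ConcurrentHashMap<String, Result>();

    public CachingGateway(String tableName) throws IOException {
      this.table = new HTable(new HBaseConfiguration(), tableName);
    }

    public Answer get(byte[] row) {
      try {
        Result fresh = table.get(new Get(row));  // may block while a region moves
        cache.put(Bytes.toString(row), fresh);   // remember the canonical answer
        return new Answer(fresh, false);
      } catch (IOException e) {
        // Region (or its server) is currently unavailable: hand back the last
        // value we saw, flagged as noncanonical, instead of stalling the caller.
        return new Answer(cache.get(Bytes.toString(row)), true);
      }
    }
  }

Whether a flagged stale read is acceptable is an application decision; it
trades away the strong-consistency guarantee in exactly the way Andrew
describes for the non-Bigtable alternatives.
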