> I have an HBase table which was created via the mapred tool; it took almost 1 hour to load with the loadtable.rb script and become available for serving. The scale was 8k regions per server. I am on 0.20.1, r822817 though. I have yet to test the failure case, but will it take around 1 hour / number of RSs to redistribute? As you said, I will try this in 0.21.
No. The data is on HDFS and already replicated 3 times. The time of redistribution is zk-session-expiration-time + WAL-splitting-time. The first is currently 60 seconds (we are working on many different solutions to lower that to 10 secs) and the second depends on how many edits you have in the WAL, which is currently limited to 32 files of max 62MB (configurable). So if you need a max of 1-2 minutes of unavailability, you have to play with these.

J-D

On Wed, Nov 25, 2009 at 3:29 AM, Murali Krishna. P <muralikpb...@yahoo.com> wrote:
> Hi Ryan,
> Thanks for the quick response.
>
> We are planning to have this in 2 or 3 data centers for BCP and latency reasons. Currently the application runs in a non-scalable cluster; essentially we have the data partitioned across multiple fixed columns. The entire cluster of machines can be considered an m-row vs n-column setup. The rows ensure availability and the columns (machines) ensure distribution/partitioning. As you can see, even if m-1 boxes go down in a column, availability is 100%. (By availability, I mean all the documents in the cluster are available.) So, suppose we need 20 machines to hold the data; with 40 machines, we are highly available. Obviously we have the data in the other data center as well, but with a higher latency. We will take the hit only if all of the replicas in one data center fail.
>
> I have an HBase table which was created via the mapred tool; it took almost 1 hour to load with the loadtable.rb script and become available for serving. The scale was 8k regions per server. I am on 0.20.1, r822817 though. I have yet to test the failure case, but will it take around 1 hour / number of RSs to redistribute? As you said, I will try this in 0.21.
>
> I understand that it is difficult to achieve availability in a distributed system, but the problem is solved in HDFS by replicating data across data nodes. That is why I thought we should do the same thing in HBase with replicated regions. We need to ensure that the data is never missing. Maybe we can afford 1-2 minutes, not more than that. I can definitely redirect the request to the other data center if there is a region-unavailable exception, but that has a higher latency, as mentioned.
>
> The multi-RS solution will require double the memory, which is not a problem for users who need region replication > one. Another idea was that HBase could create 2 records for each key, like key#1 and key#2, and somehow ensure that they go to different region servers. If the key#1 request fails, key#2 can be used and vice versa. In a steady state, we can round-robin to distribute the load. But the problem here is that we are replicating the data as well, which will need double the disk space. So, if somehow we could replicate the in-memory region index but not the data, it would be great.
>
> Thanks,
> Murali Krishna
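The "key#1 / key#2" idea above is nothing HBase provides out of the box; it would have to be done client-side, and a plain "#1"/"#2" suffix would usually land both copies in the same region, so the rough sketch below uses a prefix salt instead. It assumes the 0.20-era client API, and the table, column family, and qualifier names are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

/** Sketch of the client-side "key#1 / key#2" duplication idea; not an HBase feature. */
public class DualKeyClient {
  private static final byte[] FAMILY = Bytes.toBytes("d");    // hypothetical column family
  private static final byte[] QUALIFIER = Bytes.toBytes("v"); // hypothetical qualifier
  private final HTable table;

  public DualKeyClient(HTable table) {
    this.table = table;
  }

  /** Write the same value under two prefix-salted keys; once the table has split,
      the two copies usually end up in different regions (and often different RSs). */
  public void put(String key, byte[] value) throws IOException {
    for (String salted : new String[] { "1#" + key, "2#" + key }) {
      Put p = new Put(Bytes.toBytes(salted));
      p.add(FAMILY, QUALIFIER, value);
      table.put(p);
    }
  }

  /** Read copy 1; fall back to copy 2 if the first read fails or comes back empty. */
  public byte[] get(String key) throws IOException {
    try {
      Result r = table.get(new Get(Bytes.toBytes("1#" + key)));
      if (!r.isEmpty()) {
        return r.getValue(FAMILY, QUALIFIER);
      }
    } catch (IOException e) {
      // Copy 1 unreachable, e.g. its region is being reassigned; fall through to copy 2.
    }
    return table.get(new Get(Bytes.toBytes("2#" + key))).getValue(FAMILY, QUALIFIER);
  }

  public static void main(String[] args) throws IOException {
    HBaseConfiguration conf = new HBaseConfiguration();              // 0.20-style construction
    DualKeyClient client = new DualKeyClient(new HTable(conf, "mytable")); // hypothetical table
    client.put("doc42", Bytes.toBytes("hello"));
    System.out.println(Bytes.toString(client.get("doc42")));
  }
}
```

Note that the two copies are only loosely consistent: a write that succeeds under one prefix and fails under the other is not rolled back, and the scheme doubles disk usage exactly as noted above.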
> ________________________________
> From: Ryan Rawson <ryano...@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Wed, 25 November, 2009 3:25:20 PM
> Subject: Re: HBase High Availability
>
> With multiple masters, the election is mediated by ZooKeeper, and the idle masters are waiting for the re-election cycle.
>
> The problem with bringing regions up after a failure isn't the actual speed of loading them, but bugs in the master. This is being fixed in 0.21. It will allow us to bring regions back online much more rapidly after a failure.
>
> As for loading a region across multiple servers, this would have to be thought about quite carefully to see if it is possible. Right now there is a substantial amount of state loaded that would be changed by other servers, and you would still have to reload that state anyway.
>
> We also need to ask ourselves, what does "availability" mean anyway? For example, if a regionserver failed, does that mean HBase is offline? The answer would have to be "no", but certain sections of data might be offline temporarily. Thus HBase has 100% uptime by this definition, correct?
>
> In the annals of distributed computing, you are only protected with minimal downtime from limited hardware failures. Once you take out too many nodes, things start failing; that is a given. HBase solves the data-scalability problem, and it solves the limited-machine-failure problem.
>
> I highly suggest this presentation:
> http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
>
> BTW, what is your budget for "near 100% uptime" anyway? How many data centers did you plan on using?
>
> On Wed, Nov 25, 2009 at 1:31 AM, Murali Krishna. P <muralikpb...@yahoo.com> wrote:
>> Hi,
>> This is regarding region unavailability when a region server goes down. There will be cases where we have thousands of regions per RS, and it takes a considerable amount of time to redistribute the regions when a node fails. The service will be unavailable during that period. I am evaluating HBase for an application where we need to guarantee close to 100% availability (the namenode is still a SPOF; leave that aside).
>>
>> One simple idea would be to replicate the regions in memory. Can we load the same region in multiple region servers? I am not sure about the feasibility yet; there will be issues like consistency across these in-memory replicas. I wanted to know whether there were any thoughts / work already going on in this area? I saw some related discussion here, http://osdir.com/ml/hbase-user-hadoop-apache/2009-09/msg00118.html, but am not sure what its state is.
>>
>> Does the same need to be done with the master as well, or is that already handled with ZK? How fast are master re-election and catalog load currently? Do we always have multiple masters in a ready-to-run state?
>>
>> Thanks,
>> Murali Krishna
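On the client-side fallback Murali mentions above, redirecting a request to the other data center while a region is in transit needs no server-side support. A rough sketch against the 0.20-era client API follows; the ZooKeeper quorum hosts, table name, and retry count are made up for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

/** Sketch: read from the local cluster, fall back to the remote data center on failure. */
public class FallbackReader {
  private final HTable local;
  private final HTable remote;

  public FallbackReader(HTable local, HTable remote) {
    this.local = local;
    this.remote = remote;
  }

  public Result get(byte[] row) throws IOException {
    try {
      return local.get(new Get(row));
    } catch (IOException e) {
      // Local region unreachable (e.g. its RS died and the WAL is still being split);
      // pay the cross-datacenter latency instead of waiting for reassignment to finish.
      return remote.get(new Get(row));
    }
  }

  private static HTable connect(String zkQuorum, String tableName) throws IOException {
    HBaseConfiguration conf = new HBaseConfiguration();  // 0.20-style construction
    conf.set("hbase.zookeeper.quorum", zkQuorum);
    // Fail fast locally so the fallback kicks in within the 1-2 minute budget;
    // property names are from the 0.20 line, check hbase-default.xml for your release.
    conf.setInt("hbase.client.retries.number", 2);
    return new HTable(conf, tableName);
  }

  public static void main(String[] args) throws IOException {
    FallbackReader reader = new FallbackReader(
        connect("zk1.dc1.example.com", "mytable"),       // hypothetical quorums and table
        connect("zk1.dc2.example.com", "mytable"));
    System.out.println(reader.get(Bytes.toBytes("doc42")));
  }
}
```

Server-side, the knobs J-D refers to appear to be zookeeper.session.timeout and hbase.regionserver.maxlogs in hbase-default.xml (assuming the 0.20-era property names; check your release): lowering the first shortens failure detection at the risk of false expirations during long GC pauses, while lowering the second bounds WAL-splitting time at the cost of more frequent memstore flushes.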