On 9/4/12 3:07 PM, "Stack" <st...@duboce.net> wrote:
>On Tue, Sep 4, 2012 at 2:52 PM, Gen Liu <ge...@zynga.com> wrote:
>> We are running into a case that if the region server that serves meta
>>table is down, all request will timeouts because region lookup is not
>>available.
>
>Only requests to .META. fail (and most of the time, .META. info is
>cached so should be relatively rare to do .META. lookups). It should
>not be all requests.

We get a lot of region lookup errors on the client side
(zlive-hbase-08.int.zynga.com is not dead; we killed another server):

2012-09-04 14:28:11,829 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation : locateRegionInMeta parentTable=-ROOT-, metaLocation={region=-ROOT-,,0.70236052, hostname=zlive-hbase-08.int.zynga.com, port=60020}, attempt=8 of 10 failed; retrying after sleep of 16000 because: Connection refused

>
>> It seems that regions that serve root and meta are the single point of
>>failure in HBase.
>
>They can be offline if a server crashes but they should be back on
>line soon enough; is this not your experience?

We set hbase.regionserver.maxlogs=256 to allow bigger memstore flushes
and lower compaction stress, so the log split takes about 5-10 minutes.
I think META will come back after the log split.
Is there a way to specify where HBase should put the root and meta tables?

>
>> Is there a way to get rid of it? Does HBase give a higher recover
>>priority to meta and root table?
>>
>
>HBase gets .META. and -ROOT- back on line ahead of all other regions, yes.
>
>St.Ack
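
For reference, the "attempt=8 of 10 ... sleep of 16000" in the log line above is consistent with a table-driven client backoff. The sketch below is only an illustration under assumptions: the multiplier table is recalled from the 0.9x-era client and is not confirmed in this thread, the 1000 ms base pause (hbase.client.pause) is assumed, and the class and method names are hypothetical.

```java
// Hypothetical sketch of a table-driven retry backoff; not HBase's actual code.
class RetryBackoffSketch {
    // ASSUMPTION: multiplier table as recalled from the 0.9x client.
    static final int[] RETRY_BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    // Sleep before the next retry: base pause times the multiplier for this
    // attempt, clamped to the last table entry for attempts past the end.
    static long pauseTime(long pauseMs, int tries) {
        int idx = Math.min(tries, RETRY_BACKOFF.length - 1);
        return pauseMs * RETRY_BACKOFF[idx];
    }

    // Upper bound on total sleep time across the first `attempts` tries.
    static long totalPause(long pauseMs, int attempts) {
        long total = 0;
        for (int i = 0; i < attempts; i++) {
            total += pauseTime(pauseMs, i);
        }
        return total;
    }

    public static void main(String[] args) {
        long pauseMs = 1000; // ASSUMPTION: base pause of 1000 ms
        System.out.println("sleep at attempt 8: " + pauseTime(pauseMs, 8) + " ms");
        System.out.println("total over 10 attempts: " + totalPause(pauseMs, 10) + " ms");
    }
}
```

Under these assumptions the sleeps over 10 attempts add up to at most about 71 seconds, so a client would exhaust its retries well before a 5-10 minute log split finishes. That may explain why lookups time out while -ROOT- and .META. are still being recovered.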