I want the region servers to be more durable. If zookeeper.session.timeout is set high, it takes the master a long time to discover a dead region server.
Can you share ZooKeeper tuning experiences? Thanks
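For reference, a minimal sketch (assuming only that the HBase client jars are on the classpath) of where the knob lives and the trade-off it encodes. Note that the regionserver's own session uses the value from hbase-site.xml on the server side, and the ZooKeeper ensemble clamps whatever is requested between its minimum and maximum session timeouts, which default to 2x and 20x tickTime:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SessionTimeoutCheck {
  public static void main(String[] args) {
    // Loads hbase-default.xml and hbase-site.xml from the classpath.
    Configuration conf = HBaseConfiguration.create();
    // Request a lower session timeout (milliseconds) so the master notices
    // a dead regionserver sooner; the trade-off is that a GC pause longer
    // than this now expires the session and aborts the regionserver.
    // For the regionserver's session this must be set in hbase-site.xml
    // on the server, not just overridden in client code as shown here.
    conf.setInt("zookeeper.session.timeout", 60000);
    System.out.println("zookeeper.session.timeout = "
        + conf.getInt("zookeeper.session.timeout", -1));
  }
}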
On Sat, Jan 15, 2011 at 5:14 PM, Stack <[email protected]> wrote:
> Yes.
>
> Currently there are two heartbeats: the zk client one, and then the hbase
> one, which used to be what we relied on to figure out whether a
> regionserver is alive but is now just used to post the master the
> regionserver stats such as requests per second. The latter is going away
> in 0.92. (Pre-0.90.0, the regionserver and master would swap 'messages'
> on the back of the heartbeat -- i.e. open this region, I've just split
> region X, etc. -- but now 90% of this stuff is done via zk. In 0.92
> we'll finish the cleanup.)
>
> Hope this helps,
> St.Ack
>
> On Sat, Jan 15, 2011 at 5:03 PM, Ted Yu <[email protected]> wrote:
>> For #1, I assume I should look for 'received expired from ZooKeeper,
>> aborting'
>>
>> On Sat, Jan 15, 2011 at 5:02 PM, Ted Yu <[email protected]> wrote:
>>> For #1, what string should I look for in the region server log?
>>> For #4, what's the rationale behind sending YADE after receiving a
>>> heartbeat? I thought a heartbeat means the RS is alive.
>>>
>>> Thanks
>>>
>>> On Sat, Jan 15, 2011 at 4:49 PM, Stack <[email protected]> wrote:
>>>> FYI Ted, the YouAreDeadException usually happens in the following context:
>>>>
>>>> 1. Regionserver has some kind of issue -- a long GC pause, for
>>>>    instance -- and it stops tickling zk.
>>>> 2. Master gets the zk session expired event and starts up recovery of
>>>>    the hung regionserver.
>>>> 3. Regionserver recovers but has not yet processed its session expired
>>>>    event. It heartbeats the Master as though nothing is wrong.
>>>> 4. Master is mid-recovery or beyond server recovery and, on receipt of
>>>>    the heartbeat, in essence tells the regionserver to 'go away' by
>>>>    sending it the YouAreDeadException.
>>>> 5. By now the regionserver will have gotten its session expired
>>>>    notification and will have started an abort, so the YADE is not news
>>>>    when it receives the exception.
>>>>
>>>> St.Ack
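To make step 4 concrete, a rough, hypothetical sketch of the master-side check (loosely modeled on the ServerManager.checkIsDead frame that appears in the stack trace further down; the class and method names here are illustrative, not the actual HBase code):

import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Illustrative only: the master keeps the set of servers it is already
// recovering and refuses reports from them, which the regionserver sees
// as YouAreDeadException.
class DeadServerCheck {
  private final Set<String> deadServers = new ConcurrentSkipListSet<String>();

  // Step 2: the server's ZK ephemeral node went away; mark it dead and
  // kick off log splitting / region reassignment.
  void expireServer(String serverName) {
    deadServers.add(serverName);
  }

  // Step 4: a heartbeat/report from a server already being recovered is
  // rejected rather than processed.
  void regionServerReport(String serverName) throws YouAreDeadException {
    if (deadServers.contains(serverName)) {
      throw new YouAreDeadException("Server REPORT rejected; currently processing "
          + serverName + " as dead server");
    }
    // ... otherwise record the load report as usual ...
  }

  static class YouAreDeadException extends Exception {
    YouAreDeadException(String msg) { super(msg); }
  }
}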
>>>> On Fri, Jan 14, 2011 at 7:49 PM, Ted Yu <[email protected]> wrote:
>>>>> Thanks for your analysis, Ryan.
>>>>> The dev cluster has half as many nodes as our staging cluster. Each node
>>>>> has half the number of cores as the nodes in staging.
>>>>>
>>>>> I agree with your conclusion.
>>>>>
>>>>> I will report back after I collect more data - the flow uses hbase
>>>>> heavily toward the end.
>>>>>
>>>>> On Fri, Jan 14, 2011 at 6:20 PM, Ryan Rawson <[email protected]> wrote:
>>>>>> I'm not seeing much in the way of errors; the timeouts are all to one
>>>>>> machine ending with .80, so that is probably your failed node.
>>>>>>
>>>>>> Other than that, the log doesn't seem to say too much. Searching for
>>>>>> strings like FATAL and Exception is the way to go here.
>>>>>>
>>>>>> Also things like this:
>>>>>> 2011-01-14 23:38:52,936 INFO
>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Region has been
>>>>>> PENDING_OPEN for too long, reassigning region=
>>>>>> NIGHTLYDEVGRIDSGRIDSQL-THREEGPPSPEECHCALLS-1294897314309,@v[h\xE2%\x83\xD4\xAC@v[h\xE2%\x83\xD4\xAC@v[h\xE2%\x83\xD4\xAC@v[h\xDC,1294897316027.2c40637c6c648a67162cc38d8c6d8ee9.
>>>>>>
>>>>>> Guessing, I'd probably say your nodes hit some performance wall, with
>>>>>> io-wait, or networking, or something, and regionserver processes
>>>>>> stopped responding but did not time out from zookeeper yet... so you
>>>>>> would run into a situation where some nodes are unresponsive, so any
>>>>>> data hosted there would be difficult to talk to. Until the regionserver
>>>>>> times out its zookeeper node, the master doesn't know about the fault
>>>>>> of the regionserver.
>>>>>>
>>>>>> The master web UI is probably inaccessible because the META table is
>>>>>> on a regionserver that went AWOL. You should check your load and your
>>>>>> ganglia graphs. Also remember, despite having lots of disks, each node
>>>>>> has gigabit ethernet, which means about 110-120 MB/sec. It's quite
>>>>>> possible you are running into network limitations: remember that
>>>>>> regionservers must write to 2 additional datanodes, and there will be
>>>>>> overlap, so you have to share some of that 110-120 MB/sec per-node
>>>>>> figure with other nodes, not to mention that you also need to factor in
>>>>>> inbound bandwidth (from client -> hbase regionserver) and outbound
>>>>>> bandwidth (from datanode replica 1 -> dn replica 2).
>>>>>>
>>>>>> -ryan
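A very rough back-of-the-envelope for that bandwidth point (assuming a replication factor of 3 and pipeline traffic spread evenly across nodes; the real accounting varies with locality and compaction traffic):

public class WriteBandwidthEstimate {
  public static void main(String[] args) {
    double nicMBps = 115.0;  // practical throughput of 1 GbE
    int replication = 3;     // HDFS replication factor

    // Every MB of client writes reaching a regionserver also generates
    // roughly (replication - 1) MB of pipeline traffic that some NIC in the
    // cluster has to carry, so sustained client ingest per node is bounded
    // by roughly nic / replication before the network saturates.
    double perNodeIngestMBps = nicMBps / replication;
    System.out.printf("~%.0f MB/sec of client writes per node%n", perNodeIngestMBps);
  }
}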
>>>>>> On Fri, Jan 14, 2011 at 3:57 PM, Ted Yu <[email protected]> wrote:
>>>>>>> Now I cannot access the master web UI. This happened after I doubled
>>>>>>> the amount of data processed in our flow.
>>>>>>>
>>>>>>> I am attaching the master log.
>>>>>>>
>>>>>>> On Fri, Jan 14, 2011 at 3:10 PM, Ryan Rawson <[email protected]> wrote:
>>>>>>>> This is the cause:
>>>>>>>>
>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
>>>>>>>> serverName=sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378,
>>>>>>>> load=(requests=0, regions=6, usedHeap=514, maxHeap=3983):
>>>>>>>> regionserver:60020-0x12d7b7b1c760004 regionserver:60020-0x12d7b7b1c760004
>>>>>>>> received expired from ZooKeeper, aborting
>>>>>>>> org.apache.zookeeper.KeeperException$SessionExpiredException:
>>>>>>>>
>>>>>>>> Why did the session expire? Typically it's GC; what do your GC logs
>>>>>>>> say? Otherwise, network issues perhaps? Swapping? Other
>>>>>>>> machine-related system problems?
>>>>>>>>
>>>>>>>> -ryan
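One rough way to check the GC theory from inside a JVM is plain java.lang.management polling (this is generic JDK API, not an HBase facility; adapt it as needed, or simply enable -verbose:gc and -XX:+PrintGCDetails with a GC log file via HBASE_OPTS in hbase-env.sh and correlate pause times with the session expiry timestamp):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
  public static void main(String[] args) throws InterruptedException {
    // Print per-collector counts and accumulated collection time every 10s.
    // A jump of tens of seconds around the expiry would point at a
    // stop-the-world pause longer than zookeeper.session.timeout.
    while (true) {
      for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
        System.out.printf("%s: count=%d time=%dms%n",
            gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
      }
      Thread.sleep(10000);
    }
  }
}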
>>>>>>>> On Fri, Jan 14, 2011 at 3:04 PM, Ted Yu <[email protected]> wrote:
>>>>>>>>> I ran 0.90 RC3 in dev cluster.
>>>>>>>>>
>>>>>>>>> I saw the following in region server log:
>>>>>>>>>
>>>>>>>>> Caused by: org.apache.hadoop.ipc.RemoteException:
>>>>>>>>> org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
>>>>>>>>> currently processing sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378
>>>>>>>>> as dead server
>>>>>>>>>   at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:197)
>>>>>>>>>   at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:247)
>>>>>>>>>   at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:648)
>>>>>>>>>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>>>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1036)
>>>>>>>>>
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:753)
>>>>>>>>>   at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>>>>>>>>   at $Proxy0.regionServerReport(Unknown Source)
>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:702)
>>>>>>>>>   ... 2 more
>>>>>>>>> 2011-01-13 03:55:08,982 INFO org.apache.zookeeper.ZooKeeper: Initiating
>>>>>>>>> client connection, connectString=sjc1-hadoop0.sjc1.carrieriq.com:2181
>>>>>>>>> sessionTimeout=90000 watcher=hconnection
>>>>>>>>> 2011-01-13 03:55:08,914 FATAL
>>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
>>>>>>>>> serverName=sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378,
>>>>>>>>> load=(requests=0, regions=6, usedHeap=514, maxHeap=3983):
>>>>>>>>> regionserver:60020-0x12d7b7b1c760004 regionserver:60020-0x12d7b7b1c760004
>>>>>>>>> received expired from ZooKeeper, aborting
>>>>>>>>> org.apache.zookeeper.KeeperException$SessionExpiredException:
>>>>>>>>> KeeperErrorCode = Session expired
>>>>>>>>>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:328)
>>>>>>>>>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:246)
>>>>>>>>>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>>>>>>>>>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
>>>>>>>>>
>>>>>>>>> ---------------
>>>>>>>>>
>>>>>>>>> And the following from master log:
>>>>>>>>>
>>>>>>>>> 2011-01-13 03:52:42,003 INFO
>>>>>>>>> org.apache.hadoop.hbase.zookeeper.RegionServerTracker: RegionServer
>>>>>>>>> ephemeral node deleted, processing expiration
>>>>>>>>> [sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378]
>>>>>>>>> 2011-01-13 03:52:42,005 DEBUG org.apache.hadoop.hbase.master.ServerManager:
>>>>>>>>> Added=sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378 to dead servers,
>>>>>>>>> submitted shutdown handler to be executed, root=false, meta=false
>>>>>>>>> 2011-01-13 03:52:42,005 INFO
>>>>>>>>> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs
>>>>>>>>> for sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378
>>>>>>>>> 2011-01-13 03:52:42,092 INFO
>>>>>>>>> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting 1 hlog(s)
>>>>>>>>> in hdfs://sjc1-hadoop0.sjc1.carrieriq.com:9000/hbase/.logs/sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378
>>>>>>>>> 2011-01-13 03:52:42,093 DEBUG
>>>>>>>>> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread
>>>>>>>>> Thread[WriterThread-0,5,main]: starting
>>>>>>>>> 2011-01-13 03:52:42,094 DEBUG
>>>>>>>>> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Writer thread
>>>>>>>>> Thread[WriterThread-1,5,main]: starting
>>>>>>>>> 2011-01-13 03:52:42,096 DEBUG
>>>>>>>>> org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 1 of
>>>>>>>>> 1: hdfs://sjc1-hadoop0.sjc1.carrieriq.com:9000/hbase/.logs/sjc1-hadoop1.sjc1.carrieriq.com,60020,1294856823378/sjc1-hadoop1.sjc1.carrieriq.com%3A60020.1294860449407,
>>>>>>>>> length=0
>>>>>>>>>
>>>>>>>>> Please advise what could be the cause.
>>>>>>>>>
>>>>>>>>> Thanks
