Hello,
I have an HBase 0.90.3 fleet of more than 100 hosts distributed
across 3 datacenters. Yesterday we had network issues lasting several
minutes, and the region servers in one datacenter lost connectivity
to the namenode. They started the shutdown sequence, but about 20
hosts were unable to complete it. This is a problem for us because we
have to restart them manually.
I checked the logs and it looks like the embedded Jetty does not stop
all of its threads. Here is a sample from the logs:
[FATAL] (IPC Server handler 5 on 60020) org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=pa-hbase-datanode-na-7031,60020,1316498497389, load=(requests=19902, regions=32, usedHeap=582, maxHeap=1820): File System not available { java.io.IOException: File system is not available …..
[INFO] (IPC Server handler 5 on 60020) org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020: exiting
[WARN] (regionserver60020) org.mortbay.log: 1 threads could not be stopped
[INFO] (regionserver60020) org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
This is a sample from the thread dump taken on that host. There is one
Jetty thread in the RUNNABLE state:
Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.1-b02 mixed mode):
"1692369893@qtp-1688716382-1 - Acceptor0 [email protected]:60030" prio=10 tid=0x00002aaab064e000 nid=0x2841 runnable [0x0000000042cf8000]
   java.lang.Thread.State: RUNNABLE
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:724)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
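The header of the Acceptor thread above does not carry the "daemon"
flag, so it is a non-daemon thread, and a JVM does not exit on its own
while any non-daemon thread is still alive. As a rough way to check
for such threads from inside a process, something like the sketch
below could be used. It is plain JDK code for illustration only, not
part of HBase or Jetty, and the class name NonDaemonThreads is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name, not HBase or Jetty code).
public class NonDaemonThreads {

    /** Returns the names of all live non-daemon threads except the calling thread. */
    public static List<String> list() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon() && t != Thread.currentThread()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Any thread printed here would keep the JVM running after main() returns.
        for (String name : list()) {
            System.out.println("non-daemon thread still alive: " + name);
        }
    }
}
```

Called at the end of a shutdown sequence, a non-empty result would
point at the threads that are holding the process open.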
Is my assumption correct that the embedded Jetty is what prevents
these region servers from completing the shutdown? Should I open a JIRA?
Regards,
Bogdan