[
https://issues.apache.org/jira/browse/HBASE-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12589758#action_12589758
]
Jim Kellerman commented on HBASE-10:
------------------------------------
Proposal:
The only time any of the flusher, compactor or log roller threads needs to be
interrupted is if it is dormant waiting for something to do. Interrupting one
of these threads at any other time will probably produce undesirable results
like the DFSClient issue. If the thread is not dormant, it will finish what it
is doing, go to the top of the loop and exit because the stop flag is set.
So instead of using synchronized, use a ReentrantLock. The interruptor will use
tryLock and if it fails, it knows the thread is working and does not need to be
interrupted because it is doing something and when it finishes it checks the
stop flag and exits.
If the tryLock succeeds, it interrupts the thread to wake it from its sleep.
Nothing else needs to be done because the join at the end of HRegionServer.run
will take care of coordinating the region server exit.
> HRegionServer hangs upon exit due to DFSClient Exception
> --------------------------------------------------------
>
> Key: HBASE-10
> URL: https://issues.apache.org/jira/browse/HBASE-10
> Project: Hadoop HBase
> Issue Type: Bug
> Components: regionserver
> Environment: CentOS 5
> Reporter: Chris Kline
> Assignee: Jim Kellerman
> Priority: Minor
> Fix For: 0.2.0
>
>
> Several HRegionServers hang around indefinitely well after the HMaster has
> exited. This was triggered executing $HBASE_HOME/bin/stop-hbase.sh. The
> HMaster exists fine, but here is what happens on one of the HRegionServers:
> 2008-01-02 18:54:01,907 INFO org.apache.hadoop.hbase.HRegionServer: Got
> regionserver stop message
> 2008-01-02 18:54:01,907 INFO org.apache.hadoop.hbase.Leases:
> regionserver/0.0.0.0:60020 closing leases
> 2008-01-02 18:54:01,907 INFO org.apache.hadoop.hbase.Leases$LeaseMonitor:
> regionserver/0.0.0.0:60020.leaseChecker exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.hbase.Leases:
> regionserver/0.0.0.0:60020 closed leases
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: Stopping server on
> 60020
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 2 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 0 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 7 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 3 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 5 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 9 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 6 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 4 on 60020: exiting
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 1 on 60020: exiting
> 2008-01-02 18:54:01,909 INFO org.apache.hadoop.ipc.Server: Stopping IPC
> Server listener on 60020
> 2008-01-02 18:54:01,908 INFO org.apache.hadoop.ipc.Server: IPC Server handler
> 8 on 60020: exiting
> 2008-01-02 18:54:01,909 INFO org.apache.hadoop.hbase.HRegionServer: Stopping
> infoServer
> 2008-01-02 18:54:01,909 DEBUG org.mortbay.util.Container: Stopping [EMAIL
> PROTECTED]
> 2008-01-02 18:54:01,909 DEBUG org.mortbay.util.ThreadedServer: closing
> ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
> 2008-01-02 18:54:01,909 DEBUG org.mortbay.util.ThreadedServer: IGNORED
> java.net.SocketException: Socket closed
> at java.net.PlainSocketImpl.socketAccept(Native Method)
> at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
> at java.net.ServerSocket.implAccept(ServerSocket.java:453)
> at java.net.ServerSocket.accept(ServerSocket.java:421)
> at
> org.mortbay.util.ThreadedServer.acceptSocket(ThreadedServer.java:432)
> at
> org.mortbay.util.ThreadedServer$Acceptor.run(ThreadedServer.java:631)
> 2008-01-02 18:54:01,910 INFO org.mortbay.util.ThreadedServer: Stopping
> Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
> 2008-01-02 18:54:01,910 DEBUG org.mortbay.util.ThreadedServer: Self connect
> to close listener /127.0.0.1:60030
> 2008-01-02 18:54:01,911 DEBUG org.mortbay.util.ThreadedServer: problem
> stopping acceptor /127.0.0.1:
> 2008-01-02 18:54:01,911 DEBUG org.mortbay.util.ThreadedServer: problem
> stopping acceptor /127.0.0.1:
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> at java.net.Socket.connect(Socket.java:519)
> at java.net.Socket.connect(Socket.java:469)
> at java.net.Socket.<init>(Socket.java:366)
> at java.net.Socket.<init>(Socket.java:209)
> at
> org.mortbay.util.ThreadedServer$Acceptor.forceStop(ThreadedServer.java:682)
> at org.mortbay.util.ThreadedServer.stop(ThreadedServer.java:557)
> at org.mortbay.http.SocketListener.stop(SocketListener.java:211)
> at org.mortbay.http.HttpServer.doStop(HttpServer.java:781)
> at org.mortbay.util.Container.stop(Container.java:154)
> at org.apache.hadoop.hbase.util.InfoServer.stop(InfoServer.java:237)
> at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:835)
> at java.lang.Thread.run(Thread.java:619)
> 2008-01-02 18:54:01,911 INFO org.mortbay.http.SocketListener: Stopped
> SocketListener on 0.0.0.0:60030
> 2008-01-02 18:54:01,912 DEBUG org.mortbay.util.Container: Stopping
> HttpContext[/static,/static]
> 2008-01-02 18:54:01,912 DEBUG org.mortbay.http.handler.AbstractHttpHandler:
> Stopped org.mortbay.http.handler.ResourceHandler in
> HttpContext[/static,/static]
> 2008-01-02 18:54:02,039 INFO org.mortbay.util.Container: Stopped
> HttpContext[/static,/static]
> 2008-01-02 18:54:02,039 DEBUG org.mortbay.util.Container: Stopping
> HttpContext[/logs,/logs]
> 2008-01-02 18:54:02,039 DEBUG org.mortbay.http.handler.AbstractHttpHandler:
> Stopped org.mortbay.http.handler.ResourceHandler in HttpContext[/logs,/logs]
> 2008-01-02 18:54:02,154 INFO org.mortbay.util.Container: Stopped
> HttpContext[/logs,/logs]
> 2008-01-02 18:54:02,154 DEBUG org.mortbay.util.Container: Stopping
> WebApplicationContext[/,/]
> 2008-01-02 18:54:02,154 DEBUG org.mortbay.util.Container: Stopping [EMAIL
> PROTECTED]
> 2008-01-02 18:54:02,155 INFO org.mortbay.util.Container: Stopped [EMAIL
> PROTECTED]
> 2008-01-02 18:54:02,277 DEBUG
> org.mortbay.jetty.servlet.AbstractSessionManager: Session scavenger exited
> 2008-01-02 18:54:02,278 DEBUG org.mortbay.util.Container: remove component:
> [EMAIL PROTECTED]
> 2008-01-02 18:54:02,278 INFO org.mortbay.util.Container: Stopped
> WebApplicationContext[/,/]
> 2008-01-02 18:54:02,278 INFO org.mortbay.util.Container: Stopped [EMAIL
> PROTECTED]
> 2008-01-02 18:54:02,278 DEBUG org.apache.hadoop.hbase.HRegionServer: closing
> region spider_pages,10_131455761,1198140179439
> 2008-01-02 18:54:02,278 INFO org.apache.hadoop.hbase.HRegionServer:
> regionserver/0.0.0.0:60020.cacheFlusher exiting
> 2008-01-02 18:54:02,278 INFO org.apache.hadoop.hbase.HRegionServer:
> regionserver/0.0.0.0:60020.compactor exiting
> 2008-01-02 18:54:02,278 INFO org.apache.hadoop.hbase.HRegionServer:
> regionserver/0.0.0.0:60020.splitter exiting
> 2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HStore: closed
> spider_pages,10_131455761,1198140179439/search (1501227429/search)
> 2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HStore: closed
> spider_pages,10_131455761,1198140179439/profile (1501227429/profile)
> 2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HStore: closed
> spider_pages,10_131455761,1198140179439/meta (1501227429/meta)
> 2008-01-02 18:54:02,279 INFO org.apache.hadoop.hbase.HRegion: closed
> spider_pages,10_131455761,1198140179439
> 2008-01-02 18:54:02,279 DEBUG org.apache.hadoop.hbase.HRegionServer: closing
> region spider_pages,10_486594261,1198319654267
> 2008-01-02 18:54:02,280 DEBUG org.apache.hadoop.hbase.HStore: closed
> spider_pages,10_486594261,1198319654267/search (364081590/search)
> 2008-01-02 18:54:02,280 DEBUG org.apache.hadoop.hbase.HStore: closed
> spider_pages,10_486594261,1198319654267/profile (364081590/profile)
> 2008-01-02 18:54:02,280 DEBUG org.apache.hadoop.hbase.HStore: closed
> spider_pages,10_486594261,1198319654267/meta (364081590/meta)
> 2008-01-02 18:54:02,280 INFO org.apache.hadoop.hbase.HRegion: closed
> spider_pages,10_486594261,1198319654267
> ...
> ... this closing of regions goes on for a while
> ...
> ... the following continues until a kill -9
> ...
> 2008-01-02 20:39:20,552 INFO org.apache.hadoop.fs.DFSClient: Could not obtain
> block blk_5124700261538503923 from any node: java.io.IOException: No live
> nodes contain current block
> 2008-01-02 20:40:23,556 INFO org.apache.hadoop.fs.DFSClient: Could not obtain
> block blk_5124700261538503923 from any node: java.io.IOException: No live
> nodes contain current block
> 2008-01-02 20:41:26,560 INFO org.apache.hadoop.fs.DFSClient: Could not obtain
> block blk_5124700261538503923 from any node: java.io.IOException: No live
> nodes contain current block
> 2008-01-02 20:42:29,566 INFO org.apache.hadoop.fs.DFSClient: Could not obtain
> block blk_5124700261538503923 from any node: java.io.IOException: No live
> nodes contain current block
> 2008-01-02 20:43:32,571 INFO org.apache.hadoop.fs.DFSClient: Could not obtain
> block blk_5124700261538503923 from any node: java.io.IOException: No live
> nodes contain current block
> ...
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.