Sorry, but it looks like the attachment did not get through. Here is the full
thread dump, just in case:
2008-08-27 16:12:36,822 INFO org.apache.hadoop.mapred.StatusHttpServer: Process Thread Dump: jsp requested
36 active threads
Thread 247379 (Thread-247366):
State: TIMED_WAITING
Blocked count: 0
Waited count: 552
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1824)
Thread 51 (IPC Server handler 15 on 60020):
State: TIMED_WAITING
Blocked count: 1856
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 50 (IPC Server handler 14 on 60020):
State: TIMED_WAITING
Blocked count: 1772
Waited count: 11460
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 49 (IPC Server handler 13 on 60020):
State: TIMED_WAITING
Blocked count: 1780
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 48 (IPC Server handler 12 on 60020):
State: TIMED_WAITING
Blocked count: 1926
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 47 (IPC Server handler 11 on 60020):
State: TIMED_WAITING
Blocked count: 2153
Waited count: 11463
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 46 (IPC Server handler 10 on 60020):
State: TIMED_WAITING
Blocked count: 1707
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 45 (IPC Server handler 9 on 60020):
State: TIMED_WAITING
Blocked count: 2011
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 44 (IPC Server handler 8 on 60020):
State: TIMED_WAITING
Blocked count: 1952
Waited count: 11463
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 43 (IPC Server handler 7 on 60020):
State: TIMED_WAITING
Blocked count: 1714
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 42 (IPC Server handler 6 on 60020):
State: TIMED_WAITING
Blocked count: 2073
Waited count: 11462
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 41 (IPC Server handler 5 on 60020):
State: TIMED_WAITING
Blocked count: 1802
Waited count: 11463
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 40 (IPC Server handler 4 on 60020):
State: TIMED_WAITING
Blocked count: 1691
Waited count: 11462
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 39 (IPC Server handler 3 on 60020):
State: TIMED_WAITING
Blocked count: 1878
Waited count: 11464
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 38 (IPC Server handler 2 on 60020):
State: TIMED_WAITING
Blocked count: 2057
Waited count: 11462
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 37 (IPC Server handler 1 on 60020):
State: TIMED_WAITING
Blocked count: 1929
Waited count: 11463
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 36 (IPC Server handler 0 on 60020):
State: TIMED_WAITING
Blocked count: 1781
Waited count: 11462
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 12 (IPC Server listener on 60020):
State: RUNNABLE
Blocked count: 0
Waited count: 0
Stack:
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
org.apache.hadoop.ipc.Server$Listener.run(Server.java:299)
Thread 14 (IPC Server Responder):
State: RUNNABLE
Blocked count: 0
Waited count: 0
Stack:
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
org.apache.hadoop.ipc.Server$Responder.run(Server.java:445)
Thread 35 (SocketListener0-1):
State: RUNNABLE
Blocked count: 1
Waited count: 42174
Stack:
sun.management.ThreadImpl.getThreadInfo0(Native Method)
sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:147)
sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:123)
org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:114)
org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:168)
org.apache.hadoop.mapred.StatusHttpServer$StackServlet.doGet(StatusHttpServer.java:259)
javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
org.mortbay.http.HttpServer.service(HttpServer.java:954)
org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
Thread 34 (SocketListener0-0):
State: RUNNABLE
Blocked count: 2
Waited count: 42174
Stack:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.read(SocketInputStream.java:129)
org.mortbay.util.LineInput.fill(LineInput.java:469)
org.mortbay.util.LineInput.fillLine(LineInput.java:547)
org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293)
org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277)
org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238)
org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861)
org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907)
org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
Thread 33 (Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]):
State: RUNNABLE
Blocked count: 0
Waited count: 0
Stack:
java.net.PlainSocketImpl.socketAccept(Native Method)
java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
java.net.ServerSocket.implAccept(ServerSocket.java:453)
java.net.ServerSocket.accept(ServerSocket.java:421)
org.mortbay.util.ThreadedServer.acceptSocket(ThreadedServer.java:432)
org.mortbay.util.ThreadedServer$Acceptor.run(ThreadedServer.java:631)
Thread 32 (SessionScavenger):
State: TIMED_WAITING
Blocked count: 0
Waited count: 14061
Stack:
java.lang.Thread.sleep(Native Method)
org.mortbay.jetty.servlet.AbstractSessionManager$SessionScavenger.run(AbstractSessionManager.java:587)
Thread 15 (regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker):
State: TIMED_WAITING
Blocked count: 0
Waited count: 42176
Stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
java.util.concurrent.DelayQueue.poll(DelayQueue.java:201)
org.apache.hadoop.hbase.Leases.run(Leases.java:75)
Thread 11 (regionserver/0:0:0:0:0:0:0:0:60020.worker):
State: TIMED_WAITING
Blocked count: 0
Waited count: 44769
Stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:805)
java.lang.Thread.run(Thread.java:619)
Thread 9 (regionserver/0:0:0:0:0:0:0:0:60020.compactor):
State: TIMED_WAITING
Blocked count: 217143
Waited count: 393898
Stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:76)
Thread 8 (regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher):
State: TIMED_WAITING
Blocked count: 9627
Waited count: 228686
Stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
org.apache.hadoop.hbase.regionserver.Flusher.run(Flusher.java:87)
Thread 10 (regionserver/0:0:0:0:0:0:0:0:60020.logRoller):
State: TIMED_WAITING
Blocked count: 0
Waited count: 43612
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:63)
Thread 30 ([EMAIL PROTECTED]):
State: TIMED_WAITING
Blocked count: 0
Waited count: 435045
Stack:
java.lang.Thread.sleep(Native Method)
org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:763)
java.lang.Thread.run(Thread.java:619)
Thread 28 (org.apache.hadoop.io.ObjectWritable Connection Culler):
State: TIMED_WAITING
Blocked count: 11
Waited count: 421031
Stack:
java.lang.Thread.sleep(Native Method)
org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:435)
Thread 19 (org.apache.hadoop.io.ObjectWritable Connection Culler):
State: TIMED_WAITING
Blocked count: 148
Waited count: 421026
Stack:
java.lang.Thread.sleep(Native Method)
org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:435)
Thread 18 (DestroyJavaVM):
State: RUNNABLE
Blocked count: 0
Waited count: 0
Stack:
Thread 17 (regionserver/0:0:0:0:0:0:0:0:60020):
State: TIMED_WAITING
Blocked count: 5
Waited count: 280887
Stack:
java.lang.Thread.sleep(Native Method)
org.apache.hadoop.hbase.util.Sleeper.sleep(Sleeper.java:72)
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:418)
java.lang.Thread.run(Thread.java:619)
Thread 4 (Signal Dispatcher):
State: RUNNABLE
Blocked count: 0
Waited count: 0
Stack:
Thread 3 (Finalizer):
State: WAITING
Blocked count: 516
Waited count: 3543
Waiting on [EMAIL PROTECTED]
Stack:
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
Thread 2 (Reference Handler):
State: WAITING
Blocked count: 220
Waited count: 3530
Waiting on [EMAIL PROTECTED]
Stack:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
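To check a dump like this without reading every entry, a small script can tally threads by state. The following is only a sketch: it assumes the "Thread N (name):" / "State: X" layout seen above, and the function name `tally_states` and the digit-collapsing grouping are made up for illustration. The sample text is a short excerpt of the dump, not the whole thing.

```python
import re
from collections import Counter

# Header lines look like: "Thread 51 (IPC Server handler 15 on 60020):"
THREAD_RE = re.compile(r"^Thread (\d+) \((.*)\):$")
STATE_RE = re.compile(r"^State: (\w+)$")


def tally_states(dump_text):
    """Count threads per (group, state).

    The group is the thread name with every run of digits replaced by "N",
    so that "IPC Server handler 15 on 60020" and "... handler 14 ..." land
    in the same bucket. This grouping is a convenience, not a JVM concept.
    """
    counts = Counter()
    group = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        m = THREAD_RE.match(line)
        if m:
            group = re.sub(r"\d+", "N", m.group(2))
            continue
        m = STATE_RE.match(line)
        if m and group is not None:
            counts[(group, m.group(1))] += 1
            group = None  # only the first State line after a header counts
    return counts


# Short excerpt copied from the dump above.
sample = """\
Thread 51 (IPC Server handler 15 on 60020):
State: TIMED_WAITING
Blocked count: 1856
Waited count: 11461
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 50 (IPC Server handler 14 on 60020):
State: TIMED_WAITING
Blocked count: 1772
Waited count: 11460
Stack:
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:866)
Thread 12 (IPC Server listener on 60020):
State: RUNNABLE
Blocked count: 0
Waited count: 0
Stack:
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
"""

for (group, state), n in sorted(tally_states(sample).items()):
    print(group, state, n)
```

Run against the full dump, a tally like this makes it easy to see that the IPC handlers are reported as TIMED_WAITING in Object.wait rather than BLOCKED, and that no thread is listed as BLOCKED at the moment of the dump. That alone does not rule out a hang elsewhere.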
--- On Sun, 8/31/08, Andrew Purtell <[EMAIL PROTECTED]> wrote:
From: Andrew Purtell <[EMAIL PROTECTED]>
Subject: Re: Cut a 0.2.1 release candidate?
To: [email protected]
Date: Sunday, August 31, 2008, 12:43 PM
The log...
--- On Sun, 8/31/08, Andrew Purtell <[EMAIL PROTECTED]> wrote:
From: Andrew Purtell <[EMAIL PROTECTED]>
Subject: Re: Cut a 0.2.1 release candidate?
To: [email protected]
Date: Sunday, August 31, 2008, 12:41 PM
+1, especially the locking evaluation and de-entanglement
done by Jim and J-D as part of 810.
We might be seeing regionserver deadlocks with 0.2.0. See
attached partial log, including stack trace requested from
the UI, from a regionserver that reports to the master but
does not handle requests from clients, hanging them. There
is no hint that anything is amiss in the log. All of the
IPC handlers are blocked. Also, I'm not sure what to make of
the high counts on CompactSplitThread.compactionQueue:
Thread 9 (regionserver/0:0:0:0:0:0:0:0:60020.compactor):
State: TIMED_WAITING
Blocked count: 217143
Waited count: 393898
Stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:76)
However, I do not think I have enough solid information to file a
JIRA and investigate/fix this yet.
I'm hoping we just won't see this with 0.2.1. :-)
- Andy
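
On the high counts: assuming the "Blocked count" and "Waited count" fields come from java.lang.management.ThreadInfo (getBlockedCount() and getWaitedCount()), as the ReflectionUtils.printThreadInfo frames in the dump suggest, they are cumulative totals over the life of the thread, not a snapshot of current contention. Ranking threads by blocked count at least shows which ones have contended for monitors most often. The sketch below does that for the dump format used in this thread; `parse_counts` and `top_blocked` are hypothetical helper names, and the sample is a few entries copied from the dump.

```python
import re

HEADER_RE = re.compile(r"^Thread (\d+) \((.*)\):$")
COUNT_RE = re.compile(r"^(Blocked|Waited) count: (\d+)$")


def parse_counts(dump_text):
    """Return one dict per thread: {"name", "blocked", "waited"}."""
    threads = []
    for raw in dump_text.splitlines():
        line = raw.strip()
        m = HEADER_RE.match(line)
        if m:
            threads.append({"name": m.group(2), "blocked": 0, "waited": 0})
            continue
        m = COUNT_RE.match(line)
        if m and threads:
            # "Blocked" -> "blocked", "Waited" -> "waited"
            threads[-1][m.group(1).lower()] = int(m.group(2))
    return threads


def top_blocked(dump_text, n=3):
    """Return the n threads with the largest cumulative blocked count."""
    ranked = sorted(parse_counts(dump_text),
                    key=lambda t: t["blocked"], reverse=True)
    return ranked[:n]


# A few entries copied from the dump (stack frames omitted).
sample = """\
Thread 47 (IPC Server handler 11 on 60020):
State: TIMED_WAITING
Blocked count: 2153
Waited count: 11463
Thread 9 (regionserver/0:0:0:0:0:0:0:0:60020.compactor):
State: TIMED_WAITING
Blocked count: 217143
Waited count: 393898
Thread 8 (regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher):
State: TIMED_WAITING
Blocked count: 9627
Waited count: 228686
Thread 10 (regionserver/0:0:0:0:0:0:0:0:60020.logRoller):
State: TIMED_WAITING
Blocked count: 0
Waited count: 43612
"""

for t in top_blocked(sample, n=2):
    print(t["name"], t["blocked"], t["waited"])
```

Comparing two dumps taken some minutes apart would say more than a single ranking: if the counts on a thread keep climbing, it is still making progress, while counts that stay frozen would be more in line with a thread that is stuck.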