Line 58 and line 79 are the threads that I found suspicious. http://pastebin.com/W1E2nCZq
The other stack traces from the other two region servers look identical to this one. BTW - I have made the config changes per Ryan Rawson's suggestion (thanks!) and I've processed ~7 GB of the 15 GB without a hangup thus far, so I'm crossing my fingers.

-Luke

On 7/16/10 11:48 AM, "Stack" <[email protected]> wrote:

Would you mind pastebinning the stacktrace? It doesn't look like https://issues.apache.org/jira/browse/HDFS-88 (HBASE-667) going by the below, an issue that HADOOP-5859 purportedly fixes -- I see you commented on it -- but our Todd thinks otherwise (he has a 'real' fix up in another issue that I currently can't put my finger on).
St.Ack

On Fri, Jul 16, 2010 at 7:19 AM, Luke Forehand <[email protected]> wrote:
>
> I grepped yesterday's logs on all servers for "Blocking updates" and there
> was no trace. I believe I had encountered the blocking updates problem
> earlier in the project but throttled down the import speed which seemed to
> fix that.
>
> I just double checked and all three region servers were idle. Something
> interesting that I noticed however, was that each regionserver had a
> particular ResponseProcessor thread running for a specific block, and that
> thread was stuck in a running state during the entirety of the hang. Also a
> DataStreamer thread for the block associated with the ResponseProcessor was
> in a wait state. This makes me think that each server was stuck operating on
> a specific block.
>
> "ResponseProcessor for block blk_1926230463847049982_2694658" - Thread t...@61160
>    java.lang.Thread.State: RUNNABLE
>    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
>    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>    - locked sun.nio.ch.uti...@196fbfd0
>    - locked java.util.collections$unmodifiable...@7799fdbb
>    - locked sun.nio.ch.epollselectori...@1ee13d55
>    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
>    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
>    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>    at java.io.DataInputStream.readFully(DataInputStream.java:178)
>    at java.io.DataInputStream.readLong(DataInputStream.java:399)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2399)
>
>    Locked ownable synchronizers:
>    - None
>
> "DataStreamer for file /hbase/.logs/dn01.colo.networkedinsights.com,60020,1279222293084/hlog.dat.1279228611023 block blk_1926230463847049982_2694658" - Thread t...@61158
>    java.lang.Thread.State: TIMED_WAITING on java.util.linkedl...@475b455c
>    at java.lang.Object.wait(Native Method)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2247)
>
>    Locked ownable synchronizers:
>    - None
>
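For anyone wanting to check the other region servers' dumps for the same pattern (a RUNNABLE ResponseProcessor paired with a waiting DataStreamer on the same block), here is a minimal sketch in Python. It is not part of HBase or Hadoop; the function names `threads_by_block` and `suspicious_blocks` are made up for illustration, and it assumes each thread entry starts with a double-quoted thread name followed by a `java.lang.Thread.State:` line, as in the dump above, with no other double-quoted strings in the file.

```python
import re
from collections import defaultdict

# A thread entry is assumed to start with its name in double quotes.
HEADER = re.compile(r'"(?P<name>[^"]+)"')
STATE = re.compile(r'java\.lang\.Thread\.State:\s+(?P<state>[A-Z_]+)')
# HDFS block ids look like blk_<id>_<generation stamp>; the id may be negative.
BLOCK = re.compile(r'blk_-?\d+_\d+')


def threads_by_block(dump):
    """Map each block id to {thread role: thread state} (hypothetical helper)."""
    result = defaultdict(dict)
    headers = list(HEADER.finditer(dump))
    for i, header in enumerate(headers):
        # The body of one entry runs until the next quoted thread name.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(dump)
        body = dump[header.end():end]
        # Collapse whitespace so names wrapped across lines still parse.
        name = " ".join(header.group("name").split())
        block = BLOCK.search(name)
        state = STATE.search(body)
        if not block or not state:
            continue  # not a per-block DFSClient thread, or no state line
        role = name.split()[0]  # e.g. "ResponseProcessor" or "DataStreamer"
        result[block.group(0)][role] = state.group("state")
    return dict(result)


def suspicious_blocks(dump):
    """Blocks whose ResponseProcessor is RUNNABLE while its DataStreamer waits."""
    return sorted(
        block
        for block, roles in threads_by_block(dump).items()
        if roles.get("ResponseProcessor") == "RUNNABLE"
        and roles.get("DataStreamer") in ("WAITING", "TIMED_WAITING")
    )
```

Calling `suspicious_blocks(open(path).read())` on each server's saved dump would list the block ids that match. Note that a RUNNABLE thread sitting in `epollWait` is normal while a write pipeline is healthy, so a match is only meaningful if the same block shows up across dumps taken for the whole duration of the hang.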
