[ 
https://issues.apache.org/jira/browse/HADOOP-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534464
 ] 

stack commented on HADOOP-2040:
-------------------------------

Hudson patch build #940 is hung in TestHBaseCluster.  Here is the 
end-of-test thread dump (printed just before junit reports that the test 
completed without error):
{code}
    [junit] Process Thread Dump: Temporary end-of-test thread dump debugging HADOOP-2040: testHBaseCluster
    [junit] 6 active threads
    [junit] Thread 53 (process reaper):
    [junit]   State: RUNNABLE
    [junit]   Blocked count: 0
    [junit]   Waited count: 0
    [junit]   Stack:
    [junit]     java.lang.UNIXProcess.waitForProcessExit(Native Method)
    [junit]     java.lang.UNIXProcess.access$900(UNIXProcess.java:17)
    [junit]     java.lang.UNIXProcess$2$1.run(UNIXProcess.java:86)
    [junit] Thread 30 (org.apache.hadoop.io.ObjectWritable Connection Culler):
    [junit]   State: TIMED_WAITING
    [junit]   Blocked count: 0
    [junit]   Waited count: 0
    [junit]   Stack:
    [junit]     java.lang.Thread.sleep(Native Method)
    [junit]     org.apache.hadoop.ipc.Client$ConnectionCuller.run(Client.java:404)
    [junit] Thread 4 (Signal Dispatcher):
    [junit]   State: RUNNABLE
    [junit]   Blocked count: 0
    [junit]   Waited count: 0
    [junit]   Stack:
    [junit] Thread 3 (Finalizer):
    [junit]   State: WAITING
    [junit]   Blocked count: 137
    [junit]   Waited count: 21
    [junit]   Waiting on [EMAIL PROTECTED]
    [junit]   Stack:
    [junit]     java.lang.Object.wait(Native Method)
    [junit]     java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
    [junit]     java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
    [junit]     java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
    [junit] Thread 2 (Reference Handler):
    [junit]   State: WAITING
    [junit]   Blocked count: 209
    [junit]   Waited count: 17
    [junit]   Waiting on [EMAIL PROTECTED]
    [junit]   Stack:
    [junit]     java.lang.Object.wait(Native Method)
    [junit]     java.lang.Object.wait(Object.java:474)
    [junit]     java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    [junit] Thread 1 (main):
    [junit]   State: RUNNABLE
    [junit]   Blocked count: 44
    [junit]   Waited count: 4095
    [junit]   Stack:
    [junit]     sun.management.ThreadImpl.getThreadInfo0(Native Method)
    [junit]     sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:144)
    [junit]     sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:120)
    [junit]     org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:114)
    [junit]     org.apache.hadoop.hbase.HBaseClusterTestCase.tearDown(HBaseClusterTestCase.java:94)
    [junit]     junit.framework.TestCase.runBare(TestCase.java:130)
    [junit]     junit.framework.TestResult$1.protect(TestResult.java:106)
    [junit]     junit.framework.TestResult.runProtected(TestResult.java:124)
    [junit]     junit.framework.TestResult.run(TestResult.java:109)
    [junit]     junit.framework.TestCase.run(TestCase.java:118)
    [junit]     junit.framework.TestSuite.runTest(TestSuite.java:208)
    [junit]     junit.framework.TestSuite.run(TestSuite.java:203)
    [junit]     org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
    [junit]     org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
    [junit]     org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 64.096 sec
{code}
Compared with the previous unit tests, the odd man out is the thread waiting 
on a UNIX process to exit (Thread 53, process reaper).  What's that from?
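
One way to see where such a thread comes from: on the Sun JDK for UNIX, launching a child process from Java starts a JDK-internal thread that waits for the child to exit, which matches the UNIXProcess.waitForProcessExit frames above. The sketch below is only an illustration under that assumption. The class name ProcessReaperSketch and the `sleep 2` command are made up for the example, the exact thread name varies between JDK versions, and the sketch says nothing about which Hadoop or HBase code launched the process in build #940.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Hadoop or HBase.
public class ProcessReaperSketch {

    /**
     * Starts a child process and returns the names of all live JVM threads
     * while the child is still running. On UNIX JDKs the list is expected to
     * include a JDK-internal thread that waits on the child (an assumption;
     * the name depends on the JDK version).
     */
    public static List<String> threadNamesWhileChildRuns(String... command)
            throws IOException {
        Process child = new ProcessBuilder(command).start();
        try {
            List<String> names = new ArrayList<>();
            for (Thread t : Thread.getAllStackTraces().keySet()) {
                names.add(t.getName());
            }
            return names;
        } finally {
            // Do not leave the child running after the snapshot is taken.
            child.destroy();
        }
    }

    public static void main(String[] args) throws IOException {
        // "sleep 2" is an assumed long-enough-lived command on a UNIX host.
        for (String name : threadNamesWhileChildRuns("sleep", "2")) {
            System.out.println(name);
        }
    }
}
```

If a reaper-style thread shows up only while a child is alive, the next step would be to find which test or cluster component shells out and never sees its child exit.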

I can't get a thread dump on hudson (date there: Sat Oct 13 04:17:30 GMT 
2007).  Killing the current test so the build can move on.
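
Since an external thread dump is not obtainable here, the in-process dump is the main source of information. The main thread's stack above shows it being produced by ReflectionUtils.printThreadInfo through sun.management.ThreadImpl, i.e. the java.lang.management API. A minimal sketch of the same idea follows, assuming only the standard ThreadMXBean interface. The class name ThreadDumpSketch is hypothetical and this is not the Hadoop implementation; the output layout only loosely follows the dump above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical illustration; not the code in ReflectionUtils.
public class ThreadDumpSketch {

    /** Returns a text dump of all live threads in this JVM. */
    public static String dump(String title) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.getAllThreadIds();
        StringBuilder sb = new StringBuilder();
        sb.append("Process Thread Dump: ").append(title).append('\n');
        sb.append(ids.length).append(" active threads\n");
        for (long id : ids) {
            // Integer.MAX_VALUE requests the full stack depth.
            ThreadInfo info = bean.getThreadInfo(id, Integer.MAX_VALUE);
            if (info == null) {
                continue; // the thread exited between the two calls
            }
            sb.append("Thread ").append(id)
              .append(" (").append(info.getThreadName()).append("):\n");
            sb.append("  State: ").append(info.getThreadState()).append('\n');
            sb.append("  Blocked count: ").append(info.getBlockedCount()).append('\n');
            sb.append("  Waited count: ").append(info.getWaitedCount()).append('\n');
            if (info.getLockName() != null) {
                sb.append("  Waiting on ").append(info.getLockName()).append('\n');
            }
            sb.append("  Stack:\n");
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump("end-of-test sketch"));
    }
}
```

Calling something like this from a watchdog thread on a timer, rather than only from tearDown, might capture the state of a test that hangs after tearDown has returned.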

> [hbase] TestHStoreFile/TestBloomFilter hang occasionally on hudson AFTER test 
> has finished
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2040
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2040
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Priority: Minor
>         Attachments: endoftesttd.patch
>
>
> Weird.  Last night TestBloomFilter was hung after junit had printed that the 
> test had completed without error.  Just now, I noticed a hung TestHStore, 
> again after junit had printed that the test had succeeded (Nigel Daley has 
> reported he's seen at least two hangs in TestHStoreFile, perhaps in the same 
> location).
> Last night and just now I was unable to get a thread dump.
> Here is the log from around this evening's hang:
> {code}
> ...
>     [junit] 2007-10-12 04:19:28,477 INFO  [main] org.apache.hadoop.hbase.TestHStoreFile.testOutOfRangeMidkeyHalfMapFile(TestHStoreFile.java:366): Last bottom when key > top: zz/zz/1192162768317
>     [junit] 2007-10-12 04:19:28,493 WARN  [IPC Server handler 0 on 36620] org.apache.hadoop.dfs.FSDirectory.unprotectedDelete(FSDirectory.java:400): DIR* FSDirectory.unprotectedDelete: failed to remove /testOutOfRangeMidkeyHalfMapFile because it does not exist
>     [junit] Shutting down the Mini HDFS Cluster
>     [junit] Shutting down DataNode 1
>     [junit] Shutting down DataNode 0
>     [junit] 2007-10-12 04:19:29,316 WARN  [EMAIL PROTECTED] org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186): PendingReplicationMonitor thread received exception. java.lang.InterruptedException: sleep interrupted
>     [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 16.274 sec
>     [junit] Running org.apache.hadoop.hbase.TestHTable
>     [junit] Starting DataNode 0 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
>     [junit] Starting DataNode 1 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
>     [junit] 2007-10-12 05:21:48,332 INFO  [main] org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:862): Root region dir: /hbase/hregion_-ROOT-,,0
> ...
> {code}
> Notice the hour of elapsed (hung) time in the log above: the timestamps jump 
> from 04:19:29 to 05:21:48.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
