[ 
https://issues.apache.org/jira/browse/HADOOP-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534225
 ] 

stack commented on HADOOP-2040:
-------------------------------

Looks like it hung again in the same build -- #931 -- but this time in a test 
that hasn't been prone to hanging, TestListTables.  Again I can't get a thread 
dump, but the log is interesting on the way out:

{code}
    [junit] Shutting down the Mini HDFS Cluster
    [junit] Shutting down DataNode 1
    [junit] 2007-10-12 05:23:16,082 WARN  [DataNode: 
[/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4]]
 org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:596): 
java.io.IOException: java.lang.InterruptedException
    [junit] Shutting down DataNode 0
    [junit]     at 
org.apache.hadoop.fs.ShellCommand.runCommand(ShellCommand.java:59)
    [junit]     at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
    [junit]     at org.apache.hadoop.fs.DU.getUsed(DU.java:52)
    [junit]     at 
org.apache.hadoop.dfs.FSDataset$FSVolume.getDfsUsed(FSDataset.java:299)
    [junit]     at 
org.apache.hadoop.dfs.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:396)
    [junit]     at 
org.apache.hadoop.dfs.FSDataset.getDfsUsed(FSDataset.java:495)
    [junit]     at 
org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:520)
    [junit]     at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1494)
    [junit]     at java.lang.Thread.run(Thread.java:595)

    [junit] 2007-10-12 05:23:16,349 WARN  [DataNode: 
[/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2]]
 org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:596): 
java.io.InterruptedIOException
    [junit]     at java.net.SocketOutputStream.socketWrite0(Native Method)
    [junit]     at 
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    [junit]     at 
java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    [junit]     at 
org.apache.hadoop.ipc.Client$Connection$2.write(Client.java:192)
    [junit]     at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    [junit]     at 
java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    [junit]     at java.io.DataOutputStream.flush(DataOutputStream.java:106)
    [junit]     at 
org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:327)
    [junit]     at org.apache.hadoop.ipc.Client.call(Client.java:474)
    [junit]     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
    [junit]     at org.apache.hadoop.dfs.$Proxy1.sendHeartbeat(Unknown Source)
    [junit]     at 
org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:520)
    [junit]     at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1494)
    [junit]     at java.lang.Thread.run(Thread.java:595)

    [junit] 2007-10-12 05:23:16,351 WARN  [EMAIL PROTECTED] 
org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186):
 PendingReplicationMonitor thread received exception. 
java.lang.InterruptedException: sleep interrupted
    [junit] 2007-10-12 05:23:16,610 INFO  [main] 
org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:424): 
Shutting down FileSystem
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 36.108 sec
{code}

It reports that the tests succeeded, but just beforehand it's reporting an 
interrupted flush.  I wonder if the interrupt broke the flush.  It would be 
interesting to know (for HADOOP-1924).
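
For reference, the first trace above ("java.io.IOException: 
java.lang.InterruptedException") is the shape you get when a thread is 
interrupted inside a blocking call and the InterruptedException is rethrown 
wrapped in an IOException.  Below is a minimal sketch of that pattern in 
isolation.  It is not the Hadoop code and does not reproduce the hang; 
InterruptDemo, blockingCall and interruptWorker are hypothetical names, and 
Thread.sleep stands in for whatever blocking call was in flight.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class InterruptDemo {

    // Hypothetical stand-in for a blocking operation (assumption: the real
    // code was blocked waiting on something when the shutdown interrupt hit).
    static void blockingCall() throws IOException {
        try {
            Thread.sleep(60_000);
        } catch (InterruptedException e) {
            // Catching InterruptedException clears the interrupt flag, so
            // restore it before surfacing the failure as an IOException.
            Thread.currentThread().interrupt();
            throw new IOException(e);
        }
    }

    // Starts a worker, interrupts it, and returns the exception it caught.
    static IOException interruptWorker() throws InterruptedException {
        AtomicReference<IOException> caught = new AtomicReference<>();
        CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                blockingCall();
            } catch (IOException e) {
                caught.set(e);
            }
        }, "worker");
        worker.start();
        started.await();     // make sure the worker is running
        worker.interrupt();  // what a shutdown path typically does
        worker.join();
        return caught.get();
    }

    public static void main(String[] args) throws Exception {
        // Prints an IOException whose cause is the InterruptedException.
        System.out.println(interruptWorker());
    }
}
```

Whether an interrupt arriving in the middle of a socket write leaves the 
stream usable is a separate question that this sketch does not answer.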

> [hbase] TestHStoreFile/TestBloomFilter hang occasionally on hudson AFTER test 
> has finished
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2040
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2040
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Priority: Minor
>
> Weird.  Last night TestBloomFilter was hung after junit had printed that the 
> test had completed without error.  Just now, I noticed a hung TestHStore -- 
> again after junit had printed out that the test had succeeded (Nigel Daley has 
> reported he's seen at least two hangs in TestHStoreFile, perhaps in the same 
> location).
> Last night and just now I was unable to get a thread dump.
> Here is the log from around this evening's hang:
> {code}
> ...
>     [junit] 2007-10-12 04:19:28,477 INFO  [main] 
> org.apache.hadoop.hbase.TestHStoreFile.testOutOfRangeMidkeyHalfMapFile(TestHStoreFile.java:366):
>  Last bottom when key > top: zz/zz/1192162768317
>     [junit] 2007-10-12 04:19:28,493 WARN  [IPC Server handler 0 on 36620] 
> org.apache.hadoop.dfs.FSDirectory.unprotectedDelete(FSDirectory.java:400): 
> DIR* FSDirectory.unprotectedDelete: failed to remove 
> /testOutOfRangeMidkeyHalfMapFile because it does not exist
>     [junit] Shutting down the Mini HDFS Cluster
>     [junit] Shutting down DataNode 1
>     [junit] Shutting down DataNode 0
>     [junit] 2007-10-12 04:19:29,316 WARN  [EMAIL PROTECTED] 
> org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186):
>  PendingReplicationMonitor thread received exception. 
> java.lang.InterruptedException: sleep interrupted
>     [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 16.274 sec
>     [junit] Running org.apache.hadoop.hbase.TestHTable
>     [junit] Starting DataNode 0 with dfs.data.dir: 
> /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
>     [junit] Starting DataNode 1 with dfs.data.dir: 
> /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
>     [junit] 2007-10-12 05:21:48,332 INFO  [main] 
> org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:862): Root region dir: 
> /hbase/hregion_-ROOT-,,0
> ...
> {code}
> Notice the hour of elapsed (hung) time in the above.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
