[ 
https://issues.apache.org/jira/browse/HADOOP-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541361
 ] 

stack commented on HADOOP-2040:
-------------------------------

Looking more at this hang from last night, fs was sick from near the get-go:
{code}
    [junit] Starting DataNode 0 with dfs.data.dir: 
/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
    [junit] Starting DataNode 1 with dfs.data.dir: 
/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
    [junit] 2007-11-09 08:15:10,894 INFO  [main] 
org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:895): Root region dir: 
/hbase/hregion_-70236052
    [junit] 2007-11-09 08:15:10,982 INFO  [main] 
org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:904): bootstrap: creating 
ROOT and first META regions
    [junit] 2007-11-09 08:15:11,255 WARN  [main] 
org.apache.hadoop.util.NativeCodeLoader.<clinit>(NativeCodeLoader.java:51): 
Unable to load native-hadoop library for your platform... using builtin-java 
classes where applicable
    [junit] 2007-11-09 08:15:11,268 INFO  [main] 
org.apache.hadoop.hbase.HLog.rollWriter(HLog.java:298): new log writer created 
at /hbase/hregion_-70236052/log/hlog.dat.000
    [junit] 2007-11-09 08:15:11,341 DEBUG [main] 
org.apache.hadoop.hbase.HStore.<init>(HStore.java:182): starting -70236052/info 
(no reconstruction log)
    [junit] 2007-11-09 08:15:11,346 DEBUG [main] 
org.apache.hadoop.hbase.HStore.<init>(HStore.java:218): maximum sequence id for 
hstore -70236052/info is -1
    [junit] 2007-11-09 08:15:11,348 DEBUG [main] 
org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:289): Next sequence id for 
region -ROOT-,,0 is 0
    [junit] 2007-11-09 08:15:11,351 INFO  [main] 
org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:315): region -ROOT-,,0 
available
    [junit] 2007-11-09 08:15:11,368 INFO  [main] 
org.apache.hadoop.hbase.HLog.rollWriter(HLog.java:298): new log writer created 
at /hbase/hregion_1028785192/log/hlog.dat.000
    [junit] 2007-11-09 08:15:11,379 DEBUG [main] 
org.apache.hadoop.hbase.HStore.<init>(HStore.java:182): starting 
1028785192/info (no reconstruction log)
    [junit] 2007-11-09 08:15:11,382 DEBUG [main] 
org.apache.hadoop.hbase.HStore.<init>(HStore.java:218): maximum sequence id for 
hstore 1028785192/info is -1
    [junit] 2007-11-09 08:15:11,384 DEBUG [main] 
org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:289): Next sequence id for 
region .META.,,1 is 0
    [junit] 2007-11-09 08:15:11,391 INFO  [main] 
org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:315): region .META.,,1 
available
    [junit] 2007-11-09 08:15:11,426 DEBUG [main] 
org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:847): Started 
memcache flush for region -ROOT-,,0. Size 86.0
    [junit] 2007-11-09 08:15:11,428 DEBUG [main] 
org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:876): 
Snapshotted memcache for region -ROOT-,,0 with sequence id 1 and entries 1
    [junit] 2007-11-09 08:15:11,519 WARN  [IPC Server handler 2 on 58346] 
org.apache.hadoop.dfs.ReplicationTargetChooser.chooseTarget(ReplicationTargetChooser.java:177):
 Not able to place enough replicas, still in need of 1
    [junit] 2007-11-09 08:15:11,988 WARN  [IPC Server handler 5 on 58346] 
org.apache.hadoop.dfs.ReplicationTargetChooser.chooseTarget(ReplicationTargetChooser.java:177):
 Not able to place enough replicas, still in need of 1
    [junit] 2007-11-09 08:15:12,036 WARN  [IPC Server handler 0 on 58346] 
org.apache.hadoop.dfs.ReplicationTargetChooser.chooseTarget(ReplicationTargetChooser.java:177):
 Not able to place enough replicas, still in need of 1
    [junit] 2007-11-09 08:15:12,098 DEBUG [main] 
org.apache.hadoop.hbase.HStore.flushCacheHelper(HStore.java:504): Added 
-70236052/info/4372126676279784460 with sequence id 1 and size 210.0
....
{code}
The failure looks unrelated, though the du's at the end of the test might be 
what's running the hung unix processes.

> [hbase] TestHStoreFile/TestBloomFilter hang occasionally on hudson AFTER test 
> has finished
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2040
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2040
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Priority: Minor
>         Attachments: endoftesttd.patch
>
>
> Weird.  Last night TestBloomFilter was hung after junit had printed that the 
> test had completed without error.  Just now, I noticed a hung TestHStore -- 
> again after junit had printed out that the test had succeeded (Nigel Daley has 
> reported he's seen at least two hangs in TestHStoreFile, perhaps in the same 
> location).
> Last night and just now I was unable to get a thread dump.
> Here is the log from around this evening's hang:
> {code}
> ...
>     [junit] 2007-10-12 04:19:28,477 INFO  [main] 
> org.apache.hadoop.hbase.TestHStoreFile.testOutOfRangeMidkeyHalfMapFile(TestHStoreFile.java:366):
>  Last bottom when key > top: zz/zz/1192162768317
>     [junit] 2007-10-12 04:19:28,493 WARN  [IPC Server handler 0 on 36620] 
> org.apache.hadoop.dfs.FSDirectory.unprotectedDelete(FSDirectory.java:400): 
> DIR* FSDirectory.unprotectedDelete: failed to remove 
> /testOutOfRangeMidkeyHalfMapFile because it does not exist
>     [junit] Shutting down the Mini HDFS Cluster
>     [junit] Shutting down DataNode 1
>     [junit] Shutting down DataNode 0
>     [junit] 2007-10-12 04:19:29,316 WARN  [EMAIL PROTECTED] 
> org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186):
>  PendingReplicationMonitor thread received exception. 
> java.lang.InterruptedException: sleep interrupted
>     [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 16.274 sec
>     [junit] Running org.apache.hadoop.hbase.TestHTable
>     [junit] Starting DataNode 0 with dfs.data.dir: 
> /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
>     [junit] Starting DataNode 1 with dfs.data.dir: 
> /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
>     [junit] 2007-10-12 05:21:48,332 INFO  [main] 
> org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:862): Root region dir: 
> /hbase/hregion_-ROOT-,,0
> ...
> {code}
> Notice the hour of elapsed (hung) time in the log above.
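
One way to spot this kind of gap without reading the log by eye is to diff consecutive timestamps. The following is only a rough sketch, not part of any Hadoop or HBase tooling: the function name find_gaps and the 10-minute threshold are made up for illustration, and it assumes the timestamp format seen in the junit output quoted above.

```python
import re
from datetime import datetime, timedelta

# Timestamp format as it appears in the junit output, e.g. "2007-10-12 04:19:29,316".
TS = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Yield (previous, current, delta) for consecutive timestamps
    that are further apart than threshold (hypothetical helper)."""
    prev = None
    for line in lines:
        m = TS.search(line)
        if not m:
            continue  # lines without a timestamp (e.g. "Shutting down ...") are skipped
        cur = datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S,%f")
        if prev is not None and cur - prev > threshold:
            yield prev, cur, cur - prev
        prev = cur


# Timestamped lines taken from the log excerpt above (message text trimmed).
sample = [
    "    [junit] 2007-10-12 04:19:28,477 INFO  [main]",
    "    [junit] 2007-10-12 04:19:28,493 WARN  [IPC Server handler 0 on 36620]",
    "    [junit] Shutting down the Mini HDFS Cluster",
    "    [junit] 2007-10-12 04:19:29,316 WARN",
    "    [junit] 2007-10-12 05:21:48,332 INFO  [main]",
]

for prev, cur, delta in find_gaps(sample):
    print(f"gap of {delta} between {prev} and {cur}")
```

On the excerpt above this flags a single gap, between the 04:19:29,316 and 05:21:48,332 entries, of a little over an hour.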

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
