[ https://issues.apache.org/jira/browse/HADOOP-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541361 ]
stack commented on HADOOP-2040:
-------------------------------

Looking more at this hang from last night, fs was sick from near the get-go:
{code}
[junit] Starting DataNode 0 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
[junit] Starting DataNode 1 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
[junit] 2007-11-09 08:15:10,894 INFO [main] org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:895): Root region dir: /hbase/hregion_-70236052
[junit] 2007-11-09 08:15:10,982 INFO [main] org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:904): bootstrap: creating ROOT and first META regions
[junit] 2007-11-09 08:15:11,255 WARN [main] org.apache.hadoop.util.NativeCodeLoader.<clinit>(NativeCodeLoader.java:51): Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable
[junit] 2007-11-09 08:15:11,268 INFO [main] org.apache.hadoop.hbase.HLog.rollWriter(HLog.java:298): new log writer created at /hbase/hregion_-70236052/log/hlog.dat.000
[junit] 2007-11-09 08:15:11,341 DEBUG [main] org.apache.hadoop.hbase.HStore.<init>(HStore.java:182): starting -70236052/info (no reconstruction log)
[junit] 2007-11-09 08:15:11,346 DEBUG [main] org.apache.hadoop.hbase.HStore.<init>(HStore.java:218): maximum sequence id for hstore -70236052/info is -1
[junit] 2007-11-09 08:15:11,348 DEBUG [main] org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:289): Next sequence id for region -ROOT-,,0 is 0
[junit] 2007-11-09 08:15:11,351 INFO [main] org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:315): region -ROOT-,,0 available
[junit] 2007-11-09 08:15:11,368 INFO [main] org.apache.hadoop.hbase.HLog.rollWriter(HLog.java:298): new log writer created at /hbase/hregion_1028785192/log/hlog.dat.000
[junit] 2007-11-09 08:15:11,379 DEBUG [main] org.apache.hadoop.hbase.HStore.<init>(HStore.java:182): starting 1028785192/info (no reconstruction log)
[junit] 2007-11-09 08:15:11,382 DEBUG [main] org.apache.hadoop.hbase.HStore.<init>(HStore.java:218): maximum sequence id for hstore 1028785192/info is -1
[junit] 2007-11-09 08:15:11,384 DEBUG [main] org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:289): Next sequence id for region .META.,,1 is 0
[junit] 2007-11-09 08:15:11,391 INFO [main] org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:315): region .META.,,1 available
[junit] 2007-11-09 08:15:11,426 DEBUG [main] org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:847): Started memcache flush for region -ROOT-,,0.
Size 86.0
[junit] 2007-11-09 08:15:11,428 DEBUG [main] org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:876): Snapshotted memcache for region -ROOT-,,0 with sequence id 1 and entries 1
[junit] 2007-11-09 08:15:11,519 WARN [IPC Server handler 2 on 58346] org.apache.hadoop.dfs.ReplicationTargetChooser.chooseTarget(ReplicationTargetChooser.java:177): Not able to place enough replicas, still in need of 1
[junit] 2007-11-09 08:15:11,988 WARN [IPC Server handler 5 on 58346] org.apache.hadoop.dfs.ReplicationTargetChooser.chooseTarget(ReplicationTargetChooser.java:177): Not able to place enough replicas, still in need of 1
[junit] 2007-11-09 08:15:12,036 WARN [IPC Server handler 0 on 58346] org.apache.hadoop.dfs.ReplicationTargetChooser.chooseTarget(ReplicationTargetChooser.java:177): Not able to place enough replicas, still in need of 1
[junit] 2007-11-09 08:15:12,098 DEBUG [main] org.apache.hadoop.hbase.HStore.flushCacheHelper(HStore.java:504): Added -70236052/info/4372126676279784460 with sequence id 1 and size 210.0
....
{code}

The failure looks unrelated, though the du's at the end of the test might be what's running the hung unix processes.

> [hbase] TestHStoreFile/TestBloomFilter hang occasionally on hudson AFTER test has finished
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2040
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2040
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Priority: Minor
>         Attachments: endoftesttd.patch
>
>
> Weird. Last night TestBloomFilter was hung after junit had printed that the test had
> completed without error. Just now, I noticed a hung TestHStore -- again
> after junit had printed out that the test had succeeded (Nigel Daley has reported he's
> seen at least two hangs in TestHStoreFile, perhaps in the same location).
> Last night and just now I was unable to get a thread dump.
> Here is the log from around this evening's hang:
> {code}
> ...
> [junit] 2007-10-12 04:19:28,477 INFO [main] org.apache.hadoop.hbase.TestHStoreFile.testOutOfRangeMidkeyHalfMapFile(TestHStoreFile.java:366): Last bottom when key > top: zz/zz/1192162768317
> [junit] 2007-10-12 04:19:28,493 WARN [IPC Server handler 0 on 36620] org.apache.hadoop.dfs.FSDirectory.unprotectedDelete(FSDirectory.java:400): DIR* FSDirectory.unprotectedDelete: failed to remove /testOutOfRangeMidkeyHalfMapFile because it does not exist
> [junit] Shutting down the Mini HDFS Cluster
> [junit] Shutting down DataNode 1
> [junit] Shutting down DataNode 0
> [junit] 2007-10-12 04:19:29,316 WARN [EMAIL PROTECTED] org.apache.hadoop.dfs.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186): PendingReplicationMonitor thread received exception. java.lang.InterruptedException: sleep interrupted
> [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 16.274 sec
> [junit] Running org.apache.hadoop.hbase.TestHTable
> [junit] Starting DataNode 0 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data1,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data2
> [junit] Starting DataNode 1 with dfs.data.dir: /export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data3,/export/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/contrib/hbase/test/data/dfs/data/data4
> [junit] 2007-10-12 05:21:48,332 INFO [main] org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:862): Root region dir: /hbase/hregion_-ROOT-,,0
> ...
> {code}
> Notice the hour of elapsed (hung) time in the above, between the end of TestHStoreFile at 04:19:29 and the first TestHTable log line at 05:21:48.
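
Spotting that hour-long hole by eye is easy to miss in a long hudson console log. As a rough sketch only (not part of the build or of the attached patch), something like the following could scan junit output for large jumps between consecutive timestamps. The names find_stalls and threshold are made up for illustration, and the timestamp pattern is assumed from the log lines quoted above.

```python
import re
from datetime import datetime, timedelta

# Matches the "YYYY-MM-DD HH:MM:SS,mmm" timestamps seen in the quoted junit output.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def find_stalls(lines, threshold=timedelta(minutes=5)):
    """Return (gap, earlier_line, later_line) for every pair of consecutive
    timestamped lines that are more than `threshold` apart.
    Lines without a timestamp are skipped."""
    stalls = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None:
            continue
        current = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f")
        if prev_time is not None and current - prev_time > threshold:
            stalls.append((current - prev_time, prev_line, line))
        prev_time, prev_line = current, line
    return stalls


# A few lines taken from the log in this issue (messages shortened).
SAMPLE = [
    "[junit] 2007-10-12 04:19:28,477 INFO [main] Last bottom when key > top",
    "[junit] 2007-10-12 04:19:28,493 WARN [IPC Server handler 0 on 36620] failed to remove",
    "[junit] Shutting down the Mini HDFS Cluster",
    "[junit] 2007-10-12 04:19:29,316 WARN PendingReplicationMonitor thread received exception.",
    "[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 16.274 sec",
    "[junit] 2007-10-12 05:21:48,332 INFO [main] Root region dir: /hbase/hregion_-ROOT-,,0",
]

if __name__ == "__main__":
    for gap, before, after in find_stalls(SAMPLE):
        print("stall of", gap)
        print("  last line before:", before)
        print("  first line after:", after)
```

The five-minute default is arbitrary; it only needs to be larger than the usual pause between tests so that normal cluster startup is not flagged.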