[ https://issues.apache.org/jira/browse/HBASE-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-8419.
--------------------------

    Resolution: Cannot Reproduce

Don't see these hangs anymore. We've rigged the ec2 and apache builds to log zombies if they see them. I think this was fixed by a combination of issues, including the upgrade to hadoop-2.0.5-alpha. Will open a new issue if we find a hang again.

> Hadoop2 MR tests fail with delete failing/hanging threads present
> -----------------------------------------------------------------
>
>                 Key: HBASE-8419
>                 URL: https://issues.apache.org/jira/browse/HBASE-8419
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Jonathan Hsieh
>             Fix For: 0.95.2
>
>
> In flaky failures on hadoop2 runs of tests such as:
> * TestImportTsv/testBulkOutputWithoutAnExistingTable
> * TestImportTsv/testMROnTable
> * TestImportExport/testWithFilter
> * (and many others)
> We have logs with hanging threads and failed file deletes that look like this.
> {code}
> 2013-04-24 06:05:01,807 WARN [ContainersLauncher #0] nodemanager.DefaultContainerExecutor(193): Exit code from task is : 137
> 2013-04-24 06:05:06,520 INFO [pool-1-thread-1] hbase.ResourceChecker(171): after: mapreduce.TestImportExport#testExportScannerBatching Thread=539 (was 534)
> Potentially hanging thread: hbase-table-pool-25-thread-1
>         sun.misc.Unsafe.park(Native Method)
>         java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
>         java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
> ...
> <threads seemingly related to dfs connection>
> {code}
> {code}
> 2013-04-24 06:03:28,351 WARN [DeletionService #0] nodemanager.DefaultContainerExecutor(276): delete returned false for path: [/var/lib/jenkins/workspace/apache-hbase-trunk-hadoop2/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-0_0/usercache/jenkins/appcache/application_1366808588748_0001/container_1366808588748_0001_01_000001]
> 2013-04-24 06:03:28,353 WARN [DeletionService #1] nodemanager.DefaultContainerExecutor(276): delete returned false for path: [/var/lib/jenkins/workspace/apache-hbase-trunk-hadoop2/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-0_1/usercache/jenkins/appcache/application_1366808588748_0001/container_1366808588748_0001_01_000001]
> 2013-04-24 06:03:28,353 WARN [DeletionService #2] nodemanager.DefaultContainerExecutor(276): delete returned false for path: [/var/lib/jenkins/workspace/apache-hbase-trunk-hadoop2/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-0_2/usercache/jenkins/appcache/application_1366808588748_0001/container_1366808588748_0001_01_000001]
> 2013-04-24 06:03:28,354 WARN [DeletionService #0] nodemanager.DefaultContainerExecutor(276): delete returned false for path: [/var/lib/jenkins/workspace/apache-hbase-trunk-hadoop2/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-0_3/usercache/jenkins/appcache/application_1366808588748_0001/container_1366808588748_0001_01_000001]
> {code}
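
The ResourceChecker output quoted above reports a thread count before and after a test ("Thread=539 (was 534)") and names threads that may be hanging. A minimal sketch of that kind of before/after check in plain Java is below. It is not HBase's ResourceChecker implementation: the class name ThreadLeakCheck and its methods are hypothetical, and it relies only on the JDK's Thread.getAllStackTraces().

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of HBase.
public class ThreadLeakCheck {

    /** Returns the IDs of all threads that are alive right now. */
    public static Set<Long> snapshot() {
        Set<Long> ids = new HashSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            ids.add(t.getId());
        }
        return ids;
    }

    /**
     * Returns one entry per live thread that was not in the earlier
     * snapshot, formatted as the thread name followed by its stack.
     */
    public static List<String> report(Set<Long> before) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (before.contains(t.getId()) || !t.isAlive()) {
                continue; // thread existed before the test, or already exited
            }
            StringBuilder sb =
                new StringBuilder("Potentially hanging thread: " + t.getName());
            for (StackTraceElement frame : e.getValue()) {
                sb.append("\n    ").append(frame);
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        Set<Long> before = snapshot();

        // Simulate a test that leaves a worker thread behind.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // exit quietly when interrupted
            }
        }, "leaked-worker");
        leaked.setDaemon(true); // daemon, so the JVM can still exit
        leaked.start();

        List<String> suspects = report(before);
        System.out.println("New threads since snapshot: " + suspects.size());
        for (String s : suspects) {
            System.out.println(s);
        }
    }
}
```

Taking the snapshot before each test and calling report() after it gives a per-test list of threads that outlived the test, which is roughly the information the quoted log lines carry. The JVM may start threads of its own between the two calls, so treat the result as a list of suspects and not as proof of a leak.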