[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952092#comment-14952092 ]
stack commented on HBASE-14420:
-------------------------------
Looking at the last 30 Hadoop QA runs, these are the tests that hung or failed (count on the left):
1 Hanging test : org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
2 Hanging test : org.apache.hadoop.hbase.client.TestShell
1 Hanging test : org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence
1 Hanging test : org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd
2 Hanging test : org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders
2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestCellCounter
2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHashTable
1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestImportExport
1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesUseSecurityEndPoint
1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1
1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestWALPlayer
1 Hanging test : org.apache.hadoop.hbase.mob.compactions.TestMobCompactor
1 Hanging test : org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestClusterId
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestHRegion
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestHRegionFileSystem
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestMultiColumnScanner
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush
1 Hanging test : org.apache.hadoop.hbase.regionserver.compactions.TestCompactionWithThroughputController
1 Hanging test : org.apache.hadoop.hbase.replication.TestMasterReplication
1 Hanging test : org.apache.hadoop.hbase.replication.TestPerTableCFReplication
1 Hanging test : org.apache.hadoop.hbase.replication.multiwal.TestReplicationSyncUpToolWithMultipleWAL
1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
1 Hanging test : org.apache.hadoop.hbase.thrift.TestThriftServer
1 Hanging test : org.apache.hadoop.hbase.util.TestHBaseFsck
1 Hanging test : org.apache.hadoop.hbase.wal.TestWALFiltering
1 Hanging test : org.apache.hadoop.hbase.wal.TestWALSplit
1 Hanging test : org.apache.hadoop.hbase.wal.TestWALSplitCompressed
2 Failing test : org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
4 Failing test : org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence
2 Failing test : org.apache.hadoop.hbase.master.TestZKLessAMOnCluster
1 Failing test : org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures
1 Failing test : org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
1 Failing test : org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
1 Failing test : org.apache.hadoop.hbase.security.token.TestGenerateDelegationToken
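
For anyone wanting to redo a tally like the one above, here is a minimal sketch in Python. It assumes each build's console output carries one line per offender of the form "Hanging test : <class>" or "Failing test : <class>"; that shape is inferred from the list above, not confirmed against the actual Hadoop QA output. The function names `tally` and `format_tally` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape (inferred from the tally above, not verified against
# real console output): "Hanging test : <class>" or "Failing test : <class>".
LINE_RE = re.compile(r"^\s*(Hanging|Failing) test\s*:\s*(\S+)\s*$")


def tally(lines):
    """Count (kind, test class) pairs over lines collected from many runs."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def format_tally(counts):
    """Render counts as '<n> <kind> test : <class>', hanging tests first,
    then sorted by class name within each kind."""
    rows = sorted(counts.items(),
                  key=lambda kv: (kv[0][0] != "Hanging", kv[0][1]))
    return [f"{n} {kind} test : {name}" for (kind, name), n in rows]
```

Feed it the concatenated console text of the runs you care about, split into lines; anything that does not match the assumed shape is ignored.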
> Zombie Stomping Session
> -----------------------
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
> Issue Type: Umbrella
> Components: test
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt,
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies.
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native
> threads). Having to do multiple test runs in the hope that we can get a
> non-zombie-making build or making (arbitrary) rulings that the zombies are
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier
> this week. Will hang sub-issues of this one. Am running builds back-to-back
> on a little cluster to turn out the monsters.