[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14961987#comment-14961987 ]
stack commented on HBASE-14420:
-------------------------------
Says:
{code}
Flaked tests:
org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence.testOnlineSnapshotDeleteIndependent(org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence)
Run 1:
TestSnapshotCloneIndependence.testOnlineSnapshotDeleteIndependent:191->runTestSnapshotDeleteIndependent:459
expected:<17576> but was:<14046>
Run 2:
TestSnapshotCloneIndependence.testOnlineSnapshotDeleteIndependent:191->runTestSnapshotDeleteIndependent:459
expected:<17576> but was:<14046>
Run 3: PASS
org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts(org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer)
Run 1:
TestStochasticLoadBalancer.testRegionReplicationOnMidClusterSameHosts:454->BalancerTestBase.testWithCluster:444->BalancerTestBase.assertClusterAsBalanced:203
null
Run 2: PASS
org.apache.hadoop.hbase.regionserver.TestWALLockup.testLockupWhenSyncInMiddleOfZigZagSetup(org.apache.hadoop.hbase.regionserver.TestWALLockup)
Run 1: TestWALLockup.testLockupWhenSyncInMiddleOfZigZagSetup:245 »
TestTimedOut test ...
Run 2: PASS
{code}
TestSnapshotCloneIndependence#testOnlineSnapshotDeleteIndependent was disabled
last night.
Load balancer is showing up from time to time still.
I see this: [ERROR] Failed to execute goal
org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test
(secondPartTestsExecution) on project hbase-server: There was a timeout or
other error in the fork -> [Help 1]
org.apache.hadoop.hbase.TestChoreService does not show up in the list above. The
test that failed is all timer based... let me just disable it.
Let me put a timeout on the encoding failure.
> Zombie Stomping Session
> -----------------------
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
> Issue Type: Umbrella
> Components: test
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Attachments: hangers.txt, none_fix (1).txt, none_fix.txt,
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt,
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt,
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt,
> none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies.
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native
> threads). Having to do multiple test runs in the hope that we can get a
> non-zombie-making build or making (arbitrary) rulings that the zombies are
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier
> this week. Will hang sub-issues of this one. Am running builds back-to-back
> on little cluster to turn out the monsters.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)