[
https://issues.apache.org/jira/browse/HBASE-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265481#comment-14265481
]
Manukranth Kolloju commented on HBASE-12771:
--------------------------------------------
While testing FastFail, I was randomly picking and killing the META server
and didn't account for the extra latency that this adds. This was causing
the test to fail in branch 1. It seems that relearning the region map after
META goes down is faster in trunk than in branch 1. As a result, the test
never failed in trunk but was somehow failing in branch 1. Does anyone know
why killing the META server adds about an extra PAUSE_TIME worth of delay in
RegionServerCallable.prepare?
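
One way a test could sidestep the extra latency is to exclude the server
hosting META when choosing which server to kill. The following is only a
minimal, self-contained sketch of that selection logic, assuming the test
picks its victim at random from all region servers. The class name
VictimPicker, the method pickNonMetaServer, and the way metaIndex is obtained
are hypothetical and are not taken from TestFastFail or the HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class VictimPicker {
    /**
     * Picks a random server index in [0, numServers) that differs from metaIndex.
     * Hypothetical helper: the caller is assumed to know which index hosts META.
     */
    static int pickNonMetaServer(int numServers, int metaIndex, Random rng) {
        // Collect every server index except the one hosting META.
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < numServers; i++) {
            if (i != metaIndex) {
                candidates.add(i);
            }
        }
        if (candidates.isEmpty()) {
            throw new IllegalStateException(
                "no server other than the META host is available");
        }
        // Uniform random choice among the remaining servers.
        return candidates.get(rng.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        // Example: 3 servers, META assumed to live on index 0.
        int victim = pickNonMetaServer(3, 0, new Random());
        System.out.println("Killing server index " + victim);
    }
}
```

This keeps the kill random across the other servers, so the test still
exercises the fast-fail path without also paying for META relocation.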
> TestFailFast#testFastFail failing
> ---------------------------------
>
> Key: HBASE-12771
> URL: https://issues.apache.org/jira/browse/HBASE-12771
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 1.0.0
> Reporter: stack
> Assignee: Manukranth Kolloju
>
> Fails on our internal rig and from time to time on apache. Here is latest:
> {code}
> org.apache.hadoop.hbase.client.TestFastFail.testFastFail
> Failing for the past 1 build (Since Failed#654 )
> Took 7.1 sec.
> Error Message
> Only few thread should ideally be waiting for the dead regionserver to be
> coming back. numBlockedWorkers:155 threads that retried : 10
> Stacktrace
> java.lang.AssertionError: Only few thread should ideally be waiting for the
> dead regionserver to be coming back. numBlockedWorkers:155 threads that
> retried : 10
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at
> org.apache.hadoop.hbase.client.TestFastFail.testFastFail(TestFastFail.java:270)
> {code}
> Opening this issue so we can start tracking the failures. Looking in the
> log, nothing obvious. Will be back to this one.