[ https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743665#comment-14743665 ]
stack commented on HBASE-14422:
-------------------------------
Sorry about that, [~chenheng].
Here is the failure from before I started messing around:
{code}
Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 124.342 sec <<< FAILURE! - in org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil
testPreemptiveFastFailException50Times(org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil)  Time elapsed: 120.015 sec  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 120000 milliseconds
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
    at java.util.concurrent.FutureTask.get(FutureTask.java:187)
    at org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil.testPreemptiveFastFailException(TestFastFailWithoutTestUtil.java:451)
    at org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil.testPreemptiveFastFailException50Times(TestFastFailWithoutTestUtil.java:339)
{code}
It happened repeatedly for me across machines on my little test rig. I could
not make it happen locally.
Looking at the build history on Apache, it nearly always passes:
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15579/testReport/org.apache.hadoop.hbase.client/TestFastFailWithoutTestUtil/history/
Same for
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15579/testReport/org.apache.hadoop.hbase.client/TestFastFail/history/
On my local rig it fails every second or third run, which makes my test runs
stop early.
> Fix TestFastFailWithoutTestUtil
> -------------------------------
>
> Key: HBASE-14422
> URL: https://issues.apache.org/jira/browse/HBASE-14422
> Project: HBase
> Issue Type: Task
> Components: test
> Reporter: stack
> Priority: Minor
> Labels: beginner
>
> TestFastFailWithoutTestUtil has a unit test,
> testInterceptorIntercept50Times. Usually it passes, but on occasion the
> latching between thread 1 and thread 2 goes awry and the test hangs.
> It depends on the hardware, but it seems to happen about one in
> four runs here on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and
> just let the test keep going.
> This issue is about digging in to figure out why the test hangs on the
> latches and then fixing it so the test doesn't need the latch timeout.
> Hopefully the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it
> beginner anyway.
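
For readers unfamiliar with the wait-on-latch change mentioned in the description above, here is a minimal sketch of the general pattern: wait on a latch with a timeout, and on timeout print a thread dump and return instead of blocking forever. This is not the HBASE-14421 patch; the class name LatchTimeoutSketch and the helper awaitOrDump are hypothetical and exist only for illustration. It uses only standard JDK calls.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not code from the HBase test.
public class LatchTimeoutSketch {

    /**
     * Waits on the latch for at most the given time. If the latch is not
     * released in time, prints a thread dump to stderr and returns false,
     * so the caller can keep going instead of hanging.
     */
    static boolean awaitOrDump(CountDownLatch latch, long timeout, TimeUnit unit)
            throws InterruptedException {
        if (latch.await(timeout, unit)) {
            return true; // latch was counted down in time
        }
        // Timed out: dump all live threads to help diagnose the stall.
        // ThreadInfo.toString() prints a truncated stack per thread.
        ThreadInfo[] infos =
                ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            System.err.print(info);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch neverReleased = new CountDownLatch(1);
        boolean released = awaitOrDump(neverReleased, 100, TimeUnit.MILLISECONDS);
        System.out.println("released=" + released);
    }
}
```

A helper like this only keeps the run from stalling. It does not explain why the two threads miss each other on the latches, which is what this issue is asking someone to find.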
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)