[ 
https://issues.apache.org/jira/browse/HBASE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743665#comment-14743665
 ] 

stack commented on HBASE-14422:
-------------------------------

Sorry about that [~chenheng]

Here is the failure from before I started messing:

{code}
Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 124.342 sec <<< FAILURE! - in org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil
testPreemptiveFastFailException50Times(org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil)  Time elapsed: 120.015 sec  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 120000 milliseconds
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:425)
        at java.util.concurrent.FutureTask.get(FutureTask.java:187)
        at org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil.testPreemptiveFastFailException(TestFastFailWithoutTestUtil.java:451)
        at org.apache.hadoop.hbase.client.TestFastFailWithoutTestUtil.testPreemptiveFastFailException50Times(TestFastFailWithoutTestUtil.java:339)
{code}

It happened repeatedly for me across machines on my little test rig. I could 
not make it happen locally. 

Looking at the history on Apache, it nearly always passes: 
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15579/testReport/org.apache.hadoop.hbase.client/TestFastFailWithoutTestUtil/history/

Same for 
https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15579/testReport/org.apache.hadoop.hbase.client/TestFastFail/history/

On my local rig it fails every second or third run, which makes my test runs 
stop early.

> Fix TestFastFailWithoutTestUtil
> -------------------------------
>
>                 Key: HBASE-14422
>                 URL: https://issues.apache.org/jira/browse/HBASE-14422
>             Project: HBase
>          Issue Type: Task
>          Components: test
>            Reporter: stack
>            Priority: Minor
>              Labels: beginner
>
> TestFastFailWithoutTestUtil has a unit test, 
> testInterceptorIntercept50Times. Usually it passes, but on occasion the 
> latching between thread 1 and thread 2 goes awry and the test hangs. It 
> depends on the hardware, but it seems to happen about one in four runs here 
> on an internal rig.
> HBASE-14421 changed the wait-on-latch to time out, do a thread dump, and 
> just let the test keep going.
> This issue is about digging in to figure out why the test hangs on the 
> latches, and then fixing it so the test doesn't need the latch timeout. 
> Hopefully the thread dump helps.
> This one could be hard to fix since it is not easy to reproduce. Marking it 
> beginner anyway.
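
For anyone picking this up, here is a rough sketch of the "wait on the latch with a timeout, dump threads, keep going" pattern the description refers to. It is not the HBASE-14421 patch; the class and method names (LatchTimeoutSketch, awaitOrDump, threadDump) are made up for illustration, and it uses only java.util.concurrent and java.lang.management from the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not code from HBase.
public class LatchTimeoutSketch {

  /**
   * Collects a dump of all live threads. Note that ThreadInfo.toString()
   * prints only the first few stack frames of each thread.
   */
  static String threadDump() {
    StringBuilder sb = new StringBuilder();
    ThreadInfo[] infos =
        ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
    for (ThreadInfo info : infos) {
      sb.append(info.toString());
    }
    return sb.toString();
  }

  /**
   * Waits on the latch for at most the given time. Returns true if the latch
   * reached zero; otherwise prints a thread dump to stderr and returns false
   * so the caller can carry on instead of hanging.
   */
  static boolean awaitOrDump(CountDownLatch latch, long timeout, TimeUnit unit)
      throws InterruptedException {
    if (latch.await(timeout, unit)) {
      return true;
    }
    System.err.println("Latch wait timed out after " + timeout + " " + unit
        + "; thread dump follows:");
    System.err.println(threadDump());
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // A latch that is already released: the wait returns right away.
    CountDownLatch released = new CountDownLatch(1);
    released.countDown();
    System.out.println(awaitOrDump(released, 100, TimeUnit.MILLISECONDS));

    // A latch nobody counts down: the wait times out and dumps threads.
    CountDownLatch stuck = new CountDownLatch(1);
    System.out.println(awaitOrDump(stuck, 100, TimeUnit.MILLISECONDS));
  }
}
```

A timeout like this only keeps the run from stalling. The dump it prints is the starting point for working out which thread never counted the latch down.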



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
