[jira] [Commented] (DRILL-5155) TestDrillbitResilience unit test is not resilient

Paul Rogers (JIRA) Sun, 02 Jul 2017 21:44:22 -0700

    [ https://issues.apache.org/jira/browse/DRILL-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071915#comment-16071915 ]


Paul Rogers commented on DRILL-5155:
------------------------------------

Additional issues: after enabling the managed version of the external sort, two 
tests within the test suite behave nondeterministically.

When run in the debugger (directly in Eclipse, or via remote debugging when 
launched from Maven), the tests pass. When run as part of the Drill test suite, 
or as a standalone test in Maven, they fail:

{code}
TestDrillbitResilience.interruptingBlockedMergingRecordBatch:784->interruptingBlockedFragmentsWaitingForData:814->assertCancelledWithoutException:545->assertStateCompleted:531
Query state is incorrect (expected: CANCELED, actual: FAILED) AND/OR Exception thrown: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AssertionError
{code}

And

{code}
memoryLeaksWhenCancelled(org.apache.drill.exec.server.TestDrillbitResilience)  Time elapsed: 50.019 sec  <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
{code}

The following test fails occasionally, though it usually passes:

{code}
Running org.apache.drill.exec.server.TestDrillbitResilience#failsAfterMSorterSorting
org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /172.30.1.212:58698 <--> /172.30.1.212:31013 (user client) closed unexpectedly. Drillbit down?
{code}

In another instance, a test failed because of a *negative* memory leak: the 
test reported leaking -500 bytes, because the starting figure was greater than 
the ending one.
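
To make the arithmetic behind that report concrete, here is a minimal sketch of a leak check. It assumes the check simply subtracts the memory figure taken before the test from the figure taken after it; the class {{LeakCheck}} and its methods are hypothetical names for illustration and are not Drill's actual test code.

```java
// Hypothetical helper for illustration only; not part of Drill.
final class LeakCheck {
    private LeakCheck() {}

    /** Leak as assumed here: bytes allocated at the end minus bytes allocated at the start. */
    static long leakedBytes(long allocatedAtStart, long allocatedAtEnd) {
        return allocatedAtEnd - allocatedAtStart;
    }

    /** Classifies the result so a negative value is reported separately from a real leak. */
    static String describe(long allocatedAtStart, long allocatedAtEnd) {
        long delta = leakedBytes(allocatedAtStart, allocatedAtEnd);
        if (delta > 0) {
            return "leaked " + delta + " bytes";
        }
        if (delta < 0) {
            // More memory was released during the test than it allocated,
            // so the starting figure cannot have been a clean baseline.
            return "negative leak of " + (-delta)
                + " bytes: the starting figure was not a clean baseline";
        }
        return "no leak";
    }
}
```

With a start of 1500 bytes and an end of 1000 bytes, {{LeakCheck.leakedBytes(1500, 1000)}} yields -500. Under this assumption, a negative result would point to memory that was still allocated when the test began, possibly left over from an earlier test, rather than to a leak in the test itself.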

The conclusion is that the Drillbit is very fragile; the tests pass, but 
probably only by luck. Change anything and the tests fail.

> TestDrillbitResilience unit test is not resilient
> -------------------------------------------------
>
>                 Key: DRILL-5155
>                 URL: https://issues.apache.org/jira/browse/DRILL-5155
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> The unit test {{TestDrillbitResilience}} plays quite rough with a set of 
> Drillbits, forcing a number of error conditions to see if the Drillbits can 
> recover. The test cases are good, but they interact with each other to make 
> the test as a whole quite fragile. The failure of any one test tends to cause 
> others to fail. When tests are run individually, they may pass. But, when run 
> as a suite, they fail due to cross-interactions.
> Restructure the test to make the tests more independent so that one test does 
> not change the state of the cluster expected by a different test.
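
One way to picture the restructuring requested above is to give each test its own fixture instead of sharing one set of Drillbits. The sketch below is only an illustration of that idea: {{FakeCluster}} and {{IsolatedRunner}} are hypothetical stand-ins, not Drill classes, and a real test would start and stop actual Drillbits where this code creates and closes a plain object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a set of Drillbits; not a Drill class.
final class FakeCluster implements AutoCloseable {
    // State that one test might modify, such as injected error conditions.
    final List<String> injectedFaults = new ArrayList<>();
    boolean closed;

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical runner: every test body receives a fresh cluster.
final class IsolatedRunner {
    private IsolatedRunner() {}

    static void runIsolated(Consumer<FakeCluster> testBody) {
        // try-with-resources closes the cluster even if the test body throws,
        // so a failing test cannot leave state behind for the next one.
        try (FakeCluster cluster = new FakeCluster()) {
            testBody.accept(cluster);
        }
    }
}
```

A test that calls {{cluster.injectedFaults.add(...)}} inside {{runIsolated}} leaves nothing behind: the next call to {{runIsolated}} starts from an empty list. The trade-off is the cost of building the fixture once per test rather than once per suite.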



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
