[
https://issues.apache.org/jira/browse/DRILL-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071915#comment-16071915
]
Paul Rogers commented on DRILL-5155:
------------------------------------
Additional issues: after enabling the managed version of the external sort, two
tests in the test suite behave nondeterministically.
When run in the debugger (directly in Eclipse, or attached remotely to a Maven
run), the tests pass. When run as part of the Drill test suite, or as
standalone tests in Maven, they fail.
{code}
TestDrillbitResilience.interruptingBlockedMergingRecordBatch:784->interruptingBlockedFragmentsWaitingForData:814->assertCancelledWithoutException:545->assertStateCompleted:531
Query state is incorrect (expected: CANCELED, actual: FAILED) AND/OR
Exception thrown: org.apache.drill.common.exceptions.UserRemoteException:
SYSTEM ERROR: AssertionError
{code}
And
{code}
memoryLeaksWhenCancelled(org.apache.drill.exec.server.TestDrillbitResilience)
Time elapsed: 50.019 sec <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
{code}
Sometimes the following fails, though most often it works:
{code}
Running org.apache.drill.exec.server.TestDrillbitResilience#failsAfterMSorterSorting
org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /172.30.1.212:58698 <--> /172.30.1.212:31013 (user client) closed unexpectedly. Drillbit down?
{code}
In another instance, a test failed because of a *negative* memory leak (the
test reported a leak of -500 bytes because the allocated memory at the start
was greater than at the end).
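The negative figure is what one would expect if the leak check simply subtracts the allocation reading taken before the test from the one taken after it. A minimal sketch of that arithmetic, assuming a check of this shape (the class and method names are hypothetical and are not taken from TestDrillbitResilience; the byte counts are made up to reproduce the -500):

```java
// Hypothetical illustration only; not Drill's actual leak check.
public class LeakCheck {
    // Bytes "leaked" by a test: allocation after the test minus allocation before it.
    public static long leakedBytes(long allocatedBefore, long allocatedAfter) {
        return allocatedAfter - allocatedBefore;
    }

    public static void main(String[] args) {
        // Assumed scenario: an earlier test left 500 bytes allocated, and they
        // are released while this test runs, so the starting reading is higher.
        long leaked = leakedBytes(1500L, 1000L);
        System.out.println("leaked " + leaked + " bytes");
    }
}
```

If this is what happened, the negative value would point to state carried over from a previous test rather than to a leak in the test that reported it.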
The conclusion is that the Drillbit is very fragile; the tests pass, but likely
due to luck. Change anything and the tests fail.
> TestDrillbitResilience unit test is not resilient
> -------------------------------------------------
>
> Key: DRILL-5155
> URL: https://issues.apache.org/jira/browse/DRILL-5155
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.9.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> The unit test {{TestDrillbitResilience}} plays quite rough with a set of
> Drillbits, forcing a number of error conditions to see if the Drillbits can
> recover. The test cases are good, but they interact with each other to make
> the test as a whole quite fragile. The failure of any one test tends to cause
> others to fail. When tests are run individually, they may pass. But, when run
> as a suite, they fail due to cross-interactions.
> Restructure the test to make the tests more independent so that one test does
> not change the state of the cluster expected by a different test.
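One way to picture the restructuring described above is to give every test its own freshly built cluster instead of a shared one. The following is only a sketch under that assumption: FakeCluster is a hypothetical stand-in for a set of Drillbits, and the fault labels are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; FakeCluster stands in for a set of Drillbits.
public class IsolatedTests {
    static class FakeCluster {
        private final List<String> injectedFaults = new ArrayList<>();
        void injectFault(String fault) { injectedFaults.add(fault); }
        int faultCount() { return injectedFaults.size(); }
    }

    // Runs each test body against its own fresh cluster, so state changed
    // by one test is never visible to the next. Returns the number of
    // faults each test's cluster ended up with.
    static List<Integer> runIsolated(List<Consumer<FakeCluster>> tests) {
        List<Integer> faultsSeen = new ArrayList<>();
        for (Consumer<FakeCluster> test : tests) {
            FakeCluster cluster = new FakeCluster(); // fresh state per test
            test.accept(cluster);
            faultsSeen.add(cluster.faultCount());
        }
        return faultsSeen;
    }

    public static void main(String[] args) {
        List<Consumer<FakeCluster>> tests = new ArrayList<>();
        tests.add(c -> c.injectFault("interrupt-merging-receiver")); // invented label
        tests.add(c -> c.injectFault("fail-after-sort"));            // invented label
        System.out.println(runIsolated(tests));
    }
}
```

Each test sees only the fault it injected itself. With a single shared cluster, the second test would start with the first test's fault already in place, which is the kind of cross-interaction the description refers to.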