[
https://issues.apache.org/jira/browse/DRILL-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vitalii Diravka updated DRILL-8030:
-----------------------------------
Description:
DRILL-7908 fixes distributed deadlocks in _TestDrillbitResilience_ and adds
better timing for simulating the different Drill states. But the tests
sometimes still report a memory leak.
The leaks are not real: it looks like Drill just checks the allocated memory
too early, when not all fragments are closed yet, so adding a timeout before
the final _countAllocatedMemory_ fixes the issue.
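A minimal sketch of the idea, not the actual patch: instead of a fixed sleep, the test could poll the allocation counter until it drains or a deadline passes. The class _AllocationWait_, the method _awaitReleased_ and the _LongSupplier_ standing in for the memory count are hypothetical names used only for illustration; they are not part of Drill.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical test helper (not part of Drill): waits for allocated memory to drain. */
final class AllocationWait {
  private AllocationWait() {}

  /**
   * Polls allocatedBytes until it reports 0 or timeoutMillis has elapsed.
   * The supplier is an assumed stand-in for whatever the test uses to count memory.
   *
   * @return true if the counter reached 0 in time, false on timeout or interrupt
   */
  static boolean awaitReleased(LongSupplier allocatedBytes, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (allocatedBytes.getAsLong() != 0) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // fragments did not release their memory in time
      }
      try {
        Thread.sleep(pollMillis); // give the remaining fragments time to close
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

With a helper like this, the final leak assertion would only run after _awaitReleased_ returns, so a slow fragment shutdown is not mistaken for a leak, while a real leak still fails once the timeout expires.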
The other reason for test failures: the queries were not in the expected state
before cancelling (for instance in STARTING state instead of RUNNING), so
adding a timeout before starting the cancellation thread gives the query time
to reach the state that the test case expects before cancellation.
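As a rough illustration of waiting for the expected state (again a sketch under assumptions, not the code from the fix), the test could block until the query reports the wanted state before the cancellation thread is started. _QueryStateWait_, its _State_ enum and the _Supplier_ used to read the current state are hypothetical; the enum lists only the states mentioned in this issue, not the full set Drill defines.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical test helper (not part of Drill): waits for a query to reach a state. */
final class QueryStateWait {
  private QueryStateWait() {}

  /** Illustrative stand-in containing only the states named in this issue. */
  enum State { STARTING, RUNNING, COMPLETED, FAILED }

  /**
   * Polls the current state until it equals expected or timeoutMillis has elapsed.
   *
   * @return true if the expected state was observed, false on timeout or interrupt
   */
  static boolean awaitState(Supplier<State> current, State expected,
                            long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (current.get() != expected) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // query never reached the expected state
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

A test would call _awaitState(..., State.RUNNING, ...)_ and start the cancellation thread only when it returns true, so a query that is still in STARTING is not cancelled prematurely.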
I no longer get any test failures with NUM_RUNS = 1000 (@RepeatedTest) for
the problematic test cases.
Another test case that failed is:
{code:java}
Error: Failures:
3540Error:
TestDrillbitResilience.foreman_runTryEnd:289->testForeman:973->assertFailsWithException:960->assertFailsWithException:954
Query state should be FAILED (and not COMPLETED). ==> expected: <COMPLETED>
but was: <FAILED>
{code}
was:
DRILL-7908 fixes distributed deadlocks in _TestDrillbitResilience_ and adds
better timing for simulating the different Drill states. But the tests
sometimes still report a memory leak.
The leaks are not real: it looks like Drill just checks the allocated memory
too early, when not all fragments are closed yet, so adding a timeout before
the final _countAllocatedMemory_ fixes the issue.
The other reason for test failures: the queries were not in the expected state
before cancelling (for instance in STARTING state instead of RUNNING), so
adding a timeout before starting the cancellation thread gives the query time
to reach the state that the test case expects before cancellation.
I no longer get any test failures with NUM_RUNS = 1000 (@RepeatedTest) for
the problematic test cases.
> Memory leak in TestDrillbitResilience
> -------------------------------------
>
> Key: DRILL-8030
> URL: https://issues.apache.org/jira/browse/DRILL-8030
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Tools, Build & Test
> Affects Versions: 1.19.0
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Priority: Minor
> Fix For: Future
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)