[ 
https://issues.apache.org/jira/browse/DRILL-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-8030:
-----------------------------------
    Description: 
DRILL-7908 fixes distributed deadlocks in _TestDrillbitResilience_ and adds 
better timing for simulating the different Drill states. However, the tests 
sometimes still report a memory leak.
 The leaks are not real: it looks like Drill checks the allocated memory too 
early, before all fragments are closed, so adding a timeout before the final 
_countAllocatedMemory_ fixes the issue. 
 The other cause of test failures is that the queries were not in the expected 
state before cancellation (for instance, in the STARTING state instead of 
RUNNING), so adding a timeout before starting the cancellation thread gives 
the query time to reach the state that the test case expects before 
cancellation.
 I no longer see any test failures with NUM_RUNS = 1000 (@RepeatedTest) for 
the problematic test cases. 
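
The waiting described above could also be expressed as a bounded poll rather than a fixed delay. The following is only a minimal sketch of that alternative, not the change made for this issue: the class _PollingWait_ and its _await_ method are hypothetical names and are not part of Drill's test framework.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical test helper (an assumption for illustration, not Drill code). */
final class PollingWait {

  private PollingWait() {
  }

  /**
   * Re-checks the condition until it holds or the timeout elapses.
   *
   * @return true if the condition held before the deadline, false on timeout
   *         or if the waiting thread was interrupted
   */
  static boolean await(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
    long deadline = System.nanoTime() + timeout.toNanos();
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false;
      }
      // Sleep for the poll interval, but never past the deadline.
      long sleepMillis = Math.min(pollInterval.toMillis(), Math.max(1, remainingNanos / 1_000_000));
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }
}
```

In a test, the condition passed to _await_ would be whatever check the test already performs, for example that the query has reached RUNNING before the cancellation thread starts, or that the allocated memory has been released before the final check. A poll returns as soon as the condition holds, so it adds less delay than a fixed sleep when the state is reached quickly.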

Another test case that failed:
{code:java}
Error:  Failures: 
Error:    
TestDrillbitResilience.foreman_runTryEnd:289->testForeman:973->assertFailsWithException:960->assertFailsWithException:954
 Query state should be FAILED (and not COMPLETED). ==> expected: <COMPLETED> 
but was: <FAILED>
{code}

  was:
DRILL-7908 fixes distributed deadlocks in _TestDrillbitResilience_ and adds 
better timing for simulating the different Drill states. However, the tests 
sometimes still report a memory leak.
The leaks are not real: it looks like Drill checks the allocated memory too 
early, before all fragments are closed, so adding a timeout before the final 
_countAllocatedMemory_ fixes the issue. 
The other cause of test failures is that the queries were not in the expected 
state before cancellation (for instance, in the STARTING state instead of 
RUNNING), so adding a timeout before starting the cancellation thread gives 
the query time to reach the state that the test case expects before 
cancellation.
I no longer see any test failures with NUM_RUNS = 1000 (@RepeatedTest) for the 
problematic test cases.

 


> Memory leak in TestDrillbitResilience
> -------------------------------------
>
>                 Key: DRILL-8030
>                 URL: https://issues.apache.org/jira/browse/DRILL-8030
>             Project: Apache Drill
>          Issue Type: Sub-task
>          Components: Tools, Build & Test
>    Affects Versions: 1.19.0
>            Reporter: Vitalii Diravka
>            Assignee: Vitalii Diravka
>            Priority: Minor
>             Fix For: Future
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
