[ 
https://issues.apache.org/jira/browse/DRILL-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472793#comment-16472793
 ] 

ASF GitHub Bot commented on DRILL-6410:
---------------------------------------

sachouche opened a new pull request #1257: DRILL-6410: Fixed memory leak in 
flat Parquet reader
URL: https://github.com/apache/drill/pull/1257
 
 
   **Problem Description**
   - Occasionally, a memory leak is observed within the Parquet reader (flat) 
when query cancellation is invoked
   - A previous attempt to address this issue did not fully resolve it; the leak 
still occurs
   - Thus far, only QA has been able to observe this issue (and only 
occasionally)
   
   **Analysis**
   - There was a recent breakthrough which gives me hope for addressing this 
issue 
   - The leak was logged with two pieces of information: the leak size and the 
state of the child allocator
   - The state of the child allocator indicated no leak (all allocated bytes 
released)
   - After code examination, it occurred to me this was happening because the 
Asynchronous Page Reader task was releasing the Drill buffer while the scan 
thread was closing the allocator
   - The code attempts to cancel asynchronous tasks and then release allocated 
buffers, though there is one big issue: the Java FutureTask.cancel(true) 
doesn't block during the cancellation process; this method **merely interrupts 
the asynchronous task** and proceeds
   - This means that if the asynchronous thread was context-switched out or 
doing computation (not blocked waiting), the fragment cleanup logic can close 
the allocator before all resources have been released
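
The non-blocking behavior of cancel(true) described above can be reproduced outside Drill. The following is a minimal sketch, not code from this pull request: the class name `CancelDoesNotBlockDemo`, the method `taskOutlivesCancel`, and the `released` flag (standing in for a buffer release) are all hypothetical. The task ignores interruption to mimic a thread that is computing rather than blocked.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not part of Drill.
final class CancelDoesNotBlockDemo {

  /** Returns true if the task had not released its "buffer" when cancel(true) returned. */
  static boolean taskOutlivesCancel() {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch mayFinish = new CountDownLatch(1);
    AtomicBoolean released = new AtomicBoolean(false);
    try {
      Future<?> future = executor.submit(() -> {
        started.countDown();
        // Keep working despite interrupts, like a thread busy computing.
        while (true) {
          try {
            mayFinish.await();
            break;
          } catch (InterruptedException e) {
            // Interrupt noticed but the work is not finished yet.
          }
        }
        released.set(true); // stands in for releasing a buffer
      });

      started.await();
      future.cancel(true); // interrupts the task and returns right away

      // The future reports completion although the task body is still running.
      boolean outlived = future.isDone() && !released.get();

      mayFinish.countDown(); // now let the task finish
      return outlived;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      executor.shutdown();
      try {
        executor.awaitTermination(5, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    System.out.println("task outlived cancel(true): " + taskOutlivesCancel());
  }
}
```

In this sketch, any cleanup performed right after `future.cancel(true)` would run while the task still holds its resource, which is the window in which the allocator can be closed too early.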
   
   **Fix**
   - The Java ThreadPoolExecutor and FutureTask have a few extension points that 
can be used to enhance the task termination process
   - Created a new utility class which can create an ExecutorService with the 
ability to block during future cancellation
   - Blocking will happen only when the cancel method is allowed to interrupt 
the asynchronous task
   - Note that there shouldn't be any performance degradation as 
synchronization code was added only to cover the cancel path
   - Also added a new test suite to test the correctness of this new utility
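
One way such a blocking cancel could be built on the standard extension points is sketched below. This is an illustration under stated assumptions, not the utility class added by this pull request: the names `BlockingCancelExecutor` and `BlockingCancelTask` are hypothetical. It overrides `ThreadPoolExecutor.newTaskFor` to wrap submitted work in a `FutureTask` subclass whose `cancel(true)` waits until `run()` has exited.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; class names are assumptions, not Drill's actual utility.
final class BlockingCancelExecutor extends ThreadPoolExecutor {

  BlockingCancelExecutor(int threads) {
    super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
  }

  // submit(...) routes through newTaskFor, so every submitted task gets wrapped.
  @Override
  protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable) {
    return new BlockingCancelTask<T>(callable);
  }

  @Override
  protected <T> RunnableFuture<T> newTaskFor(Runnable runnable, T value) {
    return new BlockingCancelTask<T>(Executors.callable(runnable, value));
  }

  static final class BlockingCancelTask<T> extends FutureTask<T> {
    private final Object lock = new Object();
    private Thread runner; // non-null while run() executes; guarded by lock

    BlockingCancelTask(Callable<T> callable) {
      super(callable);
    }

    @Override
    public void run() {
      synchronized (lock) { runner = Thread.currentThread(); }
      try {
        super.run();
      } finally {
        synchronized (lock) { runner = null; lock.notifyAll(); }
      }
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
      boolean cancelled = super.cancel(mayInterruptIfRunning);
      if (mayInterruptIfRunning) {
        boolean interrupted = false;
        synchronized (lock) {
          // Wait for run() to exit; skip waiting if the task cancels itself.
          while (runner != null && runner != Thread.currentThread()) {
            try { lock.wait(); } catch (InterruptedException e) { interrupted = true; }
          }
        }
        if (interrupted) { Thread.currentThread().interrupt(); }
      }
      return cancelled;
    }
  }

  /** Returns true if the task's cleanup had completed when cancel(true) returned. */
  static boolean cleanupFinishedBeforeCancelReturned() {
    BlockingCancelExecutor executor = new BlockingCancelExecutor(1);
    CountDownLatch started = new CountDownLatch(1);
    AtomicBoolean released = new AtomicBoolean(false);
    try {
      Future<?> future = executor.submit(() -> {
        started.countDown();
        try {
          Thread.sleep(60_000);
        } catch (InterruptedException e) {
          // Interrupted by cancel(true); fall through to cleanup.
        }
        released.set(true); // stands in for releasing a buffer
      });
      started.await();
      future.cancel(true); // blocks until the task body has exited
      return released.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("cleanup done before cancel returned: "
        + cleanupFinishedBeforeCancelReturned());
  }
}
```

The extra synchronization is only contended when `cancel(true)` is called; the normal path pays for two uncontended lock acquisitions per task. A caller of this sketch should be aware that `cancel(true)` can now block for as long as the task takes to react to the interrupt.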
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Memory leak in Parquet Reader during cancellation
> -------------------------------------------------
>
>                 Key: DRILL-6410
>                 URL: https://issues.apache.org/jira/browse/DRILL-6410
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>
> Occasionally, a memory leak is observed within the flat Parquet reader when 
> query cancellation is invoked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
