GitHub user sachouche opened a pull request: https://github.com/apache/drill/pull/1087
Attempt to fix memory leak in Parquet

** Problem Description **

This is an extremely rare leak, which I was able to reproduce by adding a sleep in the AsyncPageReader right after it reads the page and before it enqueues the page in the result queue. This is how the issue could manifest in a real-life scenario:
- The AsyncPageReader reads a page into a buffer but has not yet enqueued the result (the thread got preempted)
- The Parquet Scan thread is blocked waiting on the task (the Future object has already been dequeued)
- A cancel is received and the Scan thread is interrupted
- Future.get() exits because of the interrupt (the Future object is lost)
- The Scan thread executes the release logic
- The Scan thread is not able to interrupt the AsyncPageReader thread, since the Future object is lost
- The AsyncPageReader thread resumes and enqueues the DrillBuf in the result queue
- This results in a leak, since this buffer is never released

** Fix Description **

- The fix is straightforward: we peek the Future object (instead of dequeuing it) for the duration of the blocking get() call
- This way, an exception (such as an interrupt) leaves the Future object in the task queue
- The cleanup logic can then guarantee that the DrillBuf object is released by either the AsyncPageReader thread or the Parquet Scan thread

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachouche/drill DRILL-6079

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1087.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1087

----
commit 52030d1d9cc3b8992a10ade8c7126d66e785043a
Author: Salim Achouche <sachouche2@...>
Date:   2017-12-22T19:50:56Z

    Attempt to fix memory leak in Parquet

----

---
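For readers who want to see the peek-before-get idea from the fix description in isolation, here is a minimal, single-threaded sketch. It is not the code in this pull request: `Buf`, `nextPage`, `cleanup`, and `unreleasedAfterInterrupt` are hypothetical names, `Buf` is only a counting stand-in for a DrillBuf, and the race is simulated by setting the interrupt flag before the blocking `get()` and then running the read task by hand.

```java
import java.util.Queue;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class PeekBeforeGetSketch {
    /** Hypothetical stand-in for a DrillBuf; counts buffers not yet released. */
    static final class Buf {
        private final AtomicInteger live;
        Buf(AtomicInteger live) { this.live = live; live.incrementAndGet(); }
        void release() { live.decrementAndGet(); }
    }

    /** Waits for the next page read. peekFirst selects the fixed behaviour. */
    static Buf nextPage(Queue<Future<Buf>> tasks, boolean peekFirst)
            throws InterruptedException, ExecutionException {
        // Leaky variant: poll() removes the Future before the blocking get().
        // Fixed variant: peek() leaves it queued until get() has succeeded.
        Future<Buf> f = peekFirst ? tasks.peek() : tasks.poll();
        if (f == null) {
            return null;
        }
        Buf buf = f.get();      // may throw InterruptedException
        if (peekFirst) {
            tasks.poll();       // safe to dequeue only now
        }
        return buf;
    }

    /** Release logic: cancel pending reads, release buffers of finished ones. */
    static void cleanup(Queue<Future<Buf>> tasks) {
        Future<Buf> f;
        while ((f = tasks.poll()) != null) {
            // Assumption: a read that is cancelled while running releases
            // its own buffer; that path is not modelled in this sketch.
            if (!f.cancel(true)) {
                try {
                    f.get().release();   // read already finished: release here
                } catch (CancellationException | ExecutionException e) {
                    // no buffer was produced
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    /** Simulates the cancel scenario and returns the number of leaked buffers. */
    static int unreleasedAfterInterrupt(boolean peekFirst) {
        AtomicInteger live = new AtomicInteger();
        Queue<Future<Buf>> tasks = new LinkedBlockingQueue<>();
        FutureTask<Buf> read = new FutureTask<>(() -> new Buf(live));
        tasks.add(read);

        Thread.currentThread().interrupt();   // cancel arrives while waiting
        try {
            Buf buf = nextPage(tasks, peekFirst);
            if (buf != null) {
                buf.release();
            }
        } catch (InterruptedException | ExecutionException e) {
            // fall through to the release logic
        }
        read.run();       // the page reader resumes and produces its buffer
        cleanup(tasks);
        return live.get();
    }

    public static void main(String[] args) {
        System.out.println("poll-then-get leaked: " + unreleasedAfterInterrupt(false));
        System.out.println("peek-then-get leaked: " + unreleasedAfterInterrupt(true));
    }
}
```

In this simulation the poll-then-get variant ends with one unreleased buffer, because the Future was already removed from the queue when the interrupt hit. The peek-then-get variant ends with none, because the release logic still finds the Future in the queue.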