[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1073 ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user ilooner closed the pull request at: https://github.com/apache/drill/pull/1073 ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
GitHub user ilooner reopened a pull request: https://github.com/apache/drill/pull/1073 DRILL-5967: Fixed memory leak in OrderedPartitionSender The OrderedPartitionSender was leaking memory every time it was created because it created a wrapper RecordBatch which allocated memory but was never closed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ilooner/drill DRILL-5967 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1073.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1073 commit 13cd2fe7e8dccb2ac7546de508fbfbed8e19b48b Author: Timothy FarkasDate: 2017-12-14T18:48:27Z DRILL-5967: Fixed memory leak in OrderedPartitionSender commit e48e274649be7b9a14746c200d0ec5f5b3740664 Author: Timothy Farkas Date: 2017-12-18T20:01:30Z - Applied review comments commit 2233e6b32c1dd9f1b89fe4d6fa557e8bd894592e Author: Timothy Farkas Date: 2017-12-18T21:54:01Z - Applied more review comments commit f3aef285fc168cad07b32e5e99cd2cd7811d765c Author: Timothy Farkas Date: 2017-12-18T21:56:13Z - Applied review comments commit 0cdb2f3c4944f6253c72d318276603fc329546b5 Author: Timothy Farkas Date: 2017-12-19T19:27:19Z - Removed unnecessary creation of a list in OrderedPartitionSenderCreator commit f6d4aec39aad7326a735b883647772b28e2b86ab Author: Timothy Farkas Date: 2018-01-11T22:19:18Z - Added comment documenting the workaround. ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r158591666 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java --- @@ -341,6 +351,10 @@ public void close() throws Exception { updateAggregateStats(); partitioner.clear(); } + +if (closeIncoming) { + ((CloseableRecordBatch) incoming).close(); +} --- End diff -- OK, let's limit scope to stay on schedule. Please include a comment that explains the above, perhaps in the `OrderedPartitionSenderCreator` class. As you described, the proper solution would be to simply add both operators to the stack of operators that the fragment executor should close. But, since the code assumes that each builder builds a single operator, fixing this design flaw does, as you noted, require changes that are out of scope. ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r158577033 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java --- @@ -341,6 +351,10 @@ public void close() throws Exception { updateAggregateStats(); partitioner.clear(); } + +if (closeIncoming) { + ((CloseableRecordBatch) incoming).close(); +} --- End diff -- Hi Paul, I understand the FragmentExecutor closes all the operators. However, the FragmentExecutor is only aware of the operators in the tree that it has a reference too. In the case of the OrderedPartitionSenderCreator, an upstream record batch is wrapped in an OrderedPartitionRecordBatch. Since the OrderedPartitionRecordBatch is instantiated in the creator the FragmentExecutor does not have a reference to it. So when the FragmentExecutor closes the operators it closes the original operator, but not not the wrapping OrderedPartitionRecordBatch. This is an issue since the OrderedPartitionRecordBatch allocates VectorContainers which are consequentially never released. This is clearly the source of the memory leak and the fix was verified by QA. There are two possible Fixes. A) We change the Creators in some way to communicate to the FragmentExecutor that they have wrapped an operator, so the FragmentExecutor can close the wrapped operator instead of the original operator. My impression is that this will evolve into a fairly large change. B) Or we take a less invasive approach and simply tell the PartitionSenderRootExec whether to close the wrapped operator (This is what I did). I agree approach B is not that great and that approach A is cleaner. But taking approach A is a more intensive project, and I was not keen on having another simple change explode into another 100 file PR. It would be easier to make that change after we've made more improvements to Drill unit testing. Let me know what you think. I am not against taking approach A, but then we'll have to talk with Pritesh about fitting that project in. ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r158158542 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java --- @@ -341,6 +351,10 @@ public void close() throws Exception { updateAggregateStats(); partitioner.clear(); } + +if (closeIncoming) { + ((CloseableRecordBatch) incoming).close(); +} --- End diff -- Under what scenario would we *not* close the incoming? This fix introduces two paths: one that closes, one that does not. Is the non-closing path needed? More to the point, do we have unit tests that verify both paths? Can we simplify this and just define that this operator will close the incoming? Actually, let's take a step back. This fix is based on a false premise (but one a mistake that is easy to make.) This fix assumes that a parent batch (operator) is responsible for closing its child. But, that is not how the Drill protocol works. Instead, the fragment executor maintains the operator tree. Upon exit, it walks the tree and closes all its children in order from top down. So, the true source of this bug is that either a) the fragment exec is failing to do the close, b) that the incoming is not releasing memory on close, or c) that some close operation reallocates memory that a previous close tried to release. Let's suppose the issue is a: that the fragment exec does not manage the incoming. Let's fix that. Similar with the other cases. Can you maybe do a bit more research to understand the full scope of the issue? ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r157605837 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderCreator.java --- @@ -35,7 +35,5 @@ public RootExec getRoot(FragmentContext context, assert children != null && children.size() == 1; return new PartitionSenderRootExec(context, children.iterator().next(), config); - --- End diff -- Will be good to exclude format only change from the PR. ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r157605488 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/partitionsender/TestPartitionSender.java --- @@ -350,7 +350,7 @@ public void testAlgorithm() throws Exception { public MockPartitionSenderRootExec(FragmentContext context, RecordBatch incoming, HashPartitionSender operator) throws OutOfMemoryException { - super(context, incoming, operator); + super(context, incoming, operator, false); --- End diff -- Is the change still necessary? ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1073#discussion_r157545737 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/partitionsender/PartitionSenderRootExec.java --- @@ -98,10 +100,12 @@ public int metricId() { public PartitionSenderRootExec(FragmentContext context, RecordBatch incoming, - HashPartitionSender operator) throws OutOfMemoryException { + HashPartitionSender operator, + boolean closeIncoming) throws OutOfMemoryException { --- End diff -- Introduce back constructor without `closeIncoming` argument and call this constructor from it with `closeIncoming` set to `false`. ---
[GitHub] drill pull request #1073: DRILL-5967: Fixed memory leak in OrderedPartitionS...
GitHub user ilooner opened a pull request: https://github.com/apache/drill/pull/1073 DRILL-5967: Fixed memory leak in OrderedPartitionSender The OrderedPartitionSender was leaking memory every time it was created because it created a wrapper RecordBatch which allocated memory but was never closed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ilooner/drill DRILL-5967 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1073.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1073 commit 13cd2fe7e8dccb2ac7546de508fbfbed8e19b48b Author: Timothy FarkasDate: 2017-12-14T18:48:27Z DRILL-5967: Fixed memory leak in OrderedPartitionSender ---