[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606957#comment-15606957 ]
ASF GitHub Bot commented on TINKERPOP-1525:
-------------------------------------------

GitHub user dalaro opened a pull request:

    https://github.com/apache/tinkerpop/pull/466

    TINKERPOP-1525 Avoid starting VP worker iterations that never end (Spark 2.0 version)

    This is exactly like #462, except that it tracks the switch between Spark 1.6 and 2.0 from functions that manipulate iterables to ones that manipulate iterators.

    Assuming #462 eventually gets into master, and assuming that TINKERPOP-1389 eventually merges with master, the second merge will conflict. It still seems marginally safer to make this change in parallel on TINKERPOP-1389 and master/tp32 than on just the latter, since the conflict will look more like "oh I better keep one of these two almost-identical edge-case checks" than "oh the Spark 1.x branch had some silly edge case check that I can just delete for 2.0".

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-for-TINKERPOP-1389

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/466.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #466

----
commit 2a8f741190beebd7b7e6a9ff7922afb9b6807fa5
Author: Dan LaRocque <dal...@hopcount.org>
Date:   2016-10-26T00:37:17Z

    Avoid starting VP worker iterations that never end

    SparkExecutor.executeVertexProgramIteration was written in such a way
    that an empty RDD partition would cause it to invoke
    VertexProgram.workerIterationStart without ever invoking
    VertexProgram.workerIterationEnd. This seems like a contract
    violation. I have at least one VP that relies on
    workerIterationStart|End to allocate and release resources. Failing
    to invoke End in this case causes a leak in that VP, as it would for
    any VP that uses that resource management pattern.
    (cherry picked from commit 36e1159a80f539b8bd4a884e5c1cf304ec52c4f9;
    this is the same change, except it tracks a switch between Spark 1.6
    and 2.0 away from functions that manipulate iterables to those that
    manipulate iterators)

----

> Plug VertexProgram iteration leak on empty Spark RDD partitions
> ---------------------------------------------------------------
>
>                 Key: TINKERPOP-1525
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1525
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: hadoop
>    Affects Versions: 3.2.3
>            Reporter: Dan LaRocque
>
> If SparkExecutor gets an RDD with empty partitions, it can invoke
> {{VertexProgram.workerIterationStart}} without ever invoking
> {{VertexProgram.workerIterationEnd}}.
> For vertex programs that allocate and release meaningful resources in the
> start/end methods, this can lead to resource leaks.
> I already tested a fix that I made against the 3.2 series. I will submit PRs
> momentarily.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
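
For readers following the issue, the failure mode described in the commit message can be reproduced in a small standalone sketch. This is not the SparkExecutor code from the pull request: `Worker`, `CountingWorker`, `runIteration`, and the `guardEmpty` flag are hypothetical names chosen for illustration, and the sketch simply assumes that the end callback fires while the last element of a partition is being handled, so an empty partition never reaches it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

final class EmptyPartitionGuard {

    /** Hypothetical stand-in for the start/end part of a vertex program's contract. */
    interface Worker {
        void workerIterationStart();
        void workerIterationEnd();
    }

    /** Counts calls so that an unbalanced start/end pair is easy to spot. */
    static final class CountingWorker implements Worker {
        int starts = 0;
        int ends = 0;
        @Override public void workerIterationStart() { starts++; }
        @Override public void workerIterationEnd() { ends++; }
    }

    /**
     * Lazily maps a partition. End is only triggered while handling the last
     * element, so without the guard an empty partition sees Start but never End.
     */
    static <T> Iterator<T> runIteration(final Iterator<T> partition,
                                        final Worker worker,
                                        final boolean guardEmpty) {
        // Assumed guard: skip the iteration entirely when there is nothing to process.
        if (guardEmpty && !partition.hasNext()) {
            return Collections.emptyIterator();
        }
        worker.workerIterationStart();
        return new Iterator<T>() {
            @Override public boolean hasNext() { return partition.hasNext(); }
            @Override public T next() {
                final T element = partition.next();
                if (!partition.hasNext()) {
                    worker.workerIterationEnd(); // last element: release resources
                }
                return element;
            }
        };
    }

    public static void main(String[] args) {
        CountingWorker unguarded = new CountingWorker();
        runIteration(Collections.<String>emptyIterator(), unguarded, false);
        System.out.println("unguarded, empty: starts=" + unguarded.starts + " ends=" + unguarded.ends);

        CountingWorker guarded = new CountingWorker();
        runIteration(Collections.<String>emptyIterator(), guarded, true);
        System.out.println("guarded, empty:   starts=" + guarded.starts + " ends=" + guarded.ends);

        CountingWorker nonEmpty = new CountingWorker();
        Iterator<String> out = runIteration(Arrays.asList("v1", "v2").iterator(), nonEmpty, true);
        while (out.hasNext()) {
            out.next();
        }
        System.out.println("guarded, 2 items: starts=" + nonEmpty.starts + " ends=" + nonEmpty.ends);
    }
}
```

In this sketch the unguarded empty partition ends with one start and no end, which is the leak pattern reported above, while the guarded variant leaves the start and end counts balanced for both empty and non-empty partitions.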