[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15618743#comment-15618743 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user asfgit closed the pull request at: https://github.com/apache/tinkerpop/pull/462 > Plug VertexProgram iteration leak on empty Spark RDD partitions > --- > > Key: TINKERPOP-1525 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1525 > Project: TinkerPop > Issue Type: Bug > Components: hadoop >Affects Versions: 3.2.3 >Reporter: Dan LaRocque > > If SparkExecutor gets an RDD with empty partitions, it can invoke > {{VertexProgram.workerIterationStart}} without ever invoking > {{VertexProgram.workerIterationEnd}}. > For vertex programs that allocate and release meaningful resources in the > start/end methods, this can lead to resource leaks. > I already tested a fix that I made against the 3.2 series. I will submit PRs > momentarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616650#comment-15616650 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user dkuppitz commented on the issue: https://github.com/apache/tinkerpop/pull/462 VOTE: +1
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608588#comment-15608588 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/462 VOTE +1
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607059#comment-15607059 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user asfgit closed the pull request at: https://github.com/apache/tinkerpop/pull/466
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606960#comment-15606960 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user dalaro commented on the issue: https://github.com/apache/tinkerpop/pull/462 I made a version of this change for TINKERPOP-1389 as #466. That one's optional, but I thought merging this into TINKERPOP-1389 might help ensure this fix does not get lost in the eventual possibly-conflict-y and confusing merge of TINKERPOP-1389 to master.
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606957#comment-15606957 ] ASF GitHub Bot commented on TINKERPOP-1525: --- GitHub user dalaro opened a pull request: https://github.com/apache/tinkerpop/pull/466

TINKERPOP-1525 Avoid starting VP worker iterations that never end (Spark 2.0 version)

This is exactly like #462, except that it tracks a switch between Spark 1.6 and 2.0 away from functions that manipulate iterables to those that manipulate iterators.

Assuming #462 eventually gets into master, and assuming that TINKERPOP-1389 eventually merges with master, the second merge will conflict. It still seems marginally safer to make this change in parallel on TINKERPOP-1389 and master/tp32 than just the latter, since the conflict will look more like "oh i better keep one of these two almost-identical edge-case checks" than "oh the Spark 1.x branch had some silly edge case check that I can just delete for 2.0".

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-for-TINKERPOP-1389

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/466.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #466

commit 2a8f741190beebd7b7e6a9ff7922afb9b6807fa5
Author: Dan LaRocque
Date: 2016-10-26T00:37:17Z

    Avoid starting VP worker iterations that never end

    SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern.

    (cherry picked from commit 36e1159a80f539b8bd4a884e5c1cf304ec52c4f9; this is the same change, except it tracks a switch between Spark 1.6 and 2.0 away from functions that manipulate iterables to those that manipulate iterators)
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605725#comment-15605725 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user dalaro commented on the issue: https://github.com/apache/tinkerpop/pull/462 Marko's right, I wrote and tested this against TINKERPOP-1389 (before the latest rebase), then cherry-picked against current master without retesting. Sorry about that. He's also right about the Iterator/Iterable change. I'll change it sometime today, force-push the branch, and comment.
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605176#comment-15605176 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/462

Travis is showing a pretty clear compilation failure:

```text
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project spark-gremlin: Compilation failure
[ERROR] /home/travis/build/apache/tinkerpop/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkExecutor.java:[92,37] incompatible types: no instance(s) of type variable(s) T exist so that java.util.Iterator conforms to java.lang.Iterable>
```

Can that be ignored?
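The error reported by Travis reads as an `Iterator` being returned where the compiler expected an `Iterable`. The sketch below is only an illustration of that mismatch in plain Java, not the code from the pull request: the class and method names (`IterableVersusIterator`, `iterableStyle`, `iteratorStyle`, `count`) are hypothetical, and it assumes nothing about Spark beyond the Iterable-versus-Iterator distinction mentioned in this thread.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

class IterableVersusIterator {

    // Would not compile if uncommented, for the same reason as the Travis failure:
    // an Iterator is not an Iterable.
    // static Iterable<String> broken() { return Collections.emptyIterator(); }

    /** Iterable-returning shape (hypothetical stand-in for the older callback style). */
    static Iterable<String> iterableStyle(Iterator<String> partition) {
        if (!partition.hasNext()) {
            return Collections.emptyList();     // empty result as an Iterable
        }
        return () -> partition;                 // single-use Iterable view of the Iterator
    }

    /** Iterator-returning shape (hypothetical stand-in for the newer callback style). */
    static Iterator<String> iteratorStyle(Iterator<String> partition) {
        if (!partition.hasNext()) {
            return Collections.emptyIterator(); // empty result as an Iterator
        }
        return partition;
    }

    /** Counts the remaining elements of an iterator by consuming it. */
    static int count(Iterator<String> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count(iterableStyle(Arrays.asList("a", "b").iterator()).iterator()));
        System.out.println(count(iteratorStyle(Collections.<String>emptyIterator())));
    }
}
```

The same empty-partition check therefore has to return `Collections.emptyList()` in one style and `Collections.emptyIterator()` in the other, which is consistent with the thread's note that the two branches need slightly different versions of the change.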
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605155#comment-15605155 ] ASF GitHub Bot commented on TINKERPOP-1525: --- Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/462 Ah. Yea -- huh. Total odd ball case, but yea, you are right. The empty iterator would make it so the end iteration never fires. VOTE +1.
[jira] [Commented] (TINKERPOP-1525) Plug VertexProgram iteration leak on empty Spark RDD partitions
[ https://issues.apache.org/jira/browse/TINKERPOP-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603693#comment-15603693 ] ASF GitHub Bot commented on TINKERPOP-1525: --- GitHub user dalaro opened a pull request: https://github.com/apache/tinkerpop/pull/462

TINKERPOP-1525 Avoid starting VP worker iterations that never end

SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dalaro/incubator-tinkerpop TINKERPOP-1525-tp32

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/462.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #462

commit 1fc6ec5d7c20d6163c61f84198e3e68457e23eae
Author: Dan LaRocque
Date: 2016-10-21T21:04:30Z

    Avoid starting VP worker iterations that never end

    SparkExecutor.executeVertexProgramIteration was written in such a way that an empty RDD partition would cause it to invoke VertexProgram.workerIterationStart without ever invoking VertexProgram.workerIterationEnd. This seems like a contract violation. I have at least one VP that relies on workerIterationStart|End to allocate and release resources. Failing to invoke End like this causes a leak in that VP, as it would for any VP that uses that resource management pattern.
> Plug VertexProgram iteration leak on empty Spark RDD partitions > --- > > Key: TINKERPOP-1525 > URL: https://issues.apache.org/jira/browse/TINKERPOP-1525 > Project: TinkerPop > Issue Type: Bug >Affects Versions: 3.2.3, 3.3.0 >Reporter: Dan LaRocque > > If SparkExecutor gets an RDD with empty partitions, it can invoke > {{VertexProgram.workerIterationStart}} without ever invoking > {{VertexProgram.workerIterationEnd}}. > For vertex programs that allocate and release meaningful resources in the > start/end methods, this can lead to resource leaks. > I already tested a fix that I made against the 3.2 series. I will submit PRs > momentarily. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
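The leak described in this thread can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the actual SparkExecutor code or the patch in the pull request: `WorkerLifecycle`, `RecordingWorker`, `leaky`, `guarded`, and `drain` are hypothetical names, and the sketch assumes (following the comment that the empty iterator keeps the end iteration from firing) that the end callback is triggered lazily while the last element of a partition is processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the start/end half of a vertex program's worker contract. */
interface WorkerLifecycle {
    void workerIterationStart();
    void workerIterationEnd();
}

/** Records calls so that the pairing of start and end can be inspected. */
class RecordingWorker implements WorkerLifecycle {
    final List<String> calls = new ArrayList<>();
    @Override public void workerIterationStart() { calls.add("start"); }
    @Override public void workerIterationEnd() { calls.add("end"); }
}

class EmptyPartitionSketch {

    /** Leaky shape: start fires eagerly, end fires only while handling the last element. */
    static <T> Iterator<T> leaky(Iterator<T> partition, WorkerLifecycle worker) {
        worker.workerIterationStart();
        return new Iterator<T>() {
            @Override public boolean hasNext() { return partition.hasNext(); }
            @Override public T next() {
                T element = partition.next();
                if (!partition.hasNext()) {
                    worker.workerIterationEnd(); // never reached when the partition is empty
                }
                return element;
            }
        };
    }

    /** Guarded shape: an empty partition returns before start is ever invoked. */
    static <T> Iterator<T> guarded(Iterator<T> partition, WorkerLifecycle worker) {
        if (!partition.hasNext()) {
            return Collections.emptyIterator();
        }
        return leaky(partition, worker);
    }

    /** Consumes the iterator fully, as a downstream stage would. */
    static <T> void drain(Iterator<T> it) {
        while (it.hasNext()) {
            it.next();
        }
    }

    public static void main(String[] args) {
        RecordingWorker a = new RecordingWorker();
        drain(leaky(Collections.<String>emptyIterator(), a));
        System.out.println("leaky, empty partition:       " + a.calls); // [start]

        RecordingWorker b = new RecordingWorker();
        drain(guarded(Collections.<String>emptyIterator(), b));
        System.out.println("guarded, empty partition:     " + b.calls); // []

        RecordingWorker c = new RecordingWorker();
        drain(guarded(Arrays.asList("v1", "v2").iterator(), c));
        System.out.println("guarded, non-empty partition: " + c.calls); // [start, end]
    }
}
```

With the guard in place, a worker either sees both callbacks or neither, so a vertex program that allocates in start and releases in end never holds resources for a partition that had nothing to process.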