[
https://issues.apache.org/jira/browse/TINKERPOP-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045307#comment-15045307
]
Dan LaRocque commented on TINKERPOP-1025:
-----------------------------------------
I looked at
https://github.com/apache/incubator-tinkerpop/commit/5c7bc38bdb42ae50243f58a22fc74bc094be6333.
Change-wise, it looks roughly like a superset of
https://github.com/dalaro/incubator-tinkerpop/commit/2d394f978f42ea3ba401ffa1d55c9145f0df274b,
but your approach is superior since it avoids duplicating work in the presence
of mapreduces. I tested both commits and 3.1.0-incubating in the same
scenario. Both commits fix the issue present in 3.1.0-incubating: for certain
RDD implementations, such as UnionRDD, a vertex program emitted graphRDD
unchanged, because SparkGraphComputer relied on in-place writes to RDD data
to persist element compute keys.
I don't have a vote, but I'm in favor.
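To illustrate the in-place write problem described above, here is a minimal,
Spark-free sketch. It is a toy model, not SparkGraphComputer code and not
taken from either commit: LazyDataset, Vertex, writeInPlace and writeViaMap
are hypothetical names. It assumes the failure mode is that the dataset
rebuilds its elements on a later traversal, so writes made to the objects
from an earlier traversal are lost, while a write expressed as a
transformation is kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

public class RecomputedDatasetSketch {

    /** Hypothetical mutable stand-in for a vertex carrying one compute-key value. */
    public static final class Vertex {
        final int id;
        String computeValue; // null until a "vertex program" writes it

        Vertex(int id, String computeValue) {
            this.id = id;
            this.computeValue = computeValue;
        }
    }

    /** Toy lazy dataset: every traversal re-runs the source and yields fresh objects. */
    public static final class LazyDataset {
        private final Supplier<List<Vertex>> source;

        LazyDataset(Supplier<List<Vertex>> source) {
            this.source = source;
        }

        List<Vertex> collect() {
            return source.get();
        }

        /** Returns a new dataset whose elements are the transformed ones. */
        LazyDataset map(UnaryOperator<Vertex> fn) {
            return new LazyDataset(() -> {
                List<Vertex> out = new ArrayList<>();
                for (Vertex v : source.get()) {
                    out.add(fn.apply(v));
                }
                return out;
            });
        }
    }

    /** Builds a three-vertex dataset with no compute values set. */
    public static LazyDataset freshGraph() {
        return new LazyDataset(() -> {
            List<Vertex> vs = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                vs.add(new Vertex(i, null));
            }
            return vs;
        });
    }

    /** In-place style: mutate what one traversal yields, then reuse the same dataset. */
    public static LazyDataset writeInPlace(LazyDataset graph) {
        for (Vertex v : graph.collect()) {
            v.computeValue = "done";
        }
        return graph; // the next traversal rebuilds the elements, so the writes are gone
    }

    /** Transformation style: the write is part of the returned dataset itself. */
    public static LazyDataset writeViaMap(LazyDataset graph) {
        return graph.map(v -> new Vertex(v.id, "done"));
    }

    /** Counts vertices whose compute value is visible on a new traversal. */
    public static long countWritten(LazyDataset graph) {
        return graph.collect().stream().filter(v -> v.computeValue != null).count();
    }

    public static void main(String[] args) {
        System.out.println("in-place: " + countWritten(writeInPlace(freshGraph()))); // in-place: 0
        System.out.println("via map:  " + countWritten(writeViaMap(freshGraph())));  // via map:  3
    }
}
```

In this model the in-place variant hands back a dataset that looks
unchanged, which matches the symptom described for 3.1.0-incubating. The
transformation variant keeps the writes because they are carried by the
returned dataset rather than by objects from an earlier traversal.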
> Solve SparkContext Persistence Issues with BulkLoaderVertexProgram
> ------------------------------------------------------------------
>
> Key: TINKERPOP-1025
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1025
> Project: TinkerPop
> Issue Type: Bug
> Components: hadoop
> Affects Versions: 3.1.0-incubating
> Reporter: Marko A. Rodriguez
> Fix For: 3.1.1-incubating
>
>
> {{BulkLoaderVertexProgramTest}} fails when a persisted {{SparkContext}} is
> used WITH an {{InputRDD}}.
> Weird.
> If you use a persisted context and {{InputFormat}}: good.
> If you use a non-persisted context and {{InputFormat}}: good.
> If you use a non-persisted context and {{InputRDD}}: good.
> If you use a persisted context and {{InputRDD}}: bad.