[
https://issues.apache.org/jira/browse/TINKERPOP-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stephen mallette closed TINKERPOP-2081.
---------------------------------------
Resolution: Fixed
Fix Version/s: 3.3.5, 3.4.0
> PersistedOutputRDD materialises rdd lazily with Spark 2.x
> ---------------------------------------------------------
>
> Key: TINKERPOP-2081
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2081
> Project: TinkerPop
> Issue Type: Bug
> Components: hadoop
> Affects Versions: 3.3.4
> Reporter: Artem Aliev
> Assignee: stephen mallette
> Priority: Major
> Fix For: 3.4.0, 3.3.5
>
>
> PersistedOutputRDD does not actually persist the RDD in Spark memory; it only
> marks it for lazy caching in the future. It looks like caching was eager in
> Spark 1.6, but in Spark 2.0 it is lazy.
> Lazy caching looks wrong for this case: the source graph could be changed
> after the snapshot is created, and the snapshot should not be affected by
> those changes.
> The fix itself is simple: PersistedOutputRDD should call any Spark action to
> trigger eager caching, for example count().
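
To illustrate the problem described above, here is a minimal toy model in plain Python. It does not use Spark or TinkerPop; `LazySnapshot` is a hypothetical stand-in for an RDD that has been marked for caching, and `source` stands in for the source graph. It only sketches why a snapshot that is materialised lazily can pick up later changes, and why running an action such as count() right after marking it avoids that.

```python
class LazySnapshot:
    """Hypothetical stand-in for an RDD marked for caching (not a Spark API)."""

    def __init__(self, compute):
        self._compute = compute      # function that reads the source data
        self._cached = None
        self._materialized = False

    def persist(self):
        # Only marks the snapshot for caching; nothing is computed here.
        return self

    def collect(self):
        # The first action materialises the cache; later actions reuse it.
        if not self._materialized:
            self._cached = list(self._compute())
            self._materialized = True
        return list(self._cached)

    def count(self):
        # Any action triggers materialisation as a side effect.
        return len(self.collect())


source = ["v1", "v2"]  # toy "source graph"

# Marked for caching, but no action is run before the source changes.
lazy = LazySnapshot(lambda: source).persist()

# Marked for caching and forced immediately with an action.
eager = LazySnapshot(lambda: source).persist()
eager.count()

source.append("v3")  # the source changes after the snapshots were "created"

lazy_result = lazy.collect()    # sees the later change
eager_result = eager.collect()  # still reflects the source at snapshot time

print(lazy_result)
print(eager_result)
```

In this toy model the lazily materialised snapshot includes `v3`, while the one forced with count() does not, which matches the behaviour the report asks PersistedOutputRDD to guarantee.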