Artem Aliev created TINKERPOP-2081:
--------------------------------------
Summary: PersistedOutputRDD materialises rdd lazily with Spark 2.x
Key: TINKERPOP-2081
URL: https://issues.apache.org/jira/browse/TINKERPOP-2081
Project: TinkerPop
Issue Type: Bug
Affects Versions: 3.3.4
Reporter: Artem Aliev
PersistedOutputRDD does not actually persist the RDD in Spark memory; it only
marks it for lazy caching at some later point. It looks like caching was eager
in Spark 1.6, but in Spark 2.0 it is lazy.
Lazy caching looks wrong for this case: the source graph could change after the
snapshot is created, and the snapshot should not be affected by those changes.
The fix itself is simple: PersistedOutputRDD should call any Spark action to
trigger eager caching, for example count().
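
To illustrate why the timing matters, here is a minimal plain-Java sketch. It
does not use Spark or TinkerPop: LazyCache, lazySnapshot and eagerSnapshot are
hypothetical names invented for this example, and materialize() only stands in
for the role a Spark action such as count() would play on a persisted RDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class LazySnapshotDemo {

    /** Toy stand-in for a cached dataset: computes on first access, then reuses the result. */
    static final class LazyCache<T> {
        private final Supplier<T> compute;
        private T value;
        private boolean materialized;

        LazyCache(Supplier<T> compute) {
            this.compute = compute;
        }

        /** Plays the role of an action: forces the computation once and keeps the result. */
        T materialize() {
            if (!materialized) {
                value = compute.get();
                materialized = true;
            }
            return value;
        }
    }

    /** Only marks the snapshot for caching; the source is not read yet. */
    static LazyCache<List<String>> lazySnapshot(List<String> source) {
        return new LazyCache<>(() -> new ArrayList<>(source));
    }

    /** Marks the snapshot for caching and materializes it immediately. */
    static LazyCache<List<String>> eagerSnapshot(List<String> source) {
        LazyCache<List<String>> cache = lazySnapshot(source);
        cache.materialize();
        return cache;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("v1", "v2"));
        LazyCache<List<String>> lazy = lazySnapshot(source);
        LazyCache<List<String>> eager = eagerSnapshot(source);

        // The source changes after both snapshots were "created".
        source.add("v3");

        System.out.println("lazy snapshot size: " + lazy.materialize().size());   // 3
        System.out.println("eager snapshot size: " + eager.materialize().size()); // 2
    }
}
```

In this toy model the lazy snapshot picks up the later change, while the eager
one keeps the state from the moment it was created, which is the behaviour the
report expects from PersistedOutputRDD.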