Marek Simunek created BEAM-6053:
-----------------------------------

             Summary: Add option to disable caching in Spark
                 Key: BEAM-6053
                 URL: https://issues.apache.org/jira/browse/BEAM-6053
             Project: Beam
          Issue Type: Improvement
          Components: runner-spark
    Affects Versions: 2.9.0
            Reporter: Marek Simunek
            Assignee: Amit Sela


Add an option to SparkOptions to turn off Spark RDD caching. There are use 
cases where it's faster to recompute the whole RDD than to serialize it, store 
it, and later read and deserialize it from the store.

We probably don't want to maintain a list of `PCollections` that should not be 
cached, because that would be tailored to a specific runner and would go 
against Beam's concepts. So I propose a single option that turns off caching 
for the whole pipeline.
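
To illustrate the proposed behaviour, here is a minimal plain-Java sketch of a 
pipeline-wide switch. It does not use Beam or Spark: `CacheToggleSketch`, 
`Dataset`, `cacheDisabled` and `computeCount` are hypothetical names chosen 
for this example only, and the in-memory field merely stands in for RDD 
persistence.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of a pipeline-wide "disable caching" switch; not the Beam or Spark API. */
public class CacheToggleSketch {

    /** Stand-in for a dataset that is read by more than one downstream consumer. */
    static final class Dataset<T> {
        private final Supplier<T> compute;   // the (possibly cheap) computation
        private final boolean cacheDisabled; // hypothetical pipeline-wide flag
        private T cached;
        private boolean hasCached;
        int computeCount = 0;                // how many times compute actually ran

        Dataset(Supplier<T> compute, boolean cacheDisabled) {
            this.compute = compute;
            this.cacheDisabled = cacheDisabled;
        }

        T read() {
            if (cacheDisabled) {
                // Caching is off: recompute on every read, never store the result.
                computeCount++;
                return compute.get();
            }
            if (!hasCached) {
                // Caching is on: compute once and keep the result for later reads.
                computeCount++;
                cached = compute.get();
                hasCached = true;
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        Dataset<Integer> withCache = new Dataset<>(() -> 6 * 7, false);
        withCache.read();
        withCache.read();

        Dataset<Integer> withoutCache = new Dataset<>(() -> 6 * 7, true);
        withoutCache.read();
        withoutCache.read();

        System.out.println("cache enabled, computations:  " + withCache.computeCount);
        System.out.println("cache disabled, computations: " + withoutCache.computeCount);
    }
}
```

The flag is set once for the whole pipeline and applies to every dataset, so 
no per-`PCollection` list is needed.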



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
