[ 
https://issues.apache.org/jira/browse/SPARK-6003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-6003.
------------------------------
    Resolution: Duplicate

I suggest that this is close enough to SPARK-5140 to be merged into it. 
Both are about blocking until an RDD is persisted, and what the semantics 
of that should be.

> Spark should offer a "sync" method that guarantees that RDDs are eagerly 
> evaluated and persisted
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-6003
>                 URL: https://issues.apache.org/jira/browse/SPARK-6003
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.2.1
>            Reporter: Derrick Burns
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> This may already exist, but I could not find it.
> One of the challenges in developing RELIABLE Spark applications is dealing 
> with the elegant lazy evaluation semantics of RDD transformations.  It would 
> be useful to have an action with no output whose side effect is to ensure that 
> the RDD is eagerly evaluated and persisted according to whatever persistence 
> level is set for the RDD.
> Calling RDD.count() or any other action might do the trick -- and indeed I 
> have tried this -- however, it may be the case that RDD.count() does NOT 
> persist the data. 
> For example, MappedRDD(x:RDD).count() === x.count(), so it is possible to 
> implement count without persisting the result of MappedRDD(x).  Without 
> looking at the code, one cannot know whether an operation is eagerly 
> evaluated AND persisted or not.  Having a standard Spark primitive that both 
> eagerly evaluates an RDD and persists it according to its current 
> persistence level would be very useful.
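
The concern in the quoted description can be illustrated with a small toy model. This is a sketch in plain Python, not Spark code: `LazyDataset`, its `size_hint` shortcut, and the `sync` method are all hypothetical names invented for the illustration, and the sketch makes no claim about how Spark's own count() is implemented. It only shows why an action that can be answered without computing the data gives no guarantee that the data was persisted, and what an explicit "evaluate and persist" primitive would guarantee instead.

```python
from typing import Callable, List, Optional


class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark's API)."""

    def __init__(self,
                 compute: Callable[[], List[int]],
                 size_hint: Optional[Callable[[], int]] = None):
        self._compute = compute          # recipe that produces the data on demand
        self._size_hint = size_hint      # optional shortcut for count()
        self._persist_requested = False
        self.cached: Optional[List[int]] = None

    def persist(self) -> "LazyDataset":
        # Only records the intent; nothing is computed here.
        self._persist_requested = True
        return self

    def _materialize(self) -> List[int]:
        if self.cached is not None:
            return self.cached
        data = self._compute()
        if self._persist_requested:
            self.cached = data
        return data

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # A map keeps the element count, so the child can delegate count()
        # to its parent without ever running fn.
        return LazyDataset(lambda: [fn(x) for x in self._materialize()],
                           size_hint=self.count)

    def count(self) -> int:
        if self._size_hint is not None:
            return self._size_hint()     # answered without materializing
        return len(self._materialize())

    def sync(self) -> "LazyDataset":
        # Hypothetical primitive: always evaluate, persisting if requested.
        self._materialize()
        return self


base = LazyDataset(lambda: [1, 2, 3])
mapped = base.map(lambda x: x * 10).persist()

n = mapped.count()                 # answered via the parent; map never runs
cached_after_count = mapped.cached # still None in this model

mapped.sync()                      # forces evaluation and caching
cached_after_sync = mapped.cached
```

In this model, `count()` returns the right answer while leaving the cache empty, which is exactly the ambiguity the reporter describes; `sync()` removes it by making evaluation and persistence the documented contract rather than an implementation detail.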



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
