shahabm wrote
> I noticed that rdd.cache() is not happening immediately rather due to lazy
> feature of Spark, it is happening just at the moment you perform some
> map/reduce actions. Is this true?
Yes. .cache() only marks the RDD for persistence; like a transformation, it is evaluated lazily, so nothing is actually cached until an action runs.

shahabm wrote
> If this is the case, how can I enforce Spark to cache immediately at its
> cache() statement? I need this to perform some benchmarking and I need to
> separate rdd caching and rdd transformation/action processing time.

Put an action immediately after .cache(). .cache().first() is low impact, since it only returns the first element rather than iterating, but for that same reason it may only materialize (and cache) the first partition. For benchmarking, where you want the whole RDD in the cache before timing anything else, .cache().count() is the safer choice: count() visits every partition and forces the entire RDD to be computed and cached.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-enforce-RDD-to-be-cached-tp20230p20284.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
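To illustrate, here is a minimal PySpark sketch of the pattern above. This is an assumption-laden example, not from the thread: the app name, the input path "data.txt", and the word-length job are placeholders; it assumes a local Spark installation.

```python
import time
from pyspark import SparkContext

# Placeholder setup; any existing SparkContext and input path would do.
sc = SparkContext(appName="cache-benchmark")
rdd = sc.textFile("data.txt").cache()  # only marks the RDD for caching; lazy

# Phase 1: force materialization. count() touches every partition,
# so the whole RDD ends up in the cache, unlike first().
t0 = time.time()
n = rdd.count()
print("caching %d records took %.2fs" % (n, time.time() - t0))

# Phase 2: the timed workload now reads from the cache,
# so this measures processing time separately from caching time.
t0 = time.time()
total_chars = rdd.map(lambda line: len(line)).reduce(lambda a, b: a + b)
print("processing took %.2fs" % (time.time() - t0))

sc.stop()
```

The two timed phases are exactly the separation the original poster asked for: the first measures the cost of caching, the second the cost of the map/reduce work against already-cached data.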