Thanks, everyone, for the help. Following Earthson's tip, I resolved the problem. One thing to report: if the RDD has already been materialized by an action and you only call checkpoint on it afterwards, the checkpoint is never performed, even if you force another evaluation.

    newRdd = oldRdd.map(myFun).persist(myStorageLevel)
    newRdd.foreach(x => myFunLogic(x)) // Here materialized for other reasons
    ...
    if (condition) { // only afterwards do I want to checkpoint
      newRdd.checkpoint
      newRdd.isCheckpointed   // false here
      newRdd.foreach(x => {}) // Force evaluation
      newRdd.isCheckpointed   // still false here
    }
    oldRdd.unpersist(true)

2014-05-06 3:35 GMT+02:00 Earthson <earthson...@gmail.com>:

> checkpoint seems to be just add a CheckPoint mark? You need an action after
> marked it. I have tried it with success:)
>
>     newRdd = oldRdd.map(myFun).persist(myStorageLevel)
>     newRdd.checkpoint // <<checkpoint here
>     newRdd.isCheckpointed // false here
>     newRdd.foreach(x => {}) // Force evaluation
>     newRdd.isCheckpointed // true here
>     oldRdd.unpersist(true)
>
> ~~~~~~~~
>
> If you have new broadcast object for each step of iteration, broadcast will
> eat up all of the memory. You may need to set "spark.cleaner.ttl" to a small
> enough value.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Incredible-slow-iterative-computation-tp4204p5407.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
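
For anyone who finds this thread later, here is a minimal self-contained sketch of the ordering that worked: mark the RDD with checkpoint first, then run the first action on it. The local master, the checkpoint directory /tmp/spark-checkpoints, the object name, the sample data and the map function are all placeholders I made up for illustration, not my real job. It also assumes a checkpoint directory must be set on the SparkContext before checkpoint is called.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical stand-alone driver; names and data are illustrative only.
object CheckpointOrderSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("checkpoint-order-sketch")
      .setMaster("local[2]") // assumption: local run for testing
    val sc = new SparkContext(conf)

    // Assumed path; on a cluster this should be a shared location such as HDFS.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val oldRdd = sc.parallelize(1 to 1000).persist(StorageLevel.MEMORY_ONLY)
    val newRdd = oldRdd.map(_ * 2).persist(StorageLevel.MEMORY_ONLY)

    // Mark for checkpointing BEFORE any action has run on newRdd.
    newRdd.checkpoint()
    println(s"after mark:   ${newRdd.isCheckpointed}")

    // The first action materializes the RDD and triggers the checkpoint write.
    newRdd.foreach(_ => ())
    println(s"after action: ${newRdd.isCheckpointed}")

    // The parent can be released once the child is cached and checkpointed.
    oldRdd.unpersist(true)
    sc.stop()
  }
}
```

In Earthson's run the flag was false after the mark and true after the action. If an action such as foreach is moved above the checkpoint() call, you get the situation I described, where the flag stays false.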