Yes. Just to add, consider the following RDD lineage scenario:
RDD1 -> RDD2 -> RDD3 -> RDD4
Here RDD2 depends on RDD1's output, and the lineage continues through to RDD4.
Now, if RDD3 is lost for some reason, Spark will recompute it from RDD2 (or from further back in the lineage if RDD2's data is no longer available).
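
To make the recovery step concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: the `ToyRDD` class and its `map`, `lineage`, and `compute` methods are hypothetical names made up for this illustration. It also assumes every intermediate result stays in memory, roughly as if each RDD were cached; in Spark, a parent that was not persisted would itself have to be recomputed.

```python
# Toy model of RDD lineage. This is NOT Spark's API, only an illustration.

class ToyRDD:
    """A dataset that remembers its parent and the function that derives it."""

    def __init__(self, name, parent=None, fn=None, source=None):
        self.name = name
        self.parent = parent    # the RDD this one depends on (None for the root)
        self.fn = fn            # transformation applied to the parent's data
        self.source = source    # input data, only set on the root
        self.data = None        # materialized result; None if lost or not yet computed

    def map(self, name, fn):
        """Create a child RDD that applies fn to every element of this one."""
        return ToyRDD(name, parent=self, fn=lambda rows: [fn(x) for x in rows])

    def lineage(self):
        """Return the chain of names from the root down to this RDD."""
        chain = [] if self.parent is None else self.parent.lineage()
        return chain + [self.name]

    def compute(self, log):
        """Return the data, rebuilding only what is missing.

        Every RDD that actually has to be (re)computed is appended to log.
        """
        if self.data is None:
            log.append(self.name)
            if self.parent is None:
                self.data = list(self.source)
            else:
                self.data = self.fn(self.parent.compute(log))
        return self.data


rdd1 = ToyRDD("RDD1", source=[1, 2, 3])
rdd2 = rdd1.map("RDD2", lambda x: x * 10)
rdd3 = rdd2.map("RDD3", lambda x: x + 1)
rdd4 = rdd3.map("RDD4", lambda x: x * 2)

first_run = []
result = rdd4.compute(first_run)   # materializes the whole chain

rdd3.data = None                   # simulate losing RDD3
recomputed = []
rdd3.compute(recomputed)           # rebuilds RDD3 from RDD2's retained data

print(" -> ".join(rdd4.lineage()))
print("recomputed after loss:", recomputed)
```

In Spark itself, calling `toDebugString` on an RDD prints its lineage, which is a handy way to inspect the real dependency chain.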
Thanks
Best Regards
On Thu, Jul 9, 2015 at 5:51 AM, canan
Lots of places refer to RDD lineage; I'd like to know what it refers to
exactly. My understanding is that it means the RDD dependencies and the
intermediate MapOutput info in MapOutputTracker. Correct me if I am wrong.
Thanks