Yes. Just to add, consider the following RDD lineage scenario:

RDD1 -> RDD2 -> RDD3 -> RDD4


Here RDD2 depends on RDD1's output, RDD3 on RDD2's, and so on down to RDD4.

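Now suppose RDD3 is lost for some reason (for example, an executor holding
its partitions dies). Spark will recompute the lost partitions of RDD3 from
RDD2, provided RDD2 is still available (e.g. cached); otherwise it walks
further back along the lineage.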
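
To make the recovery idea concrete, here is a rough toy model in plain
Python. It is only a sketch and not Spark's API: the ToyRDD class and its
compute/lose/lineage methods are names I made up for illustration, and it
assumes every computed result stays in memory (as if each RDD were cached)
until it is explicitly lost.

```python
class ToyRDD:
    """Hypothetical stand-in for an RDD: it remembers its parent and the
    function that derives its data from the parent's data."""

    def __init__(self, name, parent=None, transform=None, source=None):
        self.name = name
        self.parent = parent        # upstream ToyRDD, or None for the root
        self.transform = transform  # per-element function applied to parent data
        self.source = source        # input data, used only by the root
        self.data = None            # materialized result; None means lost or not computed
        self.compute_count = 0      # how many times this node was (re)computed

    def compute(self):
        # Reuse the materialized result if it is still available.
        if self.data is not None:
            return self.data
        # Otherwise rebuild from the nearest available ancestor.
        if self.parent is None:
            self.data = list(self.source)
        else:
            self.data = [self.transform(x) for x in self.parent.compute()]
        self.compute_count += 1
        return self.data

    def lose(self):
        # Simulate losing the materialized data (e.g. a node failure).
        self.data = None

    def lineage(self):
        # Walk the parent pointers back to the root.
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return " -> ".join(reversed(chain))


rdd1 = ToyRDD("RDD1", source=[1, 2, 3])
rdd2 = ToyRDD("RDD2", parent=rdd1, transform=lambda x: x * 10)
rdd3 = ToyRDD("RDD3", parent=rdd2, transform=lambda x: x + 1)
rdd4 = ToyRDD("RDD4", parent=rdd3, transform=lambda x: x * 2)

result_before = rdd4.compute()  # computes RDD1..RDD4 once each

# RDD3 is lost, and so is RDD4, which was derived from it.
rdd3.lose()
rdd4.lose()

result_after = rdd4.compute()   # rebuilds RDD3 from RDD2, then RDD4 from RDD3
print(rdd4.lineage())
print(rdd1.compute_count, rdd2.compute_count, rdd3.compute_count, rdd4.compute_count)
```

In this toy, RDD1 and RDD2 are not recomputed because their data is still
around. In real Spark an intermediate RDD is normally not kept unless you
persist it, so recovery may go further back than the immediate parent.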

Thanks
Best Regards

On Thu, Jul 9, 2015 at 5:51 AM, canan chen <ccn...@gmail.com> wrote:

> Lots of places refer to RDD lineage; I'd like to know what it refers to
> exactly.  My understanding is that it means the RDD dependencies and the
> intermediate MapOutput info in MapOutputTracker.  Correct me if I am wrong.
> Thanks
>
>
>
