Clarification on Spark code comments

Neerav Kumar Tue, 12 May 2020 19:08:18 -0700

Hi

I am new to the community, so pardon me if my question is framed incorrectly. I 
was going through the Spark code base on GitHub and am confused by a comment 
there. In the file 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/util/PeriodicRDDCheckpointer.scala
the comment says:
Example usage:
* {{{
* val (rdd1, rdd2, rdd3, ...) = ...
* val cp = new PeriodicRDDCheckpointer(2, sc)
* cp.update(rdd1)
* rdd1.count();
* // persisted: rdd1
* cp.update(rdd2)
* rdd2.count();
* // persisted: rdd1, rdd2
* // checkpointed: rdd2
* cp.update(rdd3)
* rdd3.count();
* // persisted: rdd1, rdd2, rdd3
* // checkpointed: [crossed out: rdd2] rdd3
* cp.update(rdd4)
* rdd4.count();
* // persisted: rdd2, rdd3, rdd4
* // checkpointed: rdd4
* cp.update(rdd5)
* rdd5.count();
* // persisted: rdd3, rdd4, rdd5
* // checkpointed: [crossed out: rdd4] rdd5
* }}}


The checkpointed values shown after rdd3.count() and rdd5.count() do not make 
sense to me. I have crossed out the existing value and included the one I think 
makes sense. Is my understanding incorrect, or is this a bug in the 
documentation?
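
One way to trace the example by hand is with a toy model. The sketch below is 
not Spark's implementation and does not call Spark at all. It tracks names 
instead of RDDs, and its two rules are assumptions inferred from the quoted 
comment: at most 3 items stay persisted, and every 2nd update becomes the 
latest checkpoint. ToyCheckpointer and its parameters are hypothetical names.

```scala
import scala.collection.mutable

// Toy model only: tracks names instead of real RDDs and never touches Spark.
// Assumed rules (hypothetical, inferred from the quoted comment):
//   - keep at most `window` most recent items persisted
//   - mark every `interval`-th update as the latest checkpoint
final class ToyCheckpointer(interval: Int, window: Int = 3) {
  private val persisted = mutable.Queue.empty[String]
  private var latestCheckpoint: Option[String] = None
  private var updateCount = 0

  def update(name: String): Unit = {
    persisted.enqueue(name)
    // Drop the oldest entries once the window is exceeded.
    while (persisted.size > window) persisted.dequeue()
    updateCount += 1
    // Only every `interval`-th update replaces the checkpoint.
    if (updateCount % interval == 0) latestCheckpoint = Some(name)
  }

  def state: (List[String], Option[String]) =
    (persisted.toList, latestCheckpoint)
}

object ToyCheckpointerDemo {
  def main(args: Array[String]): Unit = {
    val cp = new ToyCheckpointer(interval = 2)
    for (i <- 1 to 5) {
      cp.update(s"rdd$i")
      val (p, c) = cp.state
      println(s"after rdd$i: persisted=${p.mkString(", ")} " +
        s"checkpointed=${c.getOrElse("none")}")
    }
  }
}
```

Under these assumed rules, the third and fifth updates do not trigger a new 
checkpoint, so the model still reports rdd2 and rdd4 at those steps. Whether 
Spark behaves the same way depends on the actual update() logic in the source.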

Regards
Neerav
