Hi,

I have been thinking about when RDD replication is actually necessary. One
case might be a large number of threads that all need the same RDD. Although
a single RDD can be shared by multiple threads within the same application,
I believe we could get better parallelism if the RDD were replicated. Am I
right?

I am eager to know whether there are any real-life applications or other
scenarios that require an RDD to be replicated. Could someone please shed some
light on the necessity of RDD replication?

Thank you
