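RDDs don't *need* replication, but it does no harm if the underlying storage is replicated. The two work at different layers: lineage lets Spark recompute a lost partition, but only as long as the source data can still be read, and HDFS replication is what keeps that source data available.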
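To make the layering concrete, here is a toy model in plain Python. It is only a sketch and not Spark's or HDFS's API: `ReplicatedStore`, `ToyRDD`, `fail_node` and `lose_partition` are made-up names for illustration. The store keeps several copies of each block, while the RDD-like object keeps only the recipe for computing each partition.

```python
class ReplicatedStore:
    """Stands in for HDFS: every block is kept on several 'nodes'."""

    def __init__(self, blocks, replication=3):
        # One full copy of the blocks per simulated node.
        self.replicas = [list(blocks) for _ in range(replication)]

    def fail_node(self, node):
        # Simulate losing one node and its copy of the data.
        self.replicas[node] = None

    def read_block(self, i):
        # Read from the first replica that is still alive.
        for replica in self.replicas:
            if replica is not None:
                return replica[i]
        raise IOError("all replicas lost")


class ToyRDD:
    """Stands in for an RDD: remembers how to compute each partition."""

    def __init__(self, compute, num_partitions):
        self.compute = compute          # the lineage: partition index -> data
        self.num_partitions = num_partitions
        self.cache = {}                 # computed partitions, not replicated

    def map(self, fn):
        parent = self
        return ToyRDD(lambda i: [fn(x) for x in parent.partition(i)],
                      self.num_partitions)

    def partition(self, i):
        # Recompute from lineage whenever the partition is missing.
        if i not in self.cache:
            self.cache[i] = self.compute(i)
        return self.cache[i]

    def lose_partition(self, i):
        # Simulate an executor failure that drops one computed partition.
        self.cache.pop(i, None)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


store = ReplicatedStore([[1, 2], [3, 4]], replication=3)
source = ToyRDD(store.read_block, num_partitions=2)
doubled = source.map(lambda x: x * 2)

first = doubled.collect()

# Lose a storage node and the computed partition at the same time.
store.fail_node(0)
doubled.lose_partition(1)
source.lose_partition(1)

second = doubled.collect()  # rebuilt via lineage from a surviving replica
print(first, second)
```

In this model the computed partitions are never copied; they are rebuilt on demand. The rebuild succeeds only because the store still has a live replica of the input block, which is the sense in which the two mechanisms complement each other.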
On Mon, Aug 4, 2014 at 5:51 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
> Hi,
> Spark can run on top of HDFS.
> While Spark talks about the RDDs which do not need replication because the
> partitions can be built with the help of lineage. But, HDFS inherently has
> replication. How do these two concepts go together?
> Thank You