You may start from here:
<https://github.com/apache/spark/blob/4fa2fda88fc7beebb579ba808e400113b512533b/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L706-L712>
On Mon, Aug 25, 2014 at 9:05 PM, rapelly kartheek <kartheek.m...@gmail.com> wrote:

> Hi,
>
> I've exercised multiple options available for persist() including RDD
> replication. I have gone thru the classes that involve in caching/storing
> the RDDS at different levels. StorageLevel class plays a pivotal role by
> recording whether to use memory or disk or to replicate the RDD on multiple
> nodes.
> The class LocationIterator iterates over the preferred machines one by
> one for each partition that is replicated. I got a rough idea of
> CoalescedRDD. Please correct me if I am wrong.
>
> But I am looking for the code that chooses the resources to replicate the
> RDDs. Can someone please tell me how replication takes place and how do we
> choose the resources for replication. I just want to know as to where
> should I look into to understand how the replication happens.
>
> Thank you so much!!!
>
> regards
>
> -Karthik