It uses HDFS locality: the workers pull the data themselves from HDFS, a queue, etc., so tasks run close to where the blocks live. The exception is parallelize, where the data already sits in the driver (typically on the master) and is shipped out to the worker nodes.
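
Roughly, the two cases look like this (a minimal sketch; the HDFS path and partition count are just placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object DistributionExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("distribution-example"))

    // Case 1: reading from HDFS. Each task reads the file blocks that are
    // (ideally) stored on its own node, so the data never passes through
    // the driver. The path here is only a placeholder.
    val fromHdfs = sc.textFile("hdfs:///data/events.log")
    println(fromHdfs.count())

    // Case 2: parallelize. The collection lives in the driver JVM and its
    // partitions are shipped to the executors along with the tasks.
    val fromDriver = sc.parallelize(1 to 1000000, 8)
    println(fromDriver.sum())

    sc.stop()
  }
}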

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Tue, Jun 24, 2014 at 11:51 AM, srujana <srujana...@persistent.co.in>
wrote:

> Hi,
>
> I am working on an auto-scaling Spark cluster. I would like to know in detail how the master
> distributes the data to the slaves for processing.
>
> Any information on this would be helpful.
>
> Thanks,
> Srujana
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-data-is-distributed-while-processing-in-spark-cluster-tp8160.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
