I have a set of CSV files that I want to read as a single RDD using a
standalone cluster.

These files reside on one machine right now. If I start a cluster with
multiple worker nodes, how do I use these worker nodes to read the files
and do the RDD computation? Do I have to copy the files to every worker
node?

Assume that copying these into HDFS is not an option for now.

Thanks,
