How to use cluster for large set of linux files

Manoj Samel Wed, 22 Jan 2014 12:38:58 -0800

I have a set of CSV files that I want to read as a single RDD using a
standalone cluster.


These files currently reside on one machine. If I start a cluster with
multiple worker nodes, how do I use those worker nodes to read the files
and do the RDD computation? Do I have to copy the files to every worker
node?

Assume that copying them into HDFS is not an option for now.

Thanks,
