What is a YARN cluster?

And does Spark necessarily need Hadoop already installed on the cluster? For 
example, can one download Spark and run it on a bunch of nodes with no prior 
installation of Hadoop?

Thanks,


From: Yi Tian [mailto:[email protected]]
Sent: Sunday, October 26, 2014 9:08 PM
To: Pagliari, Roberto
Cc: [email protected]
Subject: Re: Spark SQL configuration

You can set `HADOOP_CONF_DIR=your_hadoop_conf_path` in `conf/spark-env.sh` to:

1. connect to your YARN cluster
2. set `hdfs` as the default FileSystem; otherwise you have to write `hdfs://` 
before every path you use, like: `val input = 
sc.textFile("hdfs://user/spark/test.dat")`
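For reference, a minimal `conf/spark-env.sh` sketch; the path below is an assumption (a common default on many Hadoop distributions), so point it at whichever directory actually holds your `core-site.xml` and `yarn-site.xml`:

```shell
# conf/spark-env.sh
# Tell Spark where to find the Hadoop client configuration.
# /etc/hadoop/conf is an assumed path -- adjust for your install.
export HADOOP_CONF_DIR=/etc/hadoop/conf
```

With this in place, `spark-submit --master yarn` can locate the ResourceManager, and unqualified paths resolve against the `fs.defaultFS` defined in your Hadoop configs.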


Best Regards,

Yi Tian
[email protected]



On Oct 27, 2014, at 07:59, Pagliari, Roberto 
<[email protected]> wrote:

I’m a newbie with Spark. After installing it on all the machines I want to use, 
do I need to tell it about the Hadoop configuration, or will it be able to find 
it by itself?

Thank you,
