Hello, I am currently learning Apache Spark and I want to see how it integrates with an existing Hadoop cluster.

My Hadoop installation is version 2.2.0, running without YARN. I built Apache Spark (v1.0.0) following the instructions in the README, setting only SPARK_HADOOP_VERSION=1.2.1, and I exported HADOOP_CONF_DIR to point to Hadoop's configuration directory.

My use case is the linear least squares regression example from the MLlib documentation (link: http://spark.apache.org/docs/latest/mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression). The only change to the code is that I read the input text file from HDFS; a sketch of what I run is included below. However, when I run it I get a RuntimeException: "Error in configuring object."

So my questions are the following: Does Spark work with a Hadoop distribution that does not use YARN? If yes, am I doing this correctly? If not, can I build Spark with SPARK_HADOOP_VERSION=2.2.0 and SPARK_YARN=false?
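For reference, here is roughly what I am running in spark-shell. It is the example from the MLlib docs with the input path swapped for an HDFS URI; the namenode host, port, and file path below are placeholders for my actual setup:

```scala
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}
import org.apache.spark.mllib.linalg.Vectors

// Load and parse the data from HDFS instead of the local file used in the docs.
// "namenode:9000" and the file path are placeholders for my actual cluster.
val data = sc.textFile("hdfs://namenode:9000/user/nick/lpsa.data")
val parsedData = data.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
}

// Build the model with 100 iterations of SGD, as in the documented example.
val numIterations = 100
val model = LinearRegressionWithSGD.train(parsedData, numIterations)

// Evaluate the model on the training data and compute the mean squared error.
val valuesAndPreds = parsedData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
val MSE = valuesAndPreds.map { case (v, p) => math.pow(v - p, 2) }.mean()
println("training Mean Squared Error = " + MSE)
```

Thank you, Nick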