Does this terminate the execution, or does the Spark application still run after
this error?

One thing is for sure: it is looking for a local file on the driver (i.e., your Mac) at
location: file:/Users/jgp/Documents/Data/restaurants-data.json
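
To illustrate the point with plain Java (the hdfs:// URL below is hypothetical): Spark picks a filesystem implementation from the URI scheme, and a bare file: path is resolved on whichever JVM opens it — here, your driver on the Mac, not the executors on the Ubuntu box.

```java
import java.net.URI;

public class PathScheme {
    // Returns the URI scheme, which is what Spark's data source layer
    // uses to decide which filesystem to ask for the file.
    static String schemeOf(String path) {
        return URI.create(path).getScheme();
    }

    public static void main(String[] args) {
        // A file: path is resolved on the JVM that opens it -- the driver.
        System.out.println(schemeOf("file:/Users/jgp/Documents/Data/restaurants-data.json"));
        // An hdfs: path (hypothetical namenode URL) is resolved by the cluster instead.
        System.out.println(schemeOf("hdfs://10.0.100.120:9000/data/restaurants-data.json"));
    }
}
```

So for the job to run on the cluster, the path either has to be visible from every node (shared filesystem, HDFS, etc.) or the file has to be shipped along with the job.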

On Mon, Jul 11, 2016 at 12:33 PM, Jean Georges Perrin <j...@jgp.net> wrote:

>
> I have my dev environment on my Mac. I have a dev Spark server on a
> freshly installed physical Ubuntu box.
>
> I had some connection issues, but it is now all fine.
>
> In my code, running on the Mac, I have:
>
> 1 SparkConf conf = new SparkConf().setAppName("myapp").setMaster("spark://10.0.100.120:7077");
> 2 JavaSparkContext javaSparkContext = new JavaSparkContext(conf);
> 3 javaSparkContext.setLogLevel("WARN");
> 4 SQLContext sqlContext = new SQLContext(javaSparkContext);
> 5
> 6 // Restaurant Data
> 7 df = sqlContext.read().option("dateFormat", "yyyy-mm-dd").json(source.getLocalStorage());
>
>
> 1) Clarification question: this code runs on my Mac and connects to the
> server, but line #7 assumes the file is on my Mac, not on the server, right?
>
> 2) On line 7, I get an exception:
>
> 16-07-10 22:20:04:143 DEBUG  - address: jgp-MacBook-Air.local/10.0.100.100
> isLoopbackAddress: false, with host 10.0.100.100 jgp-MacBook-Air.local
> 16-07-10 22:20:04:240 INFO
> org.apache.spark.sql.execution.datasources.json.JSONRelation - Listing
> file:/Users/jgp/Documents/Data/restaurants-data.json on driver
> 16-07-10 22:20:04:288 DEBUG org.apache.hadoop.util.Shell - Failed to
> detect a valid hadoop home directory
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
> at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:225)
> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:250)
> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(
> FileInputFormat.java:447)
> at org.apache.spark.sql.execution.datasources.json.JSONRelation.org
> $apache$spark$sql$execution$datasources$json$JSONRelation$$createBaseRdd(JSONRelation.scala:98)
> at
> org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4$$anonfun$apply$1.apply(JSONRelation.scala:115)
> at
> org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4$$anonfun$apply$1.apply(JSONRelation.scala:115)
> at scala.Option.getOrElse(Option.scala:120)
> at
> org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4.apply(JSONRelation.scala:115)
> at
> org.apache.spark.sql.execution.datasources.json.JSONRelation$$anonfun$4.apply(JSONRelation.scala:109)
> at scala.Option.getOrElse(Option.scala:120)
>
> Do I have to install Hadoop on the server? I am guessing that from:
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
>
> TIA,
>
> jg
>
>
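
On the HADOOP_HOME message: a minimal sketch of one workaround (the directory is a hypothetical choice; this only satisfies Hadoop's home-directory check in the driver JVM, it does not install Hadoop):

```java
public class HadoopHomeWorkaround {
    public static void main(String[] args) {
        // Any existing absolute directory satisfies Hadoop's Shell check on
        // Linux/macOS; "/tmp" is just an illustrative choice. This must run
        // before any Hadoop class is loaded, i.e. before building SparkConf.
        System.setProperty("hadoop.home.dir", "/tmp");
        System.out.println(System.getProperty("hadoop.home.dir"));
    }
}
```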


-- 
Best Regards,
Ayan Guha
