Hi, Michael. I have the same problem. My warehouse directory is always
created locally. I copied the default hive-site.xml into the
$SPARK_HOME/conf directory on each node. After I executed the code below,
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.hql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
hiveContext.hql("LOAD DATA LOCAL INPATH '/extdisk2/tools/spark/examples/src/main/resources/kv1.txt' INTO TABLE src")
hiveContext.hql("FROM src SELECT key, value").collect()
I got the exception below:
java.io.FileNotFoundException: File file:/user/hive/warehouse/src/kv1.txt does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:763)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:106)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
In the end, I found that /user/hive/warehouse/src/kv1.txt had been created
on the node where I started spark-shell.
I am using the pre-built Spark 1.0.1 for Hadoop 2.
Thanks in advance.
Michael Armbrust wrote
> The warehouse and the metastore directories are two different things. The
> metastore holds the schema information about the tables and will by default
> be a local directory. With javax.jdo.option.ConnectionURL you can
> configure it to be something like MySQL. The warehouse directory is the
> default location where the actual contents of the tables are stored. What
> directory are you seeing created locally?
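
For reference, the two settings mentioned in the quoted reply could look
roughly like this in hive-site.xml. This is only a sketch: the property names
are standard Hive ones, but the host names, ports, and database name are
placeholders I made up, not values from this cluster. The file: prefix in the
exception suggests the warehouse path is being resolved against the local
filesystem, so a fully qualified hdfs:// URI is one thing that may be worth
trying.

```xml
<configuration>
  <!-- Where the table data is stored. A fully qualified URI keeps the path
       from being resolved against the local filesystem.
       Host and port are placeholders (assumption). -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://namenode.example.com:8020/user/hive/warehouse</value>
  </property>
  <!-- Where the metastore (schema information) lives.
       Host, port, and database name are placeholders (assumption). -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore</value>
  </property>
</configuration>
```

The same file would need to be in $SPARK_HOME/conf on the machine where
spark-shell is started for these values to be picked up.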
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/HiveContext-is-creating-metastore-warehouse-locally-instead-of-in-hdfs-tp10838p11024.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.