Hi list,

I tried running the following sample Java code with Spark 2.0.0 on YARN (HDP-2.5.0.0):
import org.apache.spark.sql.SparkSession;

public class SparkSQLTest {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("yarn")
                .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
                .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
                .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
                .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
                .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
                .enableHiveSupport()
                .getOrCreate();
        sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
    }
}

It fails with the following error, which points to the local file system (file:/home/hive/spark-warehouse) even though spark.sql.warehouse.dir is set to an HDFS path. I am wondering where that location is being picked up from:

16:08:21.321 [dispatcher-event-loop-7] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
16:08:21.326 [main] INFO org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
16:08:21.451 [main] INFO org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
Exception in
thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
16:08:21.454 [pool-8-thread-1] INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
16:08:21.455 [pool-8-thread-1] DEBUG
org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by

Thanks,
-Nirmal
--------------------------------------------------------------------- To unsubscribe e-mail: user-unsubscr...@spark.apache.org