Hi list,

I tried running the following sample Java code with Spark 2.0.0 on YARN (HDP-2.5.0.0):
import org.apache.spark.sql.SparkSession;

public class SparkSQLTest {
    public static void main(String[] args) {
        SparkSession sparkSession = SparkSession.builder()
                .master("yarn")
                .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
                .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
                .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
                .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
                .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
                .enableHiveSupport()
                .getOrCreate();
        sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
    }
}

It fails with the following error, which points to the local file system (file:/home/hive/spark-warehouse) even though spark.sql.warehouse.dir is set to an HDFS path. I am wondering where that location is being picked up from:

16:08:21.321 [dispatcher-event-loop-7] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
16:08:21.326 [main] INFO org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
16:08:21.451 [main] INFO org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
Exception in
thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
16:08:21.454 [pool-8-thread-1] INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
16:08:21.455 [pool-8-thread-1] DEBUG
org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by

Thanks,
-Nirmal
--------------------------------------------------------------------- To unsubscribe e-mail: user-unsubscr...@spark.apache.org