Just an update on the thread: the cluster is Kerberized. I'm trying to execute the query as a different user, xyz, not hive. It looks like a permission issue: user xyz is trying to create a directory under /home/hive.

Do I need some impersonation setting?

Thanks,
Nirmal

________________________________
From: Nirmal Kumar
Sent: Tuesday, June 18, 2019 5:56:06 PM
To: Raymond Honderdors; Nirmal Kumar
Cc: user
Subject: RE: Unable to run simple spark-sql

Hi Raymond,

Permissions on HDFS are 777:

  drwxrwxrwx   - impadmin hdfs   0 2019-06-13 16:09 /home/hive/spark-warehouse

But the staging path points to the local file system:

  Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'

Thanks,
-Nirmal

From: Raymond Honderdors <raymond.honderd...@sizmek.com>
Sent: 18 June 2019 17:52
To: Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid>
Cc: user <user@spark.apache.org>
Subject: Re: Unable to run simple spark-sql

Hi,
Can you check the permissions, on the HDFS folder where Spark tries to create the table, of the user running Spark?

On Tue, Jun 18, 2019, 15:05 Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid> wrote:

Hi List,

I tried running the following sample Java code using Spark2 version 2.0.0 on YARN (HDP-2.5.0.0):

public class SparkSQLTest {
  public static void main(String[] args) {
    SparkSession sparkSession = SparkSession.builder().master("yarn")
        .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
        .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
        .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
        .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
        .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
        .enableHiveSupport()
        .getOrCreate();
    sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
  }
}

I get the following error pointing to a local file system path (file:/home/hive/spark-warehouse), and I'm wondering where that path is being picked up from:

16:08:21.321 [dispatcher-event-loop-7] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
16:08:21.326 [main] INFO org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
16:08:21.451 [main] INFO org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist:
file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
16:08:21.454 [pool-8-thread-1]
INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by

Thanks,
-Nirmal

________________________________
NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
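A note on why the staging path comes out as file:/ even though the HDFS directory is world-writable: the staging directory is derived from the table's location, and a location recorded without a filesystem scheme gets resolved against a default, which can be the local filesystem rather than HDFS. A minimal, self-contained sketch of that scheme distinction using only java.net.URI (no Spark or Hadoop dependency; the namenode host below is a placeholder, not a value from this thread):

```java
import java.net.URI;

public class WarehousePathCheck {
    public static void main(String[] args) {
        // An unqualified warehouse path carries no filesystem scheme, so the
        // runtime must fall back to some default filesystem when resolving it.
        URI unqualified = URI.create("/home/hive/spark-warehouse");

        // A fully qualified path pins the location to HDFS explicitly.
        // "namenode:8020" is a hypothetical placeholder.
        URI qualified = URI.create("hdfs://namenode:8020/apps/hive/warehouse");

        System.out.println("unqualified scheme: " + unqualified.getScheme()); // prints null
        System.out.println("qualified scheme:   " + qualified.getScheme());   // prints hdfs
    }
}
```

If the table in the metastore was created while the warehouse resolved to file:/home/hive/spark-warehouse, its recorded location may still be local even after the session config is corrected, which would explain the mismatch between the 777 HDFS directory and the local staging path in the exception.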
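One likely direction for the warehouse-path problem, assuming the warehouse is meant to live on HDFS: qualify spark.sql.warehouse.dir with an explicit hdfs:// scheme and authority so the location cannot fall back to the local filesystem. A sketch in spark-defaults.conf form; the namenode host/port are placeholders, and the metastore URI is the redacted one from the thread:

```
spark.sql.warehouse.dir              hdfs://namenode:8020/apps/hive/warehouse
spark.hadoop.hive.metastore.uris     thrift://xxxxxxxxx:9083
```

Note that if the testdb database or employee_orc table already has a file:/ location recorded in the metastore, the table location itself would also need to be corrected (e.g. via ALTER TABLE ... SET LOCATION), since the staging directory is derived from it.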
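On the impersonation question for the Kerberized cluster: proxy-user (doAs) settings only apply when one principal submits work on behalf of another, as HiveServer2 does with hive.server2.enable.doAs. If that is the setup here, the service principal needs proxy-user grants in core-site.xml. A hedged sketch, assuming the submitting service runs as user hive; the wildcard values are placeholders that should be narrowed to real hosts and groups:

```xml
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```

If user xyz submits the Spark job directly with its own keytab, impersonation is likely not the issue; the job simply runs as xyz, and xyz needs write access to whatever location the table resolves to.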