[
https://issues.apache.org/jira/browse/SPARK-20808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017108#comment-16017108
]
Joachim Hereth commented on SPARK-20808:
----------------------------------------
The warning is caused by an exception raised by a call to [saveTableIntoHive() |
https://github.com/apache/spark/blob/ac1ab6b9db188ac54c745558d57dd0a031d0b162/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L369].
I was not able to determine what caused the misleading exception about privileges.
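
For context, the quoted log suggests why a privilege error surfaces only as a warning: the Hive-compatible write appears to be a first attempt, and on failure the table is persisted in the Spark SQL specific format instead. The following is a minimal, self-contained sketch of that try-then-fall-back pattern, not Spark's actual code. The names `FallbackSketch`, `persist`, `save`, and `warn` are hypothetical and exist only for illustration.

```scala
import scala.util.control.NonFatal

// Hypothetical model of the fallback; none of these names are Spark APIs.
object FallbackSketch {
  sealed trait Format
  case object HiveCompatible extends Format
  case object SparkSpecific extends Format

  // Tries the Hive-compatible write first. On any non-fatal error it
  // reports a warning and retries in the Spark-specific format.
  // Returns the format that was actually used.
  def persist(table: String, save: Format => Unit, warn: String => Unit): Format =
    try {
      save(HiveCompatible)
      HiveCompatible
    } catch {
      case NonFatal(e) =>
        warn(s"Could not persist $table in a Hive compatible way: ${e.getMessage}")
        save(SparkSpecific)
        SparkSpecific
    }
}
```

Under this pattern, any exception from the first attempt, including a misleading one, leads to the same outcome: the table exists, but not in a layout that Hive can read.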
> External Table unnecessarily not created in Hive-compatible way
> ---------------------------------------------------------------
>
> Key: SPARK-20808
> URL: https://issues.apache.org/jira/browse/SPARK-20808
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0, 2.1.1
> Reporter: Joachim Hereth
> Priority: Minor
>
> In Spark 2.1.0 and 2.1.1, {{spark.catalog.createExternalTable}} unnecessarily
> creates tables in a Hive-incompatible way.
> For instance, executing the following in a Spark shell
> {code}
> val database = "default"
> val table = "table_name"
> val path = "/user/daki/" + database + "/" + table
> val data = Array(("Alice", 23), ("Laura", 33), ("Peter", 54))
> val df = sc.parallelize(data).toDF("name","age")
> df.write.mode(org.apache.spark.sql.SaveMode.Overwrite).parquet(path)
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.catalog.createExternalTable(database + "."+ table, path)
> {code}
> issues the warning
> {code}
> Search Subject for Kerberos V5 INIT cred (<<DEF>>,
> sun.security.jgss.krb5.Krb5InitCredential)
> 17/05/19 11:01:17 WARN hive.HiveExternalCatalog: Could not persist
> `default`.`table_name` in a Hive compatible way. Persisting it into Hive
> metastore in Spark SQL specific format.
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:User
> daki does not have privileges for CREATETABLE)
> at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
> ...
> {code}
> The exception (user does not have privileges for CREATETABLE) is misleading
> (I do have the CREATE TABLE privilege).
> Querying the table with Hive does not return any results, whereas the data
> is accessible from Spark.
> The following code creates the table correctly (workaround):
> {code}
> def sqlStatement(df: org.apache.spark.sql.DataFrame, database: String,
>                  table: String, path: String): String = {
>   // One "`name` type" entry per column of the DataFrame schema.
>   val columns = df.schema
>     .map(col => "`" + col.name + "` " + col.dataType.simpleString)
>     .mkString(",\n")
>   // The HDFS nameservice is hard-coded for the reporter's cluster.
>   ("CREATE EXTERNAL TABLE `%s`.`%s` (%s) " +
>     "STORED AS PARQUET " +
>     "LOCATION 'hdfs://nameservice1%s'").format(database, table, columns, path)
> }
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.sql(sqlStatement(df, database, table, path))
> {code}
> The code is executed via YARN against a Cloudera CDH 5.7.5 cluster with
> Sentry enabled (in case this matters regarding the privilege warning). Spark
> was built against the CDH libraries.
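
The quoted workaround builds the DDL by string concatenation with a cluster-specific HDFS prefix. A slightly more reusable variant might look like the sketch below, which has no Spark dependency so it can be tried in a plain Scala REPL. The names `ExternalTableDdl`, `quote`, and `createExternalParquet` are hypothetical, and doubling embedded backticks as the escape is an assumption about the target SQL dialect, so check it against the Hive version in use.

```scala
// Hypothetical helper; not part of Spark or Hive.
object ExternalTableDdl {
  // Wrap an identifier in backticks, doubling any embedded backtick
  // (assumed escape rule, see the note above).
  private def quote(id: String): String = "`" + id.replace("`", "``") + "`"

  // columns: (name, SQL type) pairs; location: full URI of the data directory.
  def createExternalParquet(database: String, table: String,
                            columns: Seq[(String, String)],
                            location: String): String = {
    val cols = columns
      .map { case (name, tpe) => "  " + quote(name) + " " + tpe }
      .mkString(",\n")
    Seq(
      s"CREATE EXTERNAL TABLE ${quote(database)}.${quote(table)} (",
      cols,
      ")",
      "STORED AS PARQUET",
      s"LOCATION '$location'"
    ).mkString("\n")
  }
}
```

In a Spark shell, the column list could be derived as `df.schema.map(f => f.name -> f.dataType.simpleString)` and the resulting string passed to `spark.sql`, mirroring the quoted workaround.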