Github user vanzin commented on a diff in the pull request:
https://github.com/apache/spark/pull/21398#discussion_r190327819
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -230,11 +232,29 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
     // specify location for managed table. And in [[CreateDataSourceTableAsSelectCommand]] we have
     // to create the table directory and write out data before we create this table, to avoid
     // exposing a partial written table.
-    val needDefaultTableLocation = tableDefinition.tableType == MANAGED &&
-      tableDefinition.storage.locationUri.isEmpty
-
-    val tableLocation = if (needDefaultTableLocation) {
-      Some(CatalogUtils.stringToURI(defaultTablePath(tableDefinition.identifier)))
+    //
+    // When using a remote metastore, and if a managed table is being created with its
+    // location explicitly set to the location where it would be created anyway, then do
+    // not set its location explicitly. This avoids an issue with Sentry in secure clusters.
+    // Otherwise, the above comment applies.
--- End diff --
I'm not sure I really follow that comment. IIRC you can't change the
location of the default database in Hive, so I'm not sure how you'd hit a
situation where "default db uri != warehouse dir", at least when using a remote
metastore.
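
To make the check under discussion concrete, here is a minimal,
standalone Scala sketch. It is not the actual HiveExternalCatalog code:
the object DefaultLocationCheck, the method matchesDefaultLocation, and
its parameters are illustrative names only. In Spark the default path
comes from defaultTablePath(tableDefinition.identifier), which resolves
against the database location (for the `default` database, normally the
warehouse dir).

import java.net.URI

// Hypothetical illustration, not the HiveExternalCatalog API: decide
// whether an explicitly supplied table location is exactly the path the
// metastore would assign by default.
object DefaultLocationCheck {

  // True when explicitLocation equals <databaseLocation>/<tableName>,
  // i.e. the table would land in the same place even with no location set.
  def matchesDefaultLocation(
      explicitLocation: URI,
      databaseLocation: URI,
      tableName: String): Boolean = {
    val defaultPath =
      new URI(databaseLocation.toString.stripSuffix("/") + "/" + tableName)
    explicitLocation == defaultPath
  }

  def main(args: Array[String]): Unit = {
    val warehouse = new URI("hdfs://nn:8020/user/hive/warehouse")
    // Matches the default location -> the patch would drop the explicit
    // location and let the metastore pick the path (avoiding the Sentry issue).
    println(matchesDefaultLocation(
      new URI("hdfs://nn:8020/user/hive/warehouse/t1"), warehouse, "t1")) // true
    // Custom location -> keep it explicit.
    println(matchesDefaultLocation(
      new URI("hdfs://nn:8020/custom/t1"), warehouse, "t1")) // false
  }
}

Under this reading, the question above is exactly whether databaseLocation
can ever differ from the warehouse dir for the `default` database; if it
cannot on a remote metastore, the comparison for tables in `default`
always reduces to a check against the warehouse dir.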