kbendick commented on a change in pull request #2792:
URL: https://github.com/apache/iceberg/pull/2792#discussion_r667164077
##########
File path: spark/src/main/java/org/apache/iceberg/spark/SparkUtil.java
##########
@@ -99,4 +103,30 @@ public static void validatePartitionTransforms(PartitionSpec spec) {
 }
 }
 }
+
+ /**
+ * Pulls any Catalog specific overrides for the Hadoop conf from the current SparkSession, which can be
+ * set via spark.sql.catalog.$catalogName.hadoop.*
+ *
+ * The SparkCatalog allows for hadoop configurations to be overridden per catalog, by setting
+ * them on the SQLConf, where the following will add the property "fs.default.name" with value
+ * "hdfs://hanksnamenode:8020" to the catalog's hadoop configuration.
+ * SparkSession.builder()
+ * .config(s"spark.sql.catalog.$catalogName.hadoop.fs.default.name", "hdfs://hanksnamenode:8020")
+ * .getOrCreate()
+ * @param spark The current Spark session
+ * @param catalogName Name of the catalog to find overrides for.
+ * @return the Hadoop Configuration that should be used for this catalog, with catalog specific overrides applied.
+ */
+ public static Configuration hadoopConfCatalogOverrides(SparkSession spark, String catalogName) {
+ // Find keys for the catalog intended to be hadoop configurations
+ final String hadoopConfCatalogPrefix = String.format("%s.%s.%s", ICEBERG_CATALOG_PREFIX, catalogName, "hadoop.");
+ Configuration conf = spark.sessionState().newHadoopConf();
+ spark.sqlContext().conf().settings().forEach((k, v) -> {
+ if (v != null && k.startsWith(hadoopConfCatalogPrefix)) {
Review comment:
I left a comment about where these checks came from.
We may in fact _want_ to allow `null` values, so that users can unset configs
that are set on the session's default Hadoop configuration. But I'm not sure
that setting a key to `null` would correctly revert it to its default value.
I think users would need to explicitly set the key to whatever default value
they want.
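
To make the concern concrete, here is a minimal sketch of the filtering logic.
It uses plain `java.util.Map` instances as stand-ins for the session's Hadoop
`Configuration` and the SQLConf settings, so it runs without Spark or Hadoop.
The class name `CatalogOverrideSketch`, the method `applyOverrides`, and the
step that strips the prefix before storing the key are assumptions made for
illustration, not code from this PR; the prefix `spark.sql.catalog` is taken
from the Javadoc in the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the logic under review; not the PR's code.
final class CatalogOverrideSketch {
  // Mirrors the "spark.sql.catalog.$catalogName.hadoop.*" pattern from the Javadoc.
  private static final String CATALOG_PREFIX = "spark.sql.catalog";

  private CatalogOverrideSketch() {
  }

  // Returns a copy of sessionConf with the catalog's hadoop.* overrides applied.
  // A null value is skipped, so the session value for that key is left as-is.
  static Map<String, String> applyOverrides(
      Map<String, String> sessionConf, Map<String, String> sqlSettings, String catalogName) {
    String prefix = String.format("%s.%s.%s", CATALOG_PREFIX, catalogName, "hadoop.");
    Map<String, String> conf = new HashMap<>(sessionConf);
    sqlSettings.forEach((k, v) -> {
      if (v != null && k.startsWith(prefix)) {
        // Assumption: the override is stored under the key with the prefix removed.
        conf.put(k.substring(prefix.length()), v);
      }
    });
    return conf;
  }
}
```

Under this sketch, a `null` override neither clears the key nor restores any
built-in default: the session's existing value simply survives. A user who
wants a different value has to set it explicitly, which matches the point made
above.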
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]