kbendick commented on a change in pull request #2792:
URL: https://github.com/apache/iceberg/pull/2792#discussion_r670805240
##########
File path: spark/src/main/java/org/apache/iceberg/spark/SparkUtil.java
##########
@@ -99,4 +103,32 @@ public static void
validatePartitionTransforms(PartitionSpec spec) {
}
}
}
+
+ /**
+ * Pulls any Catalog specific overrides for the Hadoop conf from the current
SparkSession, which can be
+ * set via spark.sql.catalog.$catalogName.hadoop.*
+ *
+ * The SparkCatalog allows for hadoop configurations to be overridden per
catalog, by setting
+ * them on the SQLConf, where the following will add the property
"fs.default.name" with value
+ * "hdfs://hanksnamenode:8020" to the catalog's hadoop configuration.
+ * SparkSession.builder()
+ * .config(s"spark.sql.catalog.$catalogName.hadoop.fs.default.name",
"hdfs://hanksnamenode:8020")
+ * .getOrCreate()
+ * @param spark The current Spark session
+ * @param catalogName Name of the catalog to find overrides for.
+ * @return the Hadoop Configuration that should be used for this catalog,
with catalog specific overrides applied.
+ */
+ public static Configuration hadoopConfCatalogOverrides(SparkSession spark,
String catalogName) {
+ // Find keys for the catalog intended to be hadoop configurations
+ final String hadoopConfCatalogPrefix = String.format("%s.%s.%s",
ICEBERG_CATALOG_PREFIX, catalogName, "hadoop.");
+ Configuration conf = spark.sessionState().newHadoopConf();
+ spark.sqlContext().conf().settings().forEach((k, v) -> {
+ // These checks are copied from
`spark.sessionState().newHadoopConfWithOptions()`, which we
Review comment:
So I looked into this further, and you're right that `settings` is an
instance of `java.util.Collections.SynchronizedMap` (the wrapper returned by
`Collections.synchronizedMap`).
But the `forEach` method of `SynchronizedMap` already wraps its body in a
`synchronized` block.
I tried wrapping the call in `synchronized` as well and the unit tests still
passed, but I'm wondering whether it's necessary, given that the class already
synchronizes internally on the same object (`this` within the class
definition).
I'm going to look into this further and then follow up with you offline
about this.
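
To make the question concrete, here is a minimal standalone sketch (not
Iceberg or Spark code; the class name `SynchronizedMapForEachDemo` and the
helper `forEachHoldsWrapperLock` are hypothetical). It assumes the OpenJDK
implementation, where the wrapper returned by `Collections.synchronizedMap`
uses itself as the mutex, and checks via `Thread.holdsLock` whether the
`forEach` callback already runs while holding the wrapper's monitor.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapForEachDemo {

    /**
     * Returns true if the calling thread holds the monitor of {@code map}
     * while the forEach callback is running. Expects a non-empty map,
     * otherwise the callback never fires and the result is false.
     */
    static boolean forEachHoldsWrapperLock(Map<String, String> map) {
        boolean[] held = {false};
        map.forEach((k, v) -> held[0] = Thread.holdsLock(map));
        return held[0];
    }

    public static void main(String[] args) {
        // Stand-in for the SQLConf settings map; the key is illustrative only.
        Map<String, String> settings = Collections.synchronizedMap(new HashMap<>());
        settings.put("spark.sql.catalog.demo.hadoop.fs.default.name",
            "hdfs://hanksnamenode:8020");

        // The wrapper's forEach already locks on the wrapper itself.
        System.out.println(forEachHoldsWrapperLock(settings)); // prints "true"

        // A plain HashMap takes no lock in forEach.
        Map<String, String> plain = new HashMap<>(settings);
        System.out.println(forEachHoldsWrapperLock(plain)); // prints "false"

        // An outer synchronized block on the same object is reentrant, so it
        // is harmless but adds no protection for a single forEach call.
        synchronized (settings) {
            System.out.println(forEachHoldsWrapperLock(settings)); // prints "true"
        }
    }
}
```

If this holds, the extra `synchronized` would only matter when iterating
through `entrySet()`, `keySet()`, or `values()` directly, since the
`Collections.synchronizedMap` Javadoc requires manual synchronization on the
returned map when traversing its collection views.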
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]