Re: [PR] [MINOR] Forward spark.hoodie.* SparkConf to write path (parity with read path) [hudi]

via GitHub Tue, 23 Jun 2026 12:01:22 -0700


nsivabalan commented on code in PR #18650:
URL: https://github.com/apache/hudi/pull/18650#discussion_r3462211181



##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala:
##########
@@ -1033,9 +1034,87 @@ object DataSourceOptionsHelper {
   private val log = LoggerFactory.getLogger(DataSourceOptionsHelper.getClass)
 
   // Prefix constants for config normalization
+  private val HOODIE_PREFIX = "hoodie."
   private val SPARK_HOODIE_PREFIX = "spark.hoodie."
   private val SPARK_PREFIX = "spark."
 
+  /**
+   * Collects `hoodie.*` and `spark.hoodie.*` configs from the SparkConf, normalizes the
+   * `spark.hoodie.*` keys to canonical `hoodie.*`, and merges with explicit DataFrame
+   * options. Explicit options win over SparkConf.
+   *
+   * This is the read-path entry point: reads have always picked up session-level `hoodie.*`
+   * confs (e.g. `hoodie.datasource.query.type`), so both prefixes are forwarded here.
+   * Do NOT use this for writes — see `collectSparkHoodieConfs` for why ambient `hoodie.*`
+   * confs must not be forwarded to the write path.
+   *
+   * Example (SparkConf has both prefixes set; explicit options override):
+   * {{{
+   *   SparkConf:  spark.hoodie.X = "a", hoodie.Y = "b"
+   *   optParams:  hoodie.X = "c"
+   *   result:     hoodie.X = "c"   // explicit wins over both prefixes
+   *               hoodie.Y = "b"
+   * }}}
+   */
+  def collectHoodieAndSparkHoodieConfs(sqlContext: SQLContext,

Review Comment:
   Can we rename the method `collectHoodieAndSparkHoodieConfs` so that the name itself differentiates reads from writes?
   
   As of now, we rely on documentation to keep devs from making unintended changes in the future, i.e. to make sure they call the right method when refactoring or fixing bugs.
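
One possible shape for this, as a rough sketch rather than the PR's actual code: the method names `collectConfsForRead` and `collectConfsForWrite` are hypothetical, and plain `Map[String, String]` values stand in for the SparkConf and the DataFrame options so the snippet runs without Spark. The write-path behavior (forwarding only `spark.hoodie.*`) is an assumption inferred from the doc comment above, not confirmed against `collectSparkHoodieConfs`.

```scala
object ConfCollectionSketch {
  private val HoodiePrefix = "hoodie."
  private val SparkHoodiePrefix = "spark.hoodie."
  private val SparkPrefix = "spark."

  // Normalize `spark.hoodie.*` keys to canonical `hoodie.*`; leave other keys as-is.
  private def normalize(key: String): String =
    if (key.startsWith(SparkHoodiePrefix)) key.stripPrefix(SparkPrefix) else key

  /** Hypothetical read-path name: forwards both `hoodie.*` and `spark.hoodie.*`. */
  def collectConfsForRead(sessionConf: Map[String, String],
                          optParams: Map[String, String]): Map[String, String] = {
    val forwarded = sessionConf.collect {
      case (k, v) if k.startsWith(HoodiePrefix) || k.startsWith(SparkHoodiePrefix) =>
        normalize(k) -> v
    }
    forwarded ++ optParams // explicit options win over session-level confs
  }

  /** Hypothetical write-path name: forwards only `spark.hoodie.*` (assumed behavior). */
  def collectConfsForWrite(sessionConf: Map[String, String],
                           optParams: Map[String, String]): Map[String, String] = {
    val forwarded = sessionConf.collect {
      case (k, v) if k.startsWith(SparkHoodiePrefix) => normalize(k) -> v
    }
    forwarded ++ optParams // explicit options win over session-level confs
  }
}
```

With the session confs from the doc comment example (`spark.hoodie.X = "a"`, `hoodie.Y = "b"`) and `optParams` of `hoodie.X = "c"`, the read variant yields `hoodie.X = "c"` and `hoodie.Y = "b"`, while the write variant yields only `hoodie.X = "c"`. Putting the path in the name means a caller picking the wrong method is visible at the call site rather than only in the scaladoc.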
   


