kbuci opened a new pull request, #18297:
URL: https://github.com/apache/hudi/pull/18297

   ### Describe the issue this Pull Request addresses
   
   Spark SQL write operations (CREATE TABLE, INSERT INTO, DELETE, etc.) do not
   consume configs set with the `spark.hoodie.*` prefix. Users expect that setting
   `spark.hoodie.metadata.enable=false` (or any other hoodie config behind the
   `spark.` namespace) in their Spark session would be respected by both reads and
   writes, but the write path silently ignores these configs because
   `combineOptions` only passes through keys that start with `hoodie.`.
   
   The read path already handles this correctly via `DefaultSource` and
   `DataSourceOptionsHelper.parametersWithReadDefaults`, but the write path in
   `ProvidesHoodieConfig.combineOptions` does not.
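
The gap can be reproduced in isolation. The sketch below is not Hudi's code: `keepHoodieKeys` is a hypothetical stand-in for the prefix filter that the description attributes to `combineOptions`, applied to a plain `Map` that plays the role of the session configs.

```scala
object PrefixFilterSketch {
  // Hypothetical stand-in for the write-path filter described above:
  // only keys that start with "hoodie." are passed through.
  def keepHoodieKeys(sqlConf: Map[String, String]): Map[String, String] =
    sqlConf.filter { case (key, _) => key.startsWith("hoodie.") }

  def main(args: Array[String]): Unit = {
    val sessionConf = Map(
      "spark.hoodie.metadata.enable" -> "false",
      "hoodie.datasource.write.operation" -> "upsert",
      "spark.sql.shuffle.partitions" -> "200"
    )
    // The spark.hoodie.* entry is filtered out together with the
    // unrelated spark.sql.* entry, so the writer never sees it.
    println(keepHoodieKeys(sessionConf))
  }
}
```

Because `spark.hoodie.metadata.enable` does not start with `hoodie.`, it is dropped without any warning, which matches the silent behavior reported here.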
   
   ### Summary and Changelog
   
   `spark.hoodie.*` SQL configs are now normalized to `hoodie.*` and included in
   the write-path option merging, with `hoodie.*` keys still taking precedence when
   both forms are set.
   
   - Added `extractSparkPrefixedHoodieConfigs` to `HoodieSqlCommonUtils` that
     filters `spark.hoodie.*` keys and strips the `spark.` prefix.
   - Updated `ProvidesHoodieConfig.combineOptions` to include the normalized
     `spark.hoodie.*` configs at a priority level just below `hoodie.*` SQL configs.
   - Added a unit test in `TestSqlConf` that sets `spark.hoodie.metadata.enable`
     to `false` via `withSQLConf`, performs a Spark SQL insert, and asserts the
     metadata table is not created.
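
The changelog above can be sketched as follows. This is an illustration under assumptions, not the PR's diff: the body of `extractSparkPrefixedHoodieConfigs` is inferred from its description (filter `spark.hoodie.*` keys, strip `spark.`), and `mergeSqlConf` is a hypothetical helper that models only the two SQL-config layers. The real `combineOptions` merges further option sources that are not shown here.

```scala
object SparkPrefixSketch {
  private val SparkPrefix = "spark."
  private val SparkHoodiePrefix = "spark.hoodie."

  // Assumed behavior: keep only spark.hoodie.* keys and drop the
  // leading "spark." so they become ordinary hoodie.* keys.
  def extractSparkPrefixedHoodieConfigs(conf: Map[String, String]): Map[String, String] =
    conf.collect {
      case (key, value) if key.startsWith(SparkHoodiePrefix) =>
        key.stripPrefix(SparkPrefix) -> value
    }

  // Hypothetical helper: normalized spark.hoodie.* configs sit just
  // below the directly set hoodie.* configs.
  def mergeSqlConf(conf: Map[String, String]): Map[String, String] = {
    val direct = conf.filter { case (key, _) => key.startsWith("hoodie.") }
    // With ++, entries from the right-hand map win on key collisions,
    // so hoodie.* overrides the normalized spark.hoodie.* value.
    extractSparkPrefixedHoodieConfigs(conf) ++ direct
  }

  def main(args: Array[String]): Unit = {
    val onlyPrefixed = Map("spark.hoodie.metadata.enable" -> "false")
    val bothForms = onlyPrefixed + ("hoodie.metadata.enable" -> "true")
    println(mergeSqlConf(onlyPrefixed))
    println(mergeSqlConf(bothForms))
  }
}
```

Placing the normalized map on the left of `++` is what keeps existing behavior intact in this sketch: a session that already sets `hoodie.metadata.enable` sees no change, while a session that sets only the `spark.`-prefixed form now has it applied.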
   
   ### Impact
   
   Users can now set Hudi write configs using the `spark.hoodie.*` prefix in their
   Spark session and have them respected by Spark SQL DML operations. There are no
   public API changes and no breaking changes: existing `hoodie.*` configs continue
   to take precedence.
   
   ### Risk Level
   
   Low. The change is additive and follows the same normalization pattern already
   used on the read path. Existing config resolution order is preserved;
   `hoodie.*` keys still override `spark.hoodie.*` keys when both are present.
   
   ### Documentation Update
   
   None. The `spark.hoodie.*` prefix convention is already documented; this change
   fixes the write path to honor it consistently.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Enough context is provided in the sections above
   - [x] Adequate tests were added if applicable
   

