kbuci opened a new pull request, #18297:
URL: https://github.com/apache/hudi/pull/18297
### Describe the issue this Pull Request addresses
Spark SQL write operations (CREATE TABLE, INSERT INTO, DELETE, etc.) do not
consume configs set with the `spark.hoodie.*` prefix. Users expect that setting
`spark.hoodie.metadata.enable=false` (or any other Hudi config under the
`spark.` namespace) in their Spark session will be respected by both reads and
writes, but the write path silently ignores these configs because
`combineOptions` only passes through keys that start with `hoodie.`.
The read path already handles this correctly via `DefaultSource` and
`DataSourceOptionsHelper.parametersWithReadDefaults`, but the write path in
`ProvidesHoodieConfig.combineOptions` does not.
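
A minimal sketch of the symptom, assuming the write path filters session configs by the `hoodie.` prefix as described above. `keepHoodieKeys` is a hypothetical stand-in used only for illustration; it is not the actual body of `combineOptions`.

```scala
object WritePathFilterSketch {
  // Hypothetical stand-in for the prefix filter described above;
  // not the actual body of ProvidesHoodieConfig.combineOptions.
  def keepHoodieKeys(sessionConfs: Map[String, String]): Map[String, String] =
    sessionConfs.filter { case (key, _) => key.startsWith("hoodie.") }

  def main(args: Array[String]): Unit = {
    val sessionConfs = Map(
      "spark.hoodie.metadata.enable" -> "false", // set by the user, dropped by the filter
      "hoodie.datasource.write.operation" -> "upsert",
      "spark.sql.shuffle.partitions" -> "200"
    )
    // Only the key that already starts with "hoodie." survives.
    println(keepHoodieKeys(sessionConfs))
  }
}
```

Because the `spark.hoodie.*` key never reaches the write options, the writer falls back to its default for that setting without reporting anything.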
### Summary and Changelog
`spark.hoodie.*` SQL configs are now normalized to `hoodie.*` and included in
the write-path option merging, with `hoodie.*` keys still taking precedence when
both forms are set.
- Added `extractSparkPrefixedHoodieConfigs` to `HoodieSqlCommonUtils` that
filters `spark.hoodie.*` keys and strips the `spark.` prefix.
- Updated `ProvidesHoodieConfig.combineOptions` to include the normalized
`spark.hoodie.*` configs at a priority level just below `hoodie.*` SQL
configs.
- Added a unit test in `TestSqlConf` that sets `spark.hoodie.metadata.enable`
to `false` via `withSQLConf`, performs a Spark SQL insert, and asserts the
metadata table is not created.
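
The normalization and precedence described in the list above can be sketched roughly as follows. Only the name `extractSparkPrefixedHoodieConfigs` comes from this PR; its signature, the `mergeSqlConfs` helper, and the use of plain `Map`s are assumptions for illustration, and the real `combineOptions` merges further sources that are not shown here.

```scala
object SparkPrefixedConfigSketch {
  private val SparkPrefix = "spark."
  private val SparkHoodiePrefix = "spark.hoodie."

  // Keep only spark.hoodie.* keys and strip the leading "spark." (signature assumed).
  def extractSparkPrefixedHoodieConfigs(confs: Map[String, String]): Map[String, String] =
    confs.collect {
      case (key, value) if key.startsWith(SparkHoodiePrefix) =>
        key.stripPrefix(SparkPrefix) -> value
    }

  // Hypothetical helper: with `++`, entries from the right-hand map win,
  // so plain hoodie.* keys override their normalized spark.hoodie.* counterparts.
  def mergeSqlConfs(confs: Map[String, String]): Map[String, String] = {
    val direct = confs.filter { case (key, _) => key.startsWith("hoodie.") }
    extractSparkPrefixedHoodieConfigs(confs) ++ direct
  }
}
```

Under these assumptions, a session that sets only `spark.hoodie.metadata.enable=false` ends up with `hoodie.metadata.enable=false` in the merged options, while a session that sets both forms keeps the value from the plain `hoodie.*` key.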
### Impact
Users can now set Hudi write configs using the `spark.hoodie.*` prefix in their
Spark session and have them respected by Spark SQL DML operations. No public API
changes. No breaking changes: existing `hoodie.*` configs continue to take
precedence.
### Risk Level
Low. The change is additive and follows the same normalization pattern already
used on the read path. Existing config resolution order is preserved;
`hoodie.*` keys still override `spark.hoodie.*` keys when both are present.
### Documentation Update
None. The `spark.hoodie.*` prefix convention is already documented; this change
fixes the write path to honor it consistently.
### Contributor's checklist
- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable