nsivabalan commented on PR #18650:
URL: https://github.com/apache/hudi/pull/18650#issuecomment-4765526670

   Thanks @prashantwason for the parity fix — this is a clean follow-up to 
#18205, which fixed the read path but never the write path.
   
   I went through the diff carefully and made the following follow-up changes 
(commit 682234882454):
   
   ### Refactor: normalize at collection time, not just in the per-path helpers
   
   The previous shape had `collectHoodieAndSparkHoodieConfs` collect raw keys 
(still in `spark.hoodie.*` form) and rely on `parametersWithReadDefaults` / 
`parametersWithWriteDefaults` to strip the prefix later. This meant 
intermediate transforms between collection and defaulting saw the 
`spark.hoodie.*` keys — fragile if those transforms ever start reading keys 
directly.
   
   Moved normalization into `collectHoodieAndSparkHoodieConfs` so the 
entry-point helper emits canonical keys. `normalizeSparkHoodiePrefix` is now 
idempotent and is still called from `parametersWithReadDefaults` / 
`parametersWithWriteDefaults` as defense-in-depth for callers that bypass 
`collectHoodieAndSparkHoodieConfs` (more on this below).
   
   ### Behavior note: SQL ALTER TABLE paths also get parity now
   
   `parametersWithWriteDefaults` has callers beyond 
`DefaultSource.createRelation`:
   
   - `HoodieCLIUtils.parametersWithWriteDefaults(...)` (CLI tools)
   - `AlterTableCommand` / `AlterHoodieTableAddColumnsCommand` (SQL `ALTER 
TABLE`)
   
   These previously dropped `spark.hoodie.*` too; with this PR they honor them. 
That's the correct parity direction, but worth being explicit about in the PR 
description (updated).
   
   ### Tests added (`TestDataSourceOptions`)
   
   - Direct unit tests on `normalizeSparkHoodiePrefix`: strip semantics, 
precedence (`hoodie.*` wins on conflict), idempotence.
   - Direct unit tests on `collectHoodieAndSparkHoodieConfs`: filters 
non-hoodie keys; explicit options override both prefixes from SparkConf.
   - Write-path defaulting tests mirroring the existing read tests 
(`testWriteDefaultsSupportSparkHoodieConfigs`, 
`testWriteDefaultsPreferHoodieOverSparkHoodieWhenBothSet`).
   
   ### Other small cleanups
   
   - Use the new `HOODIE_PREFIX` constant instead of the hard-coded `"hoodie."` 
string literal in `collectHoodieAndSparkHoodieConfs`.
   - Expanded Javadoc on both helpers with an input/output example and 
idempotence note.
   
   Pushed to the same branch. Ready for review. @yihua @danny0405 PTAL.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to