Davis-Zhang-Onehouse opened a new pull request, #18824: URL: https://github.com/apache/hudi/pull/18824
### Describe the issue this Pull Request addresses

`HoodieMetadataWriteUtils.createMetadataWriteConfig` builds the MDT `HoodieWriteConfig` from scratch and does not copy `HoodieCommonConfig` / `HoodieMemoryConfig` values from the data-table write config. As a result, user overrides in the spillable-map config family take effect for the data-table writer but are silently ignored by the MDT writer/compactor.

The most visible symptom is that `hoodie.common.spillable.diskmap.type=ROCKS_DB` has no effect on MDT compaction. `HoodieCompactor` reads the value via `config.getCommonConfig().getSpillableDiskMapType()`, and `commonConfig` on the MDT `HoodieWriteConfig` is the default-only instance. MDT compaction tasks therefore continue to use a BITCASK `ExternalSpillableMap` and can stall in `BitCaskDiskMap$CompressionHandler.decompressBytes` during merges even after operators apply the override at the table level.

### Summary and Changelog

Inherit the configs that drive the MDT writer's `ExternalSpillableMap` and `HoodieMergedLogRecordScanner`:

| Key | Defined in |
|-----|-----------|
| `hoodie.common.spillable.diskmap.type` | HoodieCommonConfig |
| `hoodie.common.diskmap.compression.enabled` | HoodieCommonConfig |
| `hoodie.memory.spillable.map.path` | HoodieMemoryConfig |
| `hoodie.memory.compaction.max.size` | HoodieCommonConfig / HoodieMemoryConfig |
| `hoodie.memory.compaction.fraction` | HoodieMemoryConfig |
| `hoodie.memory.merge.max.size` | HoodieMemoryConfig |
| `hoodie.memory.merge.fraction` | HoodieMemoryConfig |
| `hoodie.memory.dfs.buffer.max.size` | HoodieCommonConfig / HoodieMemoryConfig |

Propagation is implemented as a static `MDT_INHERITED_SPILLABLE_MAP_CONFIGS` list driving a `containsKey`-guarded copy loop.
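
The guarded copy loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `java.util.Properties` stands in for Hudi's config objects, and the class name `SpillableConfigInheritanceSketch` and method name `inheritSpillableMapConfigs` are hypothetical. Only the list name and the eight keys come from the description above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for the propagation logic; not the actual Hudi implementation.
class SpillableConfigInheritanceSketch {

  // The eight keys listed in the PR description.
  static final List<String> MDT_INHERITED_SPILLABLE_MAP_CONFIGS = Arrays.asList(
      "hoodie.common.spillable.diskmap.type",
      "hoodie.common.diskmap.compression.enabled",
      "hoodie.memory.spillable.map.path",
      "hoodie.memory.compaction.max.size",
      "hoodie.memory.compaction.fraction",
      "hoodie.memory.merge.max.size",
      "hoodie.memory.merge.fraction",
      "hoodie.memory.dfs.buffer.max.size");

  // Copies only keys the user explicitly set on the data-table side.
  // Keys that are absent stay absent on the MDT side, so any fallback
  // logic keyed on "is this set?" behaves the same for both tables.
  static Properties inheritSpillableMapConfigs(Properties dataTableProps, Properties mdtProps) {
    for (String key : MDT_INHERITED_SPILLABLE_MAP_CONFIGS) {
      if (dataTableProps.containsKey(key)) {
        mdtProps.setProperty(key, dataTableProps.getProperty(key));
      }
    }
    return mdtProps;
  }

  public static void main(String[] args) {
    Properties dataTable = new Properties();
    dataTable.setProperty("hoodie.common.spillable.diskmap.type", "ROCKS_DB");

    Properties mdt = inheritSpillableMapConfigs(dataTable, new Properties());

    // The explicit override is propagated.
    System.out.println(mdt.getProperty("hoodie.common.spillable.diskmap.type")); // ROCKS_DB
    // An unset key is not materialized on the MDT side.
    System.out.println(mdt.containsKey("hoodie.memory.compaction.max.size")); // false
  }
}
```

Copying by presence rather than by resolved value is the design choice that matters here: an unguarded copy would write a value for every key in the list, including keys the user never set.
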
The guard matters for the two `noDefaultValue()` configs (`SPILLABLE_MAP_BASE_PATH` and `MAX_MEMORY_FOR_COMPACTION`): without it, the MDT side would observe a "user-set" value when the data-table side actually relies on `IOUtils.getMaxMemoryPerCompaction`'s fraction fallback or the inferred default spill path, breaking that fallback chain.

This mirrors the precedent set by `fix(metadata): propagate timeline server config from main dataset to metadata (#17486)`.

### Impact

No new configs; no default value changes; no public API change. Behavior change is limited to MDT writers when the user has set one of the eight keys above: previously the MDT writer silently ignored the override and used the default; now it honors the user value.

### Risk Level

low. The propagation is scoped to a small explicit list of configs, defaults remain unchanged, and the no-override path is preserved via the `containsKey` guard. Existing `TestHoodieMetadataWriteUtils` tests continue to pass (19/19).

### Documentation Update

none. Existing config docs already describe the user-facing keys; this PR just makes the MDT writer honor them.

### Contributor's checklist

- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added (`TestHoodieMetadataWriteUtils#testSpillableMapConfigPropagation`)
