linliu-code commented on code in PR #13959:
URL: https://github.com/apache/hudi/pull/13959#discussion_r2377189006
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala:
##########
@@ -217,28 +218,28 @@ object HoodieWriterUtils {
}
/**
- * This function adds specific rules to choose the right config key for payload class for version 9 tables.
- *
- * RULE 1: When
- * 1. table version is 9,
- * 2. writer key is a payload class key, and
- * 3. table config has legacy payload class configured,
- * then
- * return legacy payload class key.
- *
- * Basic rule:
- * return writer key.
+ * For table version >= 9, this function finds the corresponding key in `HoodieTableConfig`
+ * for a key in `HoodieWriteConfig`, including configs related to payload class, record merge mode, and merge strategy id.
*/
- def getPayloadClassConfigKeyFromTableConfig(key: String, tableConfig: HoodieConfig): String = {
- if (tableConfig == null) {
+ def getKeyInTableConfig(key: String, tableConfig: HoodieConfig): String = {
+ if (tableConfig == null || tableConfig.getInt(HoodieTableConfig.VERSION) < HoodieTableVersion.NINE.versionCode()) {
Review Comment:
Tried it; the result is very verbose:
`HoodieTableVersion.fromVersionCode(tableConfig.getInt(HoodieTableConfig.VERSION)).lesserThan(HoodieTableVersion.NINE)`.
Let's keep it this way; it is already clear.
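As a hedged illustration of the renamed helper and the terse version check defended above, here is a minimal stand-in sketch (hypothetical class and mapping, not the actual Hudi implementation; only the merge-mode key pair is taken from the surrounding diffs):

```java
import java.util.Map;

// Stand-in sketch of getKeyInTableConfig: for tables below version 9 the
// writer key is returned unchanged; for version >= 9 it is translated to the
// corresponding table-config key. Not the real Hudi code.
public class KeyMappingSketch {

    // One grounded example pair from this PR's diffs; payload-class and
    // merge-strategy-id keys would get similar (here omitted) entries.
    static final Map<String, String> WRITER_TO_TABLE_KEY = Map.of(
        "hoodie.write.record.merge.mode", "hoodie.record.merge.mode");

    static final int VERSION_NINE = 9;

    static String keyInTableConfig(String writerKey, int tableVersionCode) {
        // The terse int comparison kept in the PR; the enum-based
        // fromVersionCode(...).lesserThan(NINE) alternative the reviewer
        // tried reads the same but is much wordier.
        if (tableVersionCode < VERSION_NINE) {
            return writerKey;
        }
        return WRITER_TO_TABLE_KEY.getOrDefault(writerKey, writerKey);
    }

    public static void main(String[] args) {
        System.out.println(keyInTableConfig("hoodie.write.record.merge.mode", 9));
        System.out.println(keyInTableConfig("hoodie.write.record.merge.mode", 8));
    }
}
```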
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestPartialUpdateForMergeInto.scala:
##########
@@ -681,9 +681,7 @@ class TestPartialUpdateForMergeInto extends HoodieSparkSqlTestBase {
test("Test MergeInto Partial Updates should fail with CUSTOM payload and merge mode") {
withTempDir { tmp =>
withSQLConf(
- "hoodie.index.type" -> "GLOBAL_SIMPLE",
- "hoodie.write.record.merge.mode" -> "CUSTOM",
- "hoodie.datasource.write.payload.class" ->
"org.apache.hudi.common.testutils.reader.HoodieRecordTestPayload") {
Review Comment:
With the original settings, the table properties after the `create_table`
statement are as follows. We can see that the merge mode, payload class, etc.
are not what we expect, so after adding the merge-mode check, this query fails.
Why did it work previously? I think these runtime merge configs took effect
during write/read by overwriting the table properties, which is exactly what
we want to forbid.
> #Properties saved on 2025-09-24T22:14:33.394Z
#Wed Sep 24 22:14:33 UTC 2025
hoodie.table.format=native
hoodie.table.version=9
hoodie.database.name=default
hoodie.table.initial.version=9
hoodie.datasource.write.hive_style_partitioning=true
hoodie.table.keygenerator.type=SIMPLE
hoodie.table.create.schema={"type"\:"record","name"\:"h1_record","namespace"\:"hoodie.h1","fields"\:[{"name"\:"record_key","type"\:["string","null"]},{"name"\:"name","type"\:["string","null"]},{"name"\:"age","type"\:["int","null"]},{"name"\:"salary","type"\:["double","null"]},{"name"\:"ts","type"\:["long","null"]},{"name"\:"department","type"\:["string","null"]}]}
hoodie.archivelog.folder=archived
hoodie.table.name=h1
hoodie.record.merge.strategy.id=eeb8d96f-b1e4-49fd-bbf8-28ac514178e5
hoodie.timeline.history.path=history
hoodie.table.type=MERGE_ON_READ
hoodie.datasource.write.partitionpath.urlencode=false
hoodie.datasource.write.drop.partition.columns=false
hoodie.timeline.layout.version=2
hoodie.record.merge.mode=EVENT_TIME_ORDERING
hoodie.table.partition.fields=department
hoodie.table.recordkey.fields=record_key
hoodie.timeline.path=timeline
hoodie.table.ordering.fields=ts
hoodie.table.checksum=997088624
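The failure mode described above can be sketched, under clearly hypothetical names (this is not the actual Hudi resolution code), as the difference between letting a writer-supplied option silently override the stored merge mode and rejecting the conflict:

```java
// Hedged sketch of the behavior discussed above: the table properties store
// EVENT_TIME_ORDERING, while the removed SQL conf supplied CUSTOM at runtime.
// Method names and logic are illustrative, not taken from Hudi.
public class MergeModeOverrideSketch {

    // Old behavior: the writer-supplied value, when present, wins over the
    // merge mode persisted in hoodie.properties.
    static String resolveOld(String storedMode, String writerMode) {
        return writerMode != null ? writerMode : storedMode;
    }

    // New behavior: a conflicting writer-supplied value is rejected rather
    // than applied, so the table config stays authoritative.
    static String resolveNew(String storedMode, String writerMode) {
        if (writerMode != null && !writerMode.equals(storedMode)) {
            throw new IllegalArgumentException(
                "Writer merge mode " + writerMode
                    + " conflicts with table config " + storedMode);
        }
        return storedMode;
    }

    public static void main(String[] args) {
        System.out.println(resolveOld("EVENT_TIME_ORDERING", "CUSTOM"));
        try {
            resolveNew("EVENT_TIME_ORDERING", "CUSTOM");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```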
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestPartialUpdateAvroPayload.scala:
##########
@@ -122,8 +122,6 @@ class TestPartialUpdateAvroPayload extends HoodieClientTestBase {
.option(HoodieTableConfig.ORDERING_FIELDS.key, "ts")
.option(DataSourceWriteOptions.OPERATION.key, DataSourceWriteOptions.UPSERT_OPERATION_OPT_VAL)
.option(HoodieWriteConfig.TBL_NAME.key, "hoodie_test")
- .option(HoodieWriteConfig.WRITE_PAYLOAD_CLASS_NAME.key, "org.apache.hudi.common.model.PartialUpdateAvroPayload")
- .option(HoodieWriteConfig.RECORD_MERGE_MODE.key(), RecordMergeMode.CUSTOM.name())
Review Comment:
Previously we did not perform this check; with the check added, this test
would fail. Therefore, I removed these configs.
##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestPartialUpdateForMergeInto.scala:
##########
@@ -681,9 +681,7 @@ class TestPartialUpdateForMergeInto extends HoodieSparkSqlTestBase {
test("Test MergeInto Partial Updates should fail with CUSTOM payload and merge mode") {
withTempDir { tmp =>
withSQLConf(
- "hoodie.index.type" -> "GLOBAL_SIMPLE",
- "hoodie.write.record.merge.mode" -> "CUSTOM",
- "hoodie.datasource.write.payload.class" ->
"org.apache.hudi.common.testutils.reader.HoodieRecordTestPayload") {
Review Comment:
During testing it does not behave that way; there may be some gaps. I am not
sure the previous test really worked as expected.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]