linliu-code opened a new pull request, #18806: URL: https://github.com/apache/hudi/pull/18806
### Describe the issue this Pull Request addresses

The Hudi schema-evolution docs at <https://hudi.apache.org/docs/schema_evolution/> publish a type-promotion matrix that lists `int -> long` and `int -> double` as supported promotions. Empirically, on Hudi 1.x master, enabling `hoodie.datasource.write.reconcile.schema=true` causes those promotions to fail the upsert.

**This PR contains only a test** that documents the observed behavior. It does NOT include a fix. The intent is to ask reviewers to confirm:

- **(a)** the test correctly reflects what the docs say should work, and this is a bug worth fixing (`reconcile.schema` should not block documented type promotions), OR
- **(b)** the test is missing a config or usage detail that I should have set, and the failure is expected (in which case the docs probably need a clarification on the interaction between `reconcile.schema` and the promotion matrix).

### Summary and Changelog

Adds one new test file: `hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestTypePromotionWithReconcile.scala`.

The test is `testDocumentedTypePromotionShouldSucceed`, parameterized via `@CsvSource` across:

| Dimension | Values |
|---|---|
| `tableType` | `COPY_ON_WRITE`, `MERGE_ON_READ` |
| `reconcile` | `true`, `false` |
| `promotion` | `INT_TO_LONG`, `INT_TO_DOUBLE` |

That gives 8 cells total. Per the docs, all 8 should PASS.

**Observed on this branch (off latest master `b82a5b2fe835`):**

```
Tests run: 8, Failures: 0, Errors: 4, Skipped: 0
```

- The 4 `reconcile=false` cells PASS (promotion works as documented).
- The 4 `reconcile=true` cells ERROR with:

```
org.apache.hudi.exception.SchemaCompatibilityException: Failed to reconcile incoming schema with the table's one
  at HoodieSchemaUtils$.deduceWriterSchemaWithReconcile(HoodieSchemaUtils.scala:233)
```

The test deliberately writes the SAME promotion that the docs say is supported, with the only difference being the value of `reconcile.schema`. The 2× upsert pattern is:

1. write initial batch with `col_promote` as `IntegerType` (3 rows)
2. write second batch with `col_promote` as `LongType` (or `DoubleType`) (2 rows)
3. read back, assert 5 rows + promoted type

The existing test `TestBasicSchemaEvolution.scala:78` carries a `// TODO add test-case for upcasting` comment. This PR partially addresses that TODO, but scoped to just the reconcile interaction so the discussion stays focused.

### Impact

No source code change. New test only. CI will show 4 failing test cells until either:

- the production behavior changes to honor the documented promotion matrix under `reconcile.schema=true`, OR
- the test is removed because the documented expectation was misread.

### Risk Level

None (test-only).

### Documentation Update

If the resolution is **(b)** (expected behavior), the schema-evolution docs should clarify that the type-promotion matrix applies only when `reconcile.schema=false`, or that `reconcile.schema=true` requires the incoming type to exactly match the table type.

If the resolution is **(a)** (bug), no docs change is needed; fix the validation in `HoodieSchemaUtils.deduceWriterSchemaWithReconcile`.

### Contributor's checklist

- [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Change Logs and Impact were stated clearly
- [x] Adequate tests were added if applicable (this PR IS the test; no production code change)
- [ ] CI passed: **EXPECTED to FAIL on the 4 reconcile=true cells; that's the repro this PR is opening for discussion**
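
For reviewers who want the parameter matrix from the Summary in one place, here is a minimal plain-Scala sketch of the 8 cells and the outcome reported above. It is not the code in `TestTypePromotionWithReconcile.scala`: the names `Cell`, `PromotionMatrix`, `writeOptions`, and `observedToError` are hypothetical, and the for-comprehension stands in for the explicit `@CsvSource` rows. The two option keys are the per-cell Hudi write options; the other options a real write needs (table name, record key, and so on) are left out.

```scala
// Hypothetical model of the test matrix; illustrative names only.
final case class Cell(tableType: String, reconcile: Boolean, promotion: String)

object PromotionMatrix {
  val tableTypes: Seq[String] = Seq("COPY_ON_WRITE", "MERGE_ON_READ")
  val reconcileValues: Seq[Boolean] = Seq(true, false)
  val promotions: Seq[String] = Seq("INT_TO_LONG", "INT_TO_DOUBLE")

  // Cartesian product of the three dimensions: 2 * 2 * 2 = 8 cells.
  val cells: Seq[Cell] = for {
    t <- tableTypes
    r <- reconcileValues
    p <- promotions
  } yield Cell(t, r, p)

  // Write options that vary per cell. Required options such as the
  // table name and record key are intentionally omitted from this sketch.
  def writeOptions(cell: Cell): Map[String, String] = Map(
    "hoodie.datasource.write.table.type" -> cell.tableType,
    "hoodie.datasource.write.reconcile.schema" -> cell.reconcile.toString
  )

  // Outcome reported in this PR: exactly the reconcile=true cells error.
  def observedToError(cell: Cell): Boolean = cell.reconcile

  def main(args: Array[String]): Unit = {
    cells.foreach { c =>
      val outcome = if (observedToError(c)) "ERROR" else "PASS"
      println(s"${c.tableType} reconcile=${c.reconcile} ${c.promotion} -> $outcome")
    }
    // Prints "errors: 4 of 8", matching the surefire summary above.
    println(s"errors: ${cells.count(observedToError)} of ${cells.size}")
  }
}
```

Per the docs, the expected outcome for every cell is PASS, so under resolution **(a)** `observedToError` would become constantly `false`.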
