linliu-code opened a new pull request, #18806:
URL: https://github.com/apache/hudi/pull/18806

   ### Describe the issue this Pull Request addresses
   
   The Hudi schema-evolution docs at 
<https://hudi.apache.org/docs/schema_evolution/> publish a type-promotion 
matrix that lists `int -> long` and `int -> double` as supported promotions. 
On Hudi 1.x master, however, an upsert that performs either promotion fails 
with a `SchemaCompatibilityException` when 
`hoodie.datasource.write.reconcile.schema=true` is set.
   
   **This PR contains only a test** that documents the observed behavior. It 
does NOT include a fix. The intent is to ask reviewers to confirm which of the 
following holds:
   
   - **(a)** the test correctly reflects what the docs say should work, and 
this is a bug worth fixing (reconcile.schema should not block documented type 
promotions), OR
   - **(b)** the test is missing a config or usage detail that I should have 
set, and the failure is expected (in which case the docs probably need a 
clarification on the interaction between `reconcile.schema` and the promotion 
matrix).
   
   ### Summary and Changelog
   
   Adds one new test file: 
`hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestTypePromotionWithReconcile.scala`.
   
   The test is `testDocumentedTypePromotionShouldSucceed`, parameterized via 
`@CsvSource` across:
   
   | Dimension | Values |
   |---|---|
   | `tableType` | `COPY_ON_WRITE`, `MERGE_ON_READ` |
   | `reconcile` | `true`, `false` |
   | `promotion` | `INT_TO_LONG`, `INT_TO_DOUBLE` |
   
   → 8 cells total. Per the docs, all 8 should PASS.
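   
   As a quick sanity check on the cell count, the matrix above can be 
enumerated as a cartesian product. This is only a sketch: the object name 
`PromotionMatrix` and the column order of each row are assumptions for 
illustration, not the actual `@CsvSource` entries in the test file.

```scala
// Hypothetical stand-in for the test's parameter matrix; values mirror the table above.
object PromotionMatrix {
  val tableTypes: Seq[String] = Seq("COPY_ON_WRITE", "MERGE_ON_READ")
  val reconcileValues: Seq[Boolean] = Seq(true, false)
  val promotions: Seq[String] = Seq("INT_TO_LONG", "INT_TO_DOUBLE")

  // One comma-separated row per test cell (assumed column order:
  // tableType, reconcile, promotion).
  val cells: Seq[String] =
    for {
      tableType <- tableTypes
      reconcile <- reconcileValues
      promotion <- promotions
    } yield s"$tableType,$reconcile,$promotion"

  def main(args: Array[String]): Unit = {
    cells.foreach(println)
    println(s"total cells: ${cells.size}")
  }
}
```

   Half of the generated rows carry `reconcile=true`, which matches the 4 
erroring cells reported below.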
   
   **Observed on this branch (off latest master `b82a5b2fe835`):**
   
   ```
   Tests run: 8, Failures: 0, Errors: 4, Skipped: 0
   ```
   
   - The 4 `reconcile=false` cells PASS (promotion works as documented).
   - The 4 `reconcile=true` cells ERROR with:
     ```
     org.apache.hudi.exception.SchemaCompatibilityException:
       Failed to reconcile incoming schema with the table's one
       at HoodieSchemaUtils$.deduceWriterSchemaWithReconcile(HoodieSchemaUtils.scala:233)
     ```
   
   The test deliberately performs the SAME promotion that the docs list as 
supported; the only setting that differs between the passing and failing cells 
is the value of `reconcile.schema`. Each cell follows a two-write pattern:
   
   1. write an initial batch with `col_promote` as `IntegerType` (3 rows)
   2. write a second batch with `col_promote` as `LongType` (or `DoubleType`) (2 rows)
   3. read the table back and assert that it holds 5 rows and that `col_promote` has the promoted type
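   
   The three steps above could be sketched roughly as follows. This is not the 
code from `TestTypePromotionWithReconcile.scala`: the object name, table name, 
the columns `id`, `ts`, and `part`, and the row values are all assumptions made 
for illustration, and the sketch presumes Spark and the Hudi Spark bundle are 
on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, Row, SaveMode, SparkSession}
import org.apache.spark.sql.types._

// Hypothetical sketch of the two-write pattern; not the PR's actual test code.
object TypePromotionSketch {
  // Writer options. Only the reconcile flag differs between passing and failing cells.
  def hudiOptions(tableType: String, reconcile: Boolean): Map[String, String] = Map(
    "hoodie.table.name" -> "promotion_sketch",               // assumed table name
    "hoodie.datasource.write.table.type" -> tableType,
    "hoodie.datasource.write.operation" -> "upsert",
    "hoodie.datasource.write.recordkey.field" -> "id",       // assumed column
    "hoodie.datasource.write.precombine.field" -> "ts",      // assumed column
    "hoodie.datasource.write.partitionpath.field" -> "part", // assumed column
    "hoodie.datasource.write.reconcile.schema" -> reconcile.toString
  )

  // Same schema for both batches except for the type of col_promote.
  def schema(promoteType: DataType): StructType = StructType(Seq(
    StructField("id", StringType, nullable = false),
    StructField("ts", LongType, nullable = false),
    StructField("part", StringType, nullable = false),
    StructField("col_promote", promoteType, nullable = true)
  ))

  // Converts an Int into a value of the promoted column type.
  def promote(v: Int, target: DataType): Any = target match {
    case LongType   => v.toLong
    case DoubleType => v.toDouble
    case other      => throw new IllegalArgumentException(s"unsupported type: $other")
  }

  def batch(spark: SparkSession, rows: Seq[Row], promoteType: DataType): DataFrame =
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema(promoteType))

  def run(spark: SparkSession, basePath: String, tableType: String,
          reconcile: Boolean, promoted: DataType): DataFrame = {
    val opts = hudiOptions(tableType, reconcile)

    // Step 1: initial batch, col_promote as IntegerType (3 rows).
    val first = batch(spark,
      Seq(Row("1", 1L, "p0", 10), Row("2", 1L, "p0", 20), Row("3", 1L, "p0", 30)),
      IntegerType)
    first.write.format("hudi").options(opts).mode(SaveMode.Overwrite).save(basePath)

    // Step 2: second batch with new keys, col_promote as the promoted type (2 rows).
    val second = batch(spark,
      Seq(Row("4", 2L, "p0", promote(40, promoted)), Row("5", 2L, "p0", promote(50, promoted))),
      promoted)
    second.write.format("hudi").options(opts).mode(SaveMode.Append).save(basePath)

    // Step 3: read back; the caller asserts row count and column type.
    spark.read.format("hudi").load(basePath)
  }
}
```

   A caller would then check `result.count() == 5` and 
`result.schema("col_promote").dataType == promoted`. Per the results reported 
above, step 2 is where the `reconcile=true` cells fail.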
   
   The existing test `TestBasicSchemaEvolution.scala:78` carries a `// TODO add 
test-case for upcasting` comment. This PR partially addresses that TODO, but it 
is scoped to the reconcile interaction so the discussion stays focused.
   
   ### Impact
   
   No source code change. New test only. CI will show 4 failing test cells 
until either:
   - the production behavior changes to honor the documented promotion matrix 
under `reconcile.schema=true`, OR
   - the test is removed because the documented expectation was misread.
   
   ### Risk Level
   
   None — test-only.
   
   ### Documentation Update
   
   If the resolution is **(b)** (expected behavior), the schema-evolution docs 
should clarify that the type-promotion matrix applies only when 
`reconcile.schema=false`, or that `reconcile.schema=true` requires the incoming 
type to exactly match the table type.
   
   If the resolution is **(a)** (bug), no docs change needed — fix the 
validation in `HoodieSchemaUtils.deduceWriterSchemaWithReconcile`.
   
   ### Contributor's checklist
   
   - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [x] Change Logs and Impact were stated clearly
   - [x] Adequate tests were added if applicable (this PR IS the test; no 
production code change)
   - [ ] CI passed — **EXPECTED to FAIL on the 4 reconcile=true cells; that's 
the repro this PR is opening for discussion**

