baibaichen opened a new issue, #11683:
URL: https://github.com/apache/incubator-gluten/issues/11683

   **Labels**: enhancement, VELOX
   
   ---
   
   ## Description
   
   Enable the `GlutenParquetTypeWideningSuite` test suite for Spark 4.0 and 
4.1, which validates Parquet type widening support 
([SPARK-40876](https://issues.apache.org/jira/browse/SPARK-40876)).
   
   ### Background
   
   `GlutenParquetTypeWideningSuite` has **84 tests** covering two types of 
Parquet type conversions:
   1. **Physical→Logical type restoration**: Reading `int32 + INT(8)` as 
`TINYINT` (safe, writer guarantees value range)
   2. **Schema evolution widening**: Reading old `IntegerType` data as 
`LongType`, `DoubleType`, or `DecimalType` (Spark 4.0 feature)
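
To make the second kind of conversion concrete, here is a minimal sketch of when widening an `int32` column is lossless. `Int32Widening`, `Target`, and `isLossless` are hypothetical names for illustration and are not part of Gluten, Spark, or Velox. The decimal rule assumes only that an `int32` can have up to 10 decimal digits; the exact rule the suite enforces may differ.

```java
// Hypothetical helper for illustration only; not Gluten, Spark, or Velox code.
final class Int32Widening {
    // The largest int32, 2147483647, has 10 decimal digits.
    static final int INT32_MAX_DIGITS = 10;

    enum Target { LONG, DOUBLE, DECIMAL }

    /**
     * Returns true if every int32 value is exactly representable in the target type.
     * precision and scale are only consulted when target is DECIMAL.
     */
    static boolean isLossless(Target target, int precision, int scale) {
        switch (target) {
            case LONG:
            case DOUBLE:
                // A 64-bit integer and an IEEE 754 double (53-bit significand)
                // can both hold every 32-bit integer exactly.
                return true;
            case DECIMAL:
                // decimal(p, s) keeps p - s digits before the decimal point.
                return scale >= 0 && precision - scale >= INT32_MAX_DIGITS;
            default:
                return false;
        }
    }
}
```

Under this rule `decimal(10, 0)` and `decimal(11, 1)` can hold any `int32`, while `decimal(9, 0)` and `decimal(10, 1)` cannot.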
   
   Currently the suite is disabled with **74 out of 84 tests failing**. The 
failures fall into four categories:
   
   | Category | Count | Issue | Fix |
   |----------|-------|-------|-----|
   | A | 13 | Velox doesn't support INT→DOUBLE/REAL/DECIMAL widening | Velox C++ `convertType()` extension |
   | B | 29 | Exception type mismatch + no Decimal precision check | Exception translation + C++ precision check |
   | C | 31 | Parquet V2 encoding assertions + Decimal conversion limits | Disable native writer + test overrides + Velox C++ |
   | D | 1 | parquet-mr only decimal narrowing overflow→null | Exclude (cannot reproduce with native reader) |
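
The "Decimal precision check" in category B can be illustrated with a small sketch. It is written in Java for readability, although the table places the real check in Velox C++. `DecimalWidening` and `isLossless` are hypothetical names, and the rule shown is simply the arithmetic condition for a lossless conversion: the scale must not shrink, and the number of integer digits must not shrink. Whether the final implementation uses exactly this condition is an assumption.

```java
// Hypothetical helper for illustration only; not the Velox or Gluten implementation.
final class DecimalWidening {
    /**
     * Returns true if every decimal(fromPrecision, fromScale) value fits in
     * decimal(toPrecision, toScale) without rounding or overflow.
     */
    static boolean isLossless(int fromPrecision, int fromScale,
                              int toPrecision, int toScale) {
        int fromIntDigits = fromPrecision - fromScale; // digits before the point
        int toIntDigits = toPrecision - toScale;
        // Fractional digits must be preserved, and so must integer digits.
        return toScale >= fromScale && toIntDigits >= fromIntDigits;
    }
}
```

For example, `decimal(5, 2)` widens safely to `decimal(7, 2)` or `decimal(7, 4)`, but not to `decimal(6, 4)`, which would leave only two integer digits instead of three.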
   
   ### Plan
   
   This will be addressed in **3 PRs**:
   
   1. **PR 1 — Exception translation**: Add `translateException()` to convert 
Velox type errors to `SchemaColumnConvertNotSupportedException`. Enable the 
suite with appropriate excludes/overrides for tests that pass without C++ 
changes.
   
   2. **PR 2 — SPARK-18108 + Revert OAP**: Fix partition column type conflicts. 
Import upstream Velox [PR 
#15173](https://github.com/facebookincubator/velox/pull/15173).
   
   3. **PR 3 — Type widening implementation**: Velox C++ changes for 
INT→DOUBLE/REAL/DECIMAL and Decimal→Decimal widening. This requires an upstream 
Velox PR to land first; the remaining tests are enabled afterwards.
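
The exception translation in PR 1 could look roughly like the sketch below. Only the name `translateException()` and the target exception `SchemaColumnConvertNotSupportedException` come from this issue. The stand-in exception class, the `ExceptionTranslator` wrapper, and the `TYPE_ERROR_MARKER` string are assumptions made so the example compiles on its own; the actual Velox error text and the real method signature are not specified here.

```java
// Stand-in for Spark's SchemaColumnConvertNotSupportedException so that the
// sketch compiles without Spark on the classpath. Hypothetical, illustration only.
final class SchemaConvertNotSupported extends RuntimeException {
    SchemaConvertNotSupported(String message, Throwable cause) {
        super(message, cause);
    }
}

final class ExceptionTranslator {
    // Placeholder marker: the real Velox type-error message is not given in this issue.
    static final String TYPE_ERROR_MARKER = "UNSUPPORTED_TYPE_CONVERSION";

    /**
     * Maps a native reader error to the exception type the Spark tests expect.
     * Errors that do not look like type-conversion failures pass through unchanged.
     */
    static RuntimeException translateException(RuntimeException nativeError) {
        String message = nativeError.getMessage();
        if (message != null && message.contains(TYPE_ERROR_MARKER)) {
            // Keep the original error as the cause so no diagnostics are lost.
            return new SchemaConvertNotSupported(message, nativeError);
        }
        return nativeError;
    }
}
```

Keeping the native error as the cause means a test can assert on the translated exception type while the Velox details remain available for debugging.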
   
   ### Test Results (Target)
   
   | | Spark 4.0 | Spark 4.1 |
   |--|-----------|-----------|
   | ✅ Passed | 46 | 46 |
   | 🟢 Override (passed) | 35 | 35 |
   | ❌ Excluded | 3 | 3 |
   | **Total** | **84** | **84** |
   
   Sub-issue of #11550.
   
   This issue was written with the assistance of AI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

