baibaichen opened a new pull request, #11689:
URL: https://github.com/apache/incubator-gluten/pull/11689

   ## What changes were proposed in this pull request?
   
   Fix two Parquet reading issues and enable two more tests in `GlutenParquetTypeWideningSuite`.
   
   ### Changes
   
   1. **Velox: Replace OAP INT narrowing with upstream PR #15173** 
(`get-velox.sh`):
      - Skip OAP commit `16732b4f5` (`[OAP][15173][15343] Allow reading integers into smaller-range types`), which relaxed the `convertType()` type checks too far
      - Import upstream Velox [PR #15173](https://github.com/facebookincubator/velox/pull/15173) (fix reading array of row) to fix parquet-thrift compatibility
      - Add bare INT32 → TINYINT/SMALLINT support (physical type restoration)
   
   2. **Fix SPARK-18108** (`SubstraitToVeloxPlan.cc`): Exclude partition 
columns from `HiveTableHandle.dataColumns()` to prevent type validation 
failures when partition column types differ from file column types.
   
   3. **Update VeloxTestSettings** (`spark40 + spark41`): Remove the two excludes for the `LongType→IntegerType` and `LongType→DateType` tests, which now pass.
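
The idea behind change 2 can be illustrated with a small model. This is only a sketch in Python, not the C++ code in `SubstraitToVeloxPlan.cc`: the names `Column`, `data_columns`, and `mismatched_columns` are hypothetical and do not correspond to any Velox or Gluten API, and the type names are plain strings chosen for illustration. It assumes a SPARK-18108-style layout in which the file physically stores a column that is also a partition column, with a different type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a table column description."""
    name: str
    type_name: str
    is_partition: bool = False


def data_columns(table_columns):
    """Keep only the columns that should be validated against the file."""
    return [c for c in table_columns if not c.is_partition]


def mismatched_columns(columns, file_schema):
    """Names of columns whose declared type differs from the file's type."""
    return [
        c.name
        for c in columns
        if c.name in file_schema and file_schema[c.name] != c.type_name
    ]


# Assumed scenario: partition column `a` is INTEGER in the table schema,
# but the Parquet file also contains `a`, stored as BIGINT.
table_columns = [
    Column("id", "BIGINT"),
    Column("a", "INTEGER", is_partition=True),
]
file_schema = {"id": "BIGINT", "a": "BIGINT"}

# Validating every table column flags the partition column.
print(mismatched_columns(table_columns, file_schema))
# Validating only the data columns does not.
print(mismatched_columns(data_columns(table_columns), file_schema))
```

In this model the partition column's value would come from the partition path rather than the file, so leaving it out of the validated set avoids a spurious type conflict.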
   
   ### Test Results
   
   | | PR1 (#11684) | PR2 (this PR) |
   |--|-----|-----|
   | ✅ Passed | 45 | 47 (+2) |
   | ❌ Excluded | 39 | 37 (-2) |
   
   Additionally fixed (not in TypeWideningSuite):
   - SPARK-18108 Parquet reader fails when data column types conflict with 
partition ones (V1+V2)
   - Read Parquet file generated by parquet-thrift
   
   Depends on #11684 (PR1).
   Fixes #11683
   
   ## How was this patch tested?
   
   Local tests: TypeWideningSuite 47/0/58, SPARK-18108 ✅, parquet-thrift ✅, ORC 
Decimal ✅, VeloxScanSuite index schema evolution ✅.
   
   ## Was this patch authored or co-authored using generative AI tooling?
   
   Yes, co-authored with GitHub Copilot.



