baibaichen opened a new pull request, #11689:
URL: https://github.com/apache/incubator-gluten/pull/11689
## What changes were proposed in this pull request?
Fix two Parquet reading issues and enable two more tests in `GlutenParquetTypeWideningSuite`.
### Changes
1. **Velox: Replace OAP INT narrowing with upstream PR #15173** (`get-velox.sh`):
   - Skip OAP commit `16732b4f5` (`[OAP][15173][15343] Allow reading integers into smaller-range types`), which over-relaxed the type checks in `convertType()`
   - Import upstream Velox [PR #15173](https://github.com/facebookincubator/velox/pull/15173) (fix reading array of row) to fix parquet-thrift compatibility
   - Add bare INT32 → TINYINT/SMALLINT support (physical type restoration)
2. **Fix SPARK-18108** (`SubstraitToVeloxPlan.cc`): Exclude partition columns from `HiveTableHandle.dataColumns()` to prevent type validation failures when a partition column's type differs from the type of the same-named column in the file.
3. **Update VeloxTestSettings** (`spark40 + spark41`): Remove the 2 excludes for `LongType→IntegerType` and `LongType→DateType`, both of which now pass.
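To illustrate the intent of item 1, here is a minimal sketch of this kind of type check, written in Python for brevity. It is not Velox's `convertType()`: the names `ALLOWED_READS` and `can_read` are hypothetical, and the rule table is an assumption built only from the cases named above. Parquet has no 8-bit or 16-bit physical integer type, so byte and short columns are stored as INT32, which is why reading a bare INT32 column as TINYINT or SMALLINT restores the original type rather than narrowing it. The sketch also assumes that INT64 → INTEGER and INT64 → DATE reads are meant to be rejected.

```python
# Hypothetical rule table for illustration only; not Velox's convertType().
# Keys are Parquet physical types, values are the requested types a reader
# would accept for a column that carries no logical-type annotation.
ALLOWED_READS = {
    # INT32 is the physical carrier for 8/16/32-bit integers in Parquet.
    "INT32": {"TINYINT", "SMALLINT", "INTEGER"},
    # Assumption: a bare INT64 column may only be read back as BIGINT.
    "INT64": {"BIGINT"},
}


def can_read(physical_type: str, requested_type: str) -> bool:
    """Return True if the requested type is accepted for the physical type."""
    return requested_type in ALLOWED_READS.get(physical_type, set())
```

Under these assumptions, `can_read("INT32", "SMALLINT")` is accepted, while `can_read("INT64", "INTEGER")` and `can_read("INT64", "DATE")` are rejected. Widening conversions are deliberately left out of the table, since this description does not list them.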
### Test Results
| Result | PR1 (#11684) | PR2 (this PR) |
|--------|--------------|---------------|
| ✅ Passed | 45 | 47 (+2) |
| ❌ Excluded | 39 | 37 (-2) |
Additionally fixed (not in TypeWideningSuite):
- SPARK-18108: Parquet reader fails when data column types conflict with partition ones (V1+V2)
- Read Parquet file generated by parquet-thrift
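The SPARK-18108 fix can be pictured as a filter over the table schema. The sketch below is a simplified Python illustration, not the code in `SubstraitToVeloxPlan.cc`; the function name `data_columns` and the `(name, type)` tuple representation are hypothetical. It assumes that only the columns returned here are validated against the types stored in the Parquet file, so a partition column whose type differs from a same-named file column no longer triggers a validation failure.

```python
from typing import List, Sequence, Tuple

Column = Tuple[str, str]  # (column name, type name); hypothetical representation


def data_columns(table_schema: Sequence[Column],
                 partition_columns: Sequence[str]) -> List[Column]:
    """Return the table columns that are not partition columns.

    Order is preserved, so the result can stand in for the file-facing
    part of the schema.
    """
    partitions = set(partition_columns)
    return [(name, typ) for name, typ in table_schema if name not in partitions]


# Example: `p` is a partition column, so it is left out of the data columns.
schema = [("a", "BIGINT"), ("b", "VARCHAR"), ("p", "INTEGER")]
print(data_columns(schema, ["p"]))  # [('a', 'BIGINT'), ('b', 'VARCHAR')]
```

Partition values come from the directory layout rather than from the file contents, which is why leaving them out of the file-facing schema is a natural place to avoid the conflict.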
Depends on #11684 (PR1).
Fixes #11683
## How was this patch tested?
Local tests: TypeWideningSuite 47/0/58, SPARK-18108 ✅, parquet-thrift ✅, ORC
Decimal ✅, VeloxScanSuite index schema evolution ✅.
## Was this patch authored or co-authored using generative AI tooling?
Yes, co-authored with GitHub Copilot.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]