Prathamesh9284 commented on issue #690:
URL: https://github.com/apache/wayang/issues/690#issuecomment-3914300188
Hi @zkaoudi
### Root Cause
In `WayangTableScanVisitor.java` (line 67), the `fieldTypes` list is built
using:
```java
wayangRelNode.getRowType()
```
This returns the **RelNode’s row type**, which in certain cases may contain
fewer fields than the actual table schema (e.g., when Calcite optimizes away
unused columns).
However, the CSV source always reads **all columns from disk**, leading to a
mismatch between `tokens.length` and `fieldTypes.size()`, which causes runtime
issues.
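
To make the mismatch concrete, here is a minimal sketch in plain Java. The `parseLine` helper and both schema lists are hypothetical stand-ins, not Wayang's actual parsing code. The explicit length check is an assumption made to surface the problem; the real failure mode may differ.

```java
import java.util.List;

// Hypothetical stand-in for a positional CSV parser; not Wayang's actual code.
final class SchemaMismatchSketch {

    // Full table schema: four columns on disk (assumed for illustration).
    static final List<Class<?>> FULL_SCHEMA =
            List.of(Integer.class, String.class, Integer.class, String.class);

    // Trimmed row type, as if the planner had pruned three unused columns.
    static final List<Class<?>> TRIMMED_SCHEMA = List.of(Integer.class);

    // Converts each token by position, so it needs one type per token.
    static Object[] parseLine(String line, List<Class<?>> fieldTypes) {
        String[] tokens = line.split(",");
        if (tokens.length != fieldTypes.size()) {
            throw new IllegalStateException(
                    "Read " + tokens.length + " tokens but have "
                            + fieldTypes.size() + " field types");
        }
        Object[] row = new Object[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i].trim();
            row[i] = fieldTypes.get(i) == Integer.class
                    ? (Object) Integer.valueOf(token)
                    : token;
        }
        return row;
    }
}
```

Calling `parseLine("1,alice,30,berlin", FULL_SCHEMA)` succeeds, while the same line with `TRIMMED_SCHEMA` is rejected because four tokens meet a single field type.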
---
### Proposed Fix
Update line 67 in `WayangTableScanVisitor.java`:
**From:**
```java
final List<RelDataType> fieldTypes =
wayangRelNode.getRowType().getFieldList().stream()
```
**To:**
```java
final List<RelDataType> fieldTypes =
wayangRelNode.getTable().getRowType().getFieldList().stream()
```
Using `getTable().getRowType()` ensures we always retrieve the full table
schema, which aligns with how `getColumnNames()` is implemented in
`WayangTableScan.java` (line 98).
Column pruning is still correctly handled downstream by the `WayangProject`
operator via a `MapOperator`.
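
The division of labour described above can be sketched as follows: the scan parses every column, and a later map step keeps only the requested ones. `scanAll` and `project` are hypothetical helpers standing in for the table scan and the `MapOperator`-backed projection; they are not Wayang APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: read all columns first, prune them in a later step.
final class ScanThenProjectSketch {

    // Stand-in for the scan: split every line into all of its columns.
    static List<String[]> scanAll(List<String> lines) {
        return lines.stream()
                .map(line -> Arrays.stream(line.split(","))
                        .map(String::trim)
                        .toArray(String[]::new))
                .collect(Collectors.toList());
    }

    // Stand-in for the projection: keeps only the given column indices.
    static Function<String[], String[]> project(int... keep) {
        return row -> Arrays.stream(keep)
                .mapToObj(i -> row[i])
                .toArray(String[]::new);
    }
}
```

Because the pruning happens in `project`, the scan can always use the full schema without returning unwanted columns to the caller.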
---
### Testing
I added a regression test using Mockito that:
* Simulates a `WayangTableScan` whose row type has been trimmed to 1 field
* Keeps the underlying table schema at 4 fields
* Reproduces the exact scenario described in this issue
The test:
* Fails before the fix
* Passes after applying the fix

All existing tests continue to pass.
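
The shape of that scenario can be illustrated without Mockito, using a hand-written stub. The `ScanNode` interface and the two `fieldTypes...` helpers below are hypothetical simplifications of the Calcite/Wayang types, with strings in place of `RelDataType`; the actual test mocks the real classes.

```java
import java.util.List;

// Hypothetical, simplified model of the regression scenario.
final class RegressionScenarioSketch {

    // Simplified view of a table scan node.
    interface ScanNode {
        List<String> rowType();       // may be trimmed by the planner
        List<String> tableRowType();  // full schema of the underlying table
    }

    // Before the fix: field types come from the node's own row type.
    static List<String> fieldTypesBeforeFix(ScanNode node) {
        return node.rowType();
    }

    // After the fix: field types come from the table's full schema.
    static List<String> fieldTypesAfterFix(ScanNode node) {
        return node.tableRowType();
    }

    // Stub with a 1-field trimmed row type over a 4-field table.
    static ScanNode trimmedScan() {
        return new ScanNode() {
            @Override
            public List<String> rowType() {
                return List.of("INTEGER");
            }

            @Override
            public List<String> tableRowType() {
                return List.of("INTEGER", "VARCHAR", "INTEGER", "VARCHAR");
            }
        };
    }
}
```

With four columns in the CSV file, only `fieldTypesAfterFix(trimmedScan())` yields a list whose size matches the number of tokens per line.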
---
If this approach looks correct, I’ll proceed with opening a PR including the
fix and the regression test.
Looking forward to feedback!