pchintar opened a new issue, #9805:
URL: https://github.com/apache/arrow-rs/issues/9805
### Description
When reading IPC data with column projection enabled, skipping a `ListView`
or `LargeListView` column can lead to buffer misalignment and incorrect
decoding of subsequent columns.
---
### Root Cause
In `arrow-ipc/src/reader.rs`, `skip_field` does not handle `ListView` /
`LargeListView` explicitly. As a result, these types fall through to the
default case:
```rust
_ => {
    self.skip_buffer();
    self.skip_buffer();
}
```
However, `ListView` columns are encoded with three buffers:
* null buffer
* offsets
* sizes
And `create_array` correctly consumes all three:
```rust
self.next_buffer()?; // null
self.next_buffer()?; // offsets
self.next_buffer()?; // sizes
```
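Because `skip_field` advances past only two of the three buffers, the buffer
cursor is left pointing into the skipped column's data, and every column read
after it during projection is decoded from the wrong buffers.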
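
The off-by-one can be illustrated with a toy model. This is only a sketch: `BufferCursor` and `buffer_after_skipping` are hypothetical names, not arrow-rs types, and the model ignores field nodes and the child array's buffers to keep the focus on the buffer count.

```rust
/// Toy stand-in for the reader's buffer cursor (hypothetical, not the arrow-rs type).
struct BufferCursor {
    buffers: Vec<&'static str>,
    pos: usize,
}

impl BufferCursor {
    /// Advance past one buffer without reading it.
    fn skip_buffer(&mut self) {
        self.pos += 1;
    }

    /// Return the label of the buffer under the cursor, then advance.
    fn next_buffer(&mut self) -> Option<&'static str> {
        let buf = self.buffers.get(self.pos).copied();
        self.pos += 1;
        buf
    }
}

/// Skip `skipped` buffers of column `a`, then report which buffer the reader
/// would treat as the first buffer of column `b`.
fn buffer_after_skipping(skipped: usize) -> Option<&'static str> {
    let mut cursor = BufferCursor {
        // Simplified layout: a is a ListView (child omitted), b is Int32.
        buffers: vec!["a.validity", "a.offsets", "a.sizes", "b.validity", "b.values"],
        pos: 0,
    };
    for _ in 0..skipped {
        cursor.skip_buffer();
    }
    cursor.next_buffer()
}
```

In this model, `buffer_after_skipping(2)` yields `a.sizes`, so column `b` would start decoding from the wrong buffer, while `buffer_after_skipping(3)` yields `b.validity` as intended.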
---
### Impact
* Can lead to:
  * incorrect decoding of subsequent columns
  * runtime errors (e.g., invalid buffer sizes)
* Only occurs when:
  * projection is enabled
  * a `ListView` / `LargeListView` column is skipped
---
### Reproduction
A minimal test case:
```rust
// Schema:
// a: ListView<Int32> (skipped)
// b: Int32 (projected)
let reader = FileReader::try_new(cursor, Some(vec![1]))?;
```
Before fix:
```
InvalidArgumentError("Need at least 16 bytes in buffers[0] in array of type Int32, but got 1")
```
---
### Proposed Fix
Handle `ListView` and `LargeListView` in `skip_field` so that the number of
skipped buffers matches how these types are encoded and read elsewhere in the
IPC reader.
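
As a rough sketch of the accounting involved, the toy function below counts how many buffers a skip routine has to pass over for a column, including the child array's buffers, which follow the parent's in the flattened IPC buffer list. `ToyType` and `buffers_to_skip` are hypothetical names used only for illustration; this is not the arrow-rs `DataType` or the actual patch, which may be structured differently.

```rust
/// Toy subset of column types (hypothetical, not arrow's `DataType`).
enum ToyType {
    Int32,
    List(Box<ToyType>),
    ListView(Box<ToyType>),
}

/// Number of buffers to pass over when skipping a column of this type.
fn buffers_to_skip(ty: &ToyType) -> usize {
    match ty {
        // validity + values
        ToyType::Int32 => 2,
        // validity + offsets, followed by the child's buffers
        ToyType::List(child) => 2 + buffers_to_skip(child),
        // validity + offsets + sizes, followed by the child's buffers
        ToyType::ListView(child) => 3 + buffers_to_skip(child),
    }
}
```

The point of the explicit `ListView` arm is that the list-view layout carries one more buffer than the two assumed by the fallback case, so it cannot share that case.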