pchintar opened a new pull request, #9806:
URL: https://github.com/apache/arrow-rs/pull/9806
# Which issue does this PR close?
- Closes #9805.
# Rationale for this change
When reading IPC data with column selection enabled, skipping a `ListView`
or `LargeListView` column can lead to buffer misalignment and incorrect
decoding of subsequent columns.
In `arrow-ipc/src/reader.rs`, `skip_field` currently does not handle these
types explicitly and falls back to the default case:
```rust
_ => {
    self.skip_buffer();
    self.skip_buffer();
}
```
However, `create_array` for `ListView` / `LargeListView` reads three buffers:
```rust
self.next_buffer()?; // validity (null) bitmap
self.next_buffer()?; // offsets
self.next_buffer()?; // sizes
```
This mismatch means that skipping a `ListView` column consumes only two
buffers instead of three. As a result, the next column reads from the wrong
buffer positions, which can lead to runtime errors or incorrect values.
This change aligns the skip behavior with the read path to ensure buffers
remain correctly aligned when columns are skipped.
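The misalignment can be illustrated with a small toy model. This is only a sketch under stated assumptions: `BufferCursor`, `skip_buffers`, and `read_after_skip` are hypothetical names, not the `arrow-ipc` reader API, buffers are represented by labels rather than bytes, and the `ListView` child data is left out so that only the column's own three buffers are tracked.

```rust
/// Hypothetical stand-in for the reader's position in the flat buffer list.
/// This is NOT the arrow-ipc type; it only models "which buffer comes next".
struct BufferCursor {
    buffers: Vec<&'static str>,
    pos: usize,
}

impl BufferCursor {
    fn new(buffers: Vec<&'static str>) -> Self {
        Self { buffers, pos: 0 }
    }

    /// Return the label of the next buffer (if any) and advance the cursor.
    fn next_buffer(&mut self) -> Option<&'static str> {
        let buffer = self.buffers.get(self.pos).copied();
        self.pos += 1;
        buffer
    }

    /// Advance past `n` buffers without reading them.
    fn skip_buffers(&mut self, n: usize) {
        self.pos += n;
    }
}

/// Skip a ListView column using `skipped` buffers, then read the following
/// Int32 column, which consumes a validity buffer and a values buffer.
/// Assumption: child data of the ListView is omitted from this toy layout.
fn read_after_skip(skipped: usize) -> (Option<&'static str>, Option<&'static str>) {
    let mut cursor = BufferCursor::new(vec![
        "list_view.validity",
        "list_view.offsets",
        "list_view.sizes",
        "int32.validity",
        "int32.values",
    ]);
    cursor.skip_buffers(skipped);
    let validity = cursor.next_buffer();
    let values = cursor.next_buffer();
    (validity, values)
}
```

In this model, `read_after_skip(2)` (the default-case behavior) hands the `Int32` column the `list_view.sizes` and `int32.validity` buffers, while `read_after_skip(3)` hands it `int32.validity` and `int32.values` as intended.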
# What changes are included in this PR?
* Updated `skip_field` in:
  * `arrow-ipc/src/reader.rs`
* Added explicit handling for:
  * `ListView`
  * `LargeListView`
* Ensures the number of skipped buffers matches how these types are encoded
  and read.
# Are these changes tested?
Yes.
* Added a regression test:
  * `test_projection_skip_list_view` in `arrow-ipc/src/reader.rs`
* The test:
  * creates a batch with a `ListView` column followed by a primitive column
  * reads only the second column
  * verifies the result matches expected values
Before the fix, this test failed with a buffer size error:
```
InvalidArgumentError("Need at least 16 bytes in buffers[0] in array of type
Int32, but got 1")
```
After the changes made in `skip_field`, it passes.
All existing `arrow-ipc` tests also pass.
# Are there any user-facing changes?
No.