Kontinuation opened a new pull request, #20393:
URL: https://github.com/apache/datafusion/pull/20393
## Which issue does this PR close?
N/A (newly discovered bug)
## Rationale for this change
`ForeignTableProvider::scan()` converts a `None` projection (meaning "return
all columns") into an empty `RVec<usize>` before passing it across the FFI
boundary. On the receiving side, `scan_fn_wrapper` always wraps the received
`RVec` in `Some(...)`, passing `Some(&vec![])` to the inner
`TableProvider::scan()`. This means "project zero columns" — the exact opposite
of the intended "project all columns."
The root cause is that the `FFI_TableProvider::scan` function signature uses
`RVec<usize>` for the projections parameter. Since `RVec<usize>` cannot
represent `None`, the `None` vs. empty-vec distinction is lost at the FFI
boundary.
## What changes are included in this PR?
Three coordinated changes in `datafusion/ffi/src/table_provider.rs`:
1. **FFI struct definition**: Changed `scan` function pointer signature from
`RVec<usize>` to `ROption<RVec<usize>>` for the projections parameter, matching
how `limit` already uses `ROption<usize>` for the same `None`-vs-value
distinction.
2. **Receiver side** (`scan_fn_wrapper`): Converts `ROption<RVec<usize>>`
via `.into_option().map(...)` and passes `projections.as_ref()` to the inner
provider, preserving `None` semantics.
3. **Sender side** (`ForeignTableProvider::scan`): Converts
`Option<&Vec<usize>>` to `ROption<RVec<usize>>` via `.into()` instead of using
`unwrap_or_default()`.
Plus a new unit test `test_scan_with_none_projection_returns_all_columns`
that directly exercises the FFI round-trip with `projection=None` and verifies
all 3 columns are returned.
Also fixed the existing `test_aggregation` test to set `library_marker_id =
mock_foreign_marker_id` so it actually exercises the FFI path instead of taking
the local bypass.
## How are these changes tested?
- New test `test_scan_with_none_projection_returns_all_columns`: creates a
3-column MemTable, wraps it through FFI with the foreign marker set, calls
`scan(None)`, and asserts 3 columns are returned (previously returned 0).
- All 89 existing FFI tests pass (78 unit + 11 integration including
cross-dylib round-trips).
- Full `cargo test --workspace` passes (the only failures are pre-existing
sqllogictest `aggregate.slt` failures on `main` due to test data submodule
state — unrelated to this change).
- FFI example crates (`ffi_module_interface`, `ffi_example_table_provider`)
build successfully.
## Are these changes safe?
This is a **breaking FFI ABI change** to the `FFI_TableProvider::scan`
function pointer signature. However:
- The `abi_stable` crate's `#[derive(StableAbi)]` generates layout checks at
dylib load time, so mismatched dylibs will be caught at load rather than
causing silent corruption.
- All existing providers construct `FFI_TableProvider` via `::new()` or
`::new_with_ffi_codec()`, which internally wire up `scan_fn_wrapper` — nobody
constructs the `scan` function pointer manually.
## Root cause detail
**Sender** (`ForeignTableProvider::scan`, line ~477 on main):
```rust
let projections: Option<RVec<usize>> =
projection.map(|p| p.iter().map(|v| v.to_owned()).collect());
projections.unwrap_or_default(), // BUG: None becomes empty RVec
```
**Receiver** (`scan_fn_wrapper`, line ~272-276 on main):
```rust
let projections: Vec<_> = projections.into_iter().collect();
internal_provider
.scan(session, Some(&projections), ...) // Always Some — empty vec = 0
columns
```
This bug is masked when `optimize_projections` pushes down explicit
projections before reaching the FFI layer. It surfaces when custom
`ExtensionPlanNode` implementations cause `optimize_projections` to skip a
subtree, leaving `projection: None` on child `TableScan` nodes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]