alamb opened a new pull request, #10054: URL: https://github.com/apache/arrow-rs/pull/10054
# Which issue does this PR close? Follow-up / test coverage for #9755 (`arrow-select: fuse inline Utf8View/BinaryView filter coalescing`). > [!NOTE] > This branch is **stacked on top of #9755** — that PR is not yet merged to `main`, so the diff here also shows its feature commit (`f546a78`). The contribution in this PR is the **single test commit** on top; the intent is to add it to #9755 (or merge alongside it). Reviewers should focus on the `*tests*` changes. # Rationale for this change While reviewing #9755 I found, via `cargo llvm-cov`, that the new fused inline-view copy fast path had **zero test coverage**. The path is only chosen when the filter is sparse enough (`selected * 16 <= filter.len()`, i.e. ≤6.25% selectivity), but every `*_filtered_inline` test in the PR uses `sparse_filter` = `idx % 8 == 0` (12.5%), so they all fall back to the materialized `filter` kernel before reaching the fused code. llvm-cov on the unmodified PR showed **0 executions** of: - `push_batch_with_fused_inline_filter` copy tail (`coalesce.rs`) - `append_inline_views_by_filter` and `append_nulls_by_filter` (`coalesce/byte_view.rs`) - the success path of `InProgressByteViewArray::copy_rows_by_filter` # What changes are included in this PR? Tests only — no production code is touched: - **`coalesce.rs`** — end-to-end `BatchCoalescer` tests at 5% selectivity (`idx % 20 == 0`) with all-inline values: single `BinaryView`, single `StringView`, a mixed `Int32`/`Float64`/`StringView`/`BinaryView` batch (which trips `FilterBuilder::optimize`, exercising the *materialized* selection), and a case whose buffered rows cross `target_batch_size` so the in-loop `finish_buffered_batch` flush is covered. - **`coalesce/byte_view.rs`** — direct `copy_rows_by_filter` unit tests covering the lazy and materialized `Indices` and `Slices` arms of `append_inline_views_by_filter` for both view types, asserted against the reference `filter` kernel. - **`coalesce/primitive.rs`** — a `Slices`-selection test covering the `copy_rows_by_selection` fallback arm of the primitive override. After these tests, every **reachable** line of the fused path is covered. The unsafe view copy passes under **MIRI** for all new tests. # Are there any user-facing changes? No. --- ### Note for the author of #9755 (not blocking) The `None` and `All` arms of `append_inline_views_by_filter` are **unreachable in production** (the dispatch shortcuts all-selected at `coalesce.rs` `selected_count == num_rows` → `push_batch`, and empty at `selected_count == 0` → early return), so I did not add tests that pretend they are live paths. Worth noting: those two arms are also subtly inconsistent — if `copy_rows_by_filter` were ever called with an `All`/`None`-strategy predicate on a byte-view array **that has nulls**, `append_nulls_by_filter` → `filter_null_mask` → `filter_bits` would hit `unreachable!()` and panic, even though `append_inline_views_by_filter` handles those arms. Consider either dropping the `None`/`All` arms (matching `filter_bits`) or making the two helpers consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
