alamb opened a new pull request, #10054:
URL: https://github.com/apache/arrow-rs/pull/10054

   # Which issue does this PR close?
   
   Follow-up / test coverage for #9755 (`arrow-select: fuse inline 
Utf8View/BinaryView filter coalescing`).
   
   > [!NOTE]
   > This branch is **stacked on top of #9755** — that PR is not yet merged to 
`main`, so the diff here also shows its feature commit (`f546a78`). The 
contribution in this PR is the **single test commit** on top; the intent is to 
add it to #9755 (or merge alongside it). Reviewers should focus on the 
`*tests*` changes.
   
   # Rationale for this change
   
   While reviewing #9755 I found, via `cargo llvm-cov`, that the new fused 
inline-view copy fast path had **zero test coverage**. The path is only chosen 
when the filter is sparse enough (`selected * 16 <= filter.len()`, i.e. ≤6.25% 
selectivity), but every `*_filtered_inline` test in the PR uses `sparse_filter` 
= `idx % 8 == 0` (12.5%), so they all fall back to the materialized `filter` 
kernel before reaching the fused code.
   
   llvm-cov on the unmodified PR showed **0 executions** of:
   - `push_batch_with_fused_inline_filter` copy tail (`coalesce.rs`)
   - `append_inline_views_by_filter` and `append_nulls_by_filter` 
(`coalesce/byte_view.rs`)
   - the success path of `InProgressByteViewArray::copy_rows_by_filter`
   
   # What changes are included in this PR?
   
   Tests only — no production code is touched:
   
   - **`coalesce.rs`** — end-to-end `BatchCoalescer` tests at 5% selectivity 
(`idx % 20 == 0`) with all-inline values: single `BinaryView`, single 
`StringView`, a mixed `Int32`/`Float64`/`StringView`/`BinaryView` batch (which 
trips `FilterBuilder::optimize`, exercising the *materialized* selection), and 
a case whose buffered rows cross `target_batch_size` so the in-loop 
`finish_buffered_batch` flush is covered.
   - **`coalesce/byte_view.rs`** — direct `copy_rows_by_filter` unit tests 
covering the lazy and materialized `Indices` and `Slices` arms of 
`append_inline_views_by_filter` for both view types, asserted against the 
reference `filter` kernel.
   - **`coalesce/primitive.rs`** — a `Slices`-selection test covering the 
`copy_rows_by_selection` fallback arm of the primitive override.
   
   After these tests, every **reachable** line of the fused path is covered. 
The unsafe view copy passes under **MIRI** for all new tests.
   
   # Are there any user-facing changes?
   
   No.
   
   ---
   
   ### Note for the author of #9755 (not blocking)
   
   The `None` and `All` arms of `append_inline_views_by_filter` are 
**unreachable in production** (the dispatch shortcuts all-selected at 
`coalesce.rs` `selected_count == num_rows` → `push_batch`, and empty at 
`selected_count == 0` → early return), so I did not add tests that pretend they 
are live paths. Worth noting: those two arms are also subtly inconsistent — if 
`copy_rows_by_filter` were ever called with an `All`/`None`-strategy predicate 
on a byte-view array **that has nulls**, `append_nulls_by_filter` → 
`filter_null_mask` → `filter_bits` would hit `unreachable!()` and panic, even 
though `append_inline_views_by_filter` handles those arms. Consider either 
dropping the `None`/`All` arms (matching `filter_bits`) or making the two 
helpers consistent.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to