zanmato1984 commented on PR #47377:
URL: https://github.com/apache/arrow/pull/47377#issuecomment-4303926566

   @pitrou Thanks a lot for taking the time and for the candid feedback - 
totally understood on the time constraints.
   
   I agree the execution layer is already overly complicated. In this PR I 
tried to keep the added complexity moderate and localized: when no selection 
vector is present, the existing execution path is unchanged (aside from a 
couple of null checks in the iterator). The selection-aware path is only 
activated when an ExecBatch carries a `SelectionVector`, and kernels can opt 
into a single additional entry point (`selective_exec`) with a well-defined 
fallback (gather/Take -> exec -> scatter). If it would make review easier, I’m 
also happy to split this into smaller PRs.
   
   On performance: the selection path is only intended to be exercised from the 
upcoming special-form work (#47374), as a narrow and semantically explicit 
entry point. The primary goal there is semantic correctness under vectorized 
execution (e.g. guarding against errors like division-by-zero in rows where the 
condition is false), and I think paying some overhead is acceptable for that. I 
posted benchmark results earlier 
(https://github.com/apache/arrow/pull/47377#issuecomment-3287513306); in 
summary, sparse execution wins strongly at low selectivity, while the worst 
regressions (up to ~4x) are from the generic dense fallback when a kernel 
doesn’t provide `selective_exec` (extra gather/scatter calls), rather than from 
indirection itself.
   
   Chunked inputs: yes, this PR works with chunked arrays. Both the chunked + 
selection cases are covered by unit tests 
(`TestCallScalarFunctionPreallocationCases.ChunkedSelectiveSparse/Dense` in 
`cpp/src/arrow/compute/exec_test.cc`). Output chunking may differ (often a 
single chunk), but semantics are preserved.
   
   Finally, I hear you on representation: an index-only SelectionVector loses 
contiguous-span information and may limit batch/SIMD opportunities. I haven’t 
fully designed a mixed representation yet, but I agree it may be the right 
long-term direction. If you have a preferred API shape (hybrid runs+indices, a 
generalized “selection” object, or a different exec signature), I’d really 
appreciate guidance - I’d rather adjust before we cement the API.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to