zanmato1984 commented on issue #41094:
URL: https://github.com/apache/arrow/issues/41094#issuecomment-2137580313

   Hi @felipecrv, I'm interested in this and have been working on a local 
proof of concept (thanks for the instructive proposal!). One thing I'm having 
trouble with is how to let kernels become "selection vector aware" incrementally:
   > ### Incremental improvement
   > It's unrealistic to expect we will specialize all kernel functions to 
handle the optional selection vector parameter, but we should at least push 
them all the way down to `Function::Execute`. If a `Function` doesn't handle 
selection vectors, then a call is dispatched for all the values (as it is 
today) and code similar to `Take` is used to gather the values that the 
selection vector selects. The more functions become aware of selection vectors, 
the less we have to rely on this slow code.
   
   To call a kernel that is not yet selection vector aware, we not only have 
to `Take` the selected rows before passing them in, but also "scatter" the 
"partial" output by putting each row back at its corresponding position in the 
original input. However, I couldn't find any existing facility for this scatter 
operation that is as handy as `Take`/`Filter` are for gather. Do you know of 
one? Or does a `Scatter` function, or something more lightweight like 
`Concatenate` [1], need to be implemented first? Thanks.
   
   [1] 
https://github.com/apache/arrow/blob/4a2df663bc88c73b863e0c0036160f7f936574c2/cpp/src/arrow/array/concatenate.h#L34-L35
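
   To make the gather/scatter round trip concrete, here is a minimal sketch in plain Python. It is not Arrow code: `gather`, `scatter`, and `call_unaware_kernel` are hypothetical helper names, arrays are modeled as Python lists, and `None` stands in for a null slot. It only illustrates the semantics I have in mind, assuming unselected rows come back as nulls.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def gather(values: Sequence[T], selection: Sequence[int]) -> List[T]:
    """Pick the selected rows, analogous in spirit to `Take`."""
    return [values[i] for i in selection]


def scatter(partial: Sequence[U], selection: Sequence[int],
            length: int) -> List[Optional[U]]:
    """Put each partial result back at its original row position.

    Rows that were not selected are left as None (assumption: they
    would be null in the full-length output).
    """
    if len(partial) != len(selection):
        raise ValueError("partial output and selection must have equal length")
    out: List[Optional[U]] = [None] * length
    for value, pos in zip(partial, selection):
        out[pos] = value
    return out


def call_unaware_kernel(kernel: Callable[[List[T]], List[U]],
                        values: Sequence[T],
                        selection: Sequence[int]) -> List[Optional[U]]:
    """Fallback path for a kernel that ignores selection vectors."""
    selected = gather(values, selection)      # dense input for the kernel
    partial = kernel(selected)                # kernel sees selected rows only
    return scatter(partial, selection, len(values))


# Usage: add one to rows 1 and 3 of a five-row input.
result = call_unaware_kernel(lambda xs: [x + 1 for x in xs],
                             [10, 20, 30, 40, 50], [1, 3])
```

   With this input, `result` has the same length as the original five rows, with computed values at positions 1 and 3 and `None` everywhere else. A real implementation would also have to deal with validity bitmaps, chunked inputs, and nested types, which this sketch ignores.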

