zanmato1984 commented on issue #41094:
URL: https://github.com/apache/arrow/issues/41094#issuecomment-2144240210

   > In the implementation, maybe we can first implement gather-scatter? Since 
Arrow doesn't currently guarantee that a data type supports unordered writes 
(which is different from Velox)
   
   1) According to @felipecrv, we don't have an existing "scatter" function, 
so yes, we'll have to implement that first.
   2) Once this "scatter" function is available, we can build a "naive" 
evaluation of special forms on top of it (together with "gather"): one 
centralized step that gathers the input by the selection vector, passes the 
selected rows to the selection-unaware ("dumb") kernel, and scatters the 
kernel's output back into the actual output. This requires NO awareness of the 
selection vector in any kernel, which makes an incremental approach to 3 
possible.
   3) Gradually make kernels accept the selection vector as an optional 
parameter, i.e., become selection-vector-aware. Once that is all done, we 
don't need 2 any more.
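
   To make 2 and 3 a bit more concrete, here is a rough sketch in plain Python 
lists. It is not Arrow code and not a proposed API: `gather`, `scatter`, 
`eval_with_selection`, `add_one` and `add_one_selective` are all made-up names 
for illustration, and using `None` for the rows that were not selected is just 
an assumption of the sketch.

```python
from typing import Callable, List, Optional, Sequence


def gather(values: Sequence, selection: Sequence[int]) -> List:
    """Collect values[i] for each selected row index, in selection order."""
    return [values[i] for i in selection]


def scatter(dense: Sequence, selection: Sequence[int], length: int) -> List:
    """Write dense[k] to position selection[k] of an output of `length` rows.

    Unselected rows stay None (an assumption of this sketch).
    """
    out: List = [None] * length
    for position, value in zip(selection, dense):
        out[position] = value
    return out


def eval_with_selection(
    kernel: Callable[[List], List],
    values: Sequence,
    selection: Sequence[int],
) -> List:
    """Step 2: the kernel knows nothing about the selection vector."""
    dense_input = gather(values, selection)    # selected rows only
    dense_output = kernel(dense_input)         # plain, "dumb" kernel
    return scatter(dense_output, selection, len(values))


def add_one(values: List) -> List:
    """A selection-unaware kernel: processes every row it is given."""
    return [v + 1 for v in values]


def add_one_selective(
    values: Sequence, selection: Optional[Sequence[int]] = None
) -> List:
    """Step 3: the same kernel with the selection vector as an optional
    parameter, so no separate gather/scatter pass is needed."""
    if selection is None:
        return [v + 1 for v in values]
    out: List = [None] * len(values)
    for i in selection:
        out[i] = values[i] + 1
    return out


values = [10, 20, 30, 40, 50]
selection = [1, 3, 4]

naive = eval_with_selection(add_one, values, selection)
aware = add_one_selective(values, selection)
print(naive)  # [None, 21, None, 41, 51]
print(aware)  # [None, 21, None, 41, 51]
```

   Both paths give the same output, which is the point: a kernel can move from 
2 to 3 on its own schedule without callers noticing.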
   
   I think both 2 and 3 could potentially benefit from leveraging special 
attributes of specific data types such as list/string-view, REE and dict, 
though I'm not exactly sure how. I'm now working on an overall framework, so 
maybe things will become clearer when I get there. I could use some 
help/comments from you guys then :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
