felipecrv commented on issue #41094:
URL: https://github.com/apache/arrow/issues/41094#issuecomment-2150787705

   > I think both 2 and 3 could potentially benefit from leveraging special 
attributes of specific data types such as list/string-view, ree and dict, 
though I'm not exactly sure how.
   
   How much leverage you get always depends on the (type, kernel) pair. The 
first kernels that deserve good specialization are `array_take` (the "gather") 
and the scatter.
   
   For types that support out-of-order writing (list-view, dict, 
string-view...) you can scatter incrementally:
   
   ```cpp
   scatter(branch0, sel0, &output);
   scatter(branch1, sel1, &output);
   ...
   scatter(branchn, seln, &output);
   ```
   
   (this assumes all the selection vectors are disjoint)
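
   As a minimal sketch of the incremental scatter above, assuming a plain 
`std::vector<int64_t>` stands in for an output that supports out-of-order 
writes, and that each branch holds one value per selected row. `Scatter` and 
`ScatterBranches` are hypothetical helpers for illustration, not Arrow APIs:

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <vector>

   // Hypothetical helper (not an Arrow API): writes values[i] to
   // (*output)[sel[i]]. `sel` holds output row indices.
   void Scatter(const std::vector<int64_t>& values,
                const std::vector<int64_t>& sel,
                std::vector<int64_t>* output) {
     assert(values.size() == sel.size());
     for (std::size_t i = 0; i < sel.size(); ++i) {
       assert(sel[i] >= 0 && static_cast<std::size_t>(sel[i]) < output->size());
       (*output)[static_cast<std::size_t>(sel[i])] = values[i];
     }
   }

   // Scatters one branch at a time into a pre-sized output. Assumes the
   // selection vectors are disjoint, so the order of the calls does not matter.
   std::vector<int64_t> ScatterBranches(
       const std::vector<std::vector<int64_t>>& branches,
       const std::vector<std::vector<int64_t>>& sels,
       std::size_t length) {
     assert(branches.size() == sels.size());
     std::vector<int64_t> output(length);  // zero-initialized
     for (std::size_t b = 0; b < branches.size(); ++b) {
       Scatter(branches[b], sels[b], &output);
     }
     return output;
   }
   ```

   Each call only needs its own branch and selection vector, so a branch's 
result can be scattered as soon as it has been evaluated.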
   
   For types that need in-order appending, you need all the selection vectors 
up front and have to merge them:
   
   ```cpp
   selections = MinHeapOfSelections{
       {branch0, sel0, 0}, {branch1, sel1, 0}, ..., {branchn, seln, 0}};
   while (!selections.empty()) {
     i = min_selection(selections);
     output_builder.AppendFrom(selections[i].branch, selections[i].pos);
     selections.ExtractMin(/*hint=*/i);
   }
   ```
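
   A self-contained sketch of that merge, assuming `std::priority_queue` as 
the min-heap and a `std::vector<std::string>` with `push_back` standing in for 
an append-only builder. `Branch` and `MergeBranches` are hypothetical names, 
and the sketch assumes each selection vector is sorted, that they are pairwise 
disjoint, and that together they cover every output row:

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <functional>
   #include <queue>
   #include <string>
   #include <utility>
   #include <vector>

   // Hypothetical stand-in for one evaluated branch.
   struct Branch {
     std::vector<std::string> values;  // one value per selected row
     std::vector<int64_t> sel;         // output row indices, sorted ascending
   };

   // K-way merge: always append the value whose output row index is smallest.
   std::vector<std::string> MergeBranches(const std::vector<Branch>& branches) {
     using Entry = std::pair<int64_t, std::size_t>;  // (next output row, branch)
     std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
     std::vector<std::size_t> pos(branches.size(), 0);  // cursor per branch

     for (std::size_t b = 0; b < branches.size(); ++b) {
       assert(branches[b].values.size() == branches[b].sel.size());
       if (!branches[b].sel.empty()) heap.push({branches[b].sel[0], b});
     }

     std::vector<std::string> output;  // plays the role of output_builder
     while (!heap.empty()) {
       const std::size_t b = heap.top().second;
       heap.pop();
       output.push_back(branches[b].values[pos[b]]);
       // Advance this branch and re-insert it if it has rows left.
       if (++pos[b] < branches[b].sel.size()) {
         heap.push({branches[b].sel[pos[b]], b});
       }
     }
     return output;
   }
   ```

   With `n` branches and `N` output rows this costs O(N log n) heap 
operations, and unlike the scatter it cannot start until every branch has been 
evaluated.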
   
   The branches and selection vectors are described in this comment about 
the evaluation of case-when: 
https://github.com/apache/arrow/issues/41453#issuecomment-2150735331
   
   > I'm now working on an overall framework, maybe things will become clearer 
when I get there. I can use some help/comment from you guys then :)
   
   The details will only become clear once you try to draft the big change, 
but for the sake of review, an interesting first step would be to represent 
the `cond` special form in `compute::Expression` with strict/eager evaluation, 
which you can replace later.


