alihan-synnada commented on issue #13620:
URL: https://github.com/apache/datafusion/issues/13620#issuecomment-2514077118

   [PoC Link](https://github.com/synnada-ai/datafusion-upstream/tree/feature/take_with_iter_poc)
 **It requires a patched version of `arrow-buffer` that derives `Clone` for 
`BitIndexIterator`. The benchmark might be misleading because I had trouble 
with lifetimes and ended up using `Box::leak` as a last resort.**
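   
   For readers unfamiliar with the workaround mentioned above, here is a 
minimal sketch of how `Box::leak` can sidestep a lifetime problem. This is not 
the PoC's code: `Mask`, `set_indices` and `leaked_indices` are hypothetical 
names, and a plain `Vec<bool>` stands in for the real bitmap. It only 
illustrates why a leaked allocation may distort a benchmark: the memory is 
never freed.

```rust
/// Hypothetical stand-in for a bitmap whose index iterator borrows from it.
struct Mask {
    bits: Vec<bool>,
}

impl Mask {
    /// Yields the positions of the set bits. The iterator borrows `self`,
    /// so it cannot outlive the `Mask` it was created from.
    fn set_indices(&self) -> impl Iterator<Item = usize> + Clone + '_ {
        self.bits
            .iter()
            .enumerate()
            .filter_map(|(i, &b)| if b { Some(i) } else { None })
    }
}

/// Leaks the mask to obtain a `&'static Mask`, which lets the returned
/// iterator be `'static` as well. The allocation is never reclaimed, so
/// this is only tolerable in throwaway benchmark code.
fn leaked_indices(bits: Vec<bool>) -> impl Iterator<Item = usize> + Clone + 'static {
    let mask: &'static Mask = Box::leak(Box::new(Mask { bits }));
    mask.set_indices()
}
```

   Each call to `leaked_indices` leaks one allocation, so calling it inside a 
benchmark loop grows memory usage for the lifetime of the process, which is 
one way such a setup could skew the measurements.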
   
   I'm not very confident in the way I set up the benchmark, but I think the 
results are promising. Note that the selectivity only goes up to 30 because 
the benchmark gets really slow beyond that point.
   
   Batch size is shown in log2 on the chart (i.e. batch size 13 means 2^13 
*(8192)*). So this approach isn't really useful for the default batch size of 
8192, but anything between 2^4 *(16)* and 2^10 *(1024)* might benefit from it.
   
   
![Figure_1](https://github.com/user-attachments/assets/08ca2ffb-cc57-47a2-b7b3-b7871bf06866)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

