ariel-miculas opened a new pull request, #22558:
URL: https://github.com/apache/datafusion/pull/22558

   ## Which issue does this PR close?
   Loosely related to https://github.com/apache/datafusion/issues/22526
   
   Needs rebasing once https://github.com/apache/datafusion/pull/22416 is merged
   
   ## Rationale for this change
   The documentation for `Vec::split_off` says:
   > Returns a newly allocated vector containing the elements in the range `[at, len)`. After the call, the original vector will be left containing the elements `[0, at)` with its previous capacity unchanged.
   
   This is a poor fit for taking a small slice of `n` elements from the front of a large `Vec`, for two reasons:
   * it allocates memory for the remaining `len - n` elements, which is far more than `n`
   * the `Vec` left holding the first `n` elements keeps its previous capacity, which is very large compared to its length
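
   The two costs above can be seen in a minimal sketch. The helper name `take_front_split_off` is hypothetical and only illustrates the `split_off` pattern; it is not code from this PR.

```rust
/// Hypothetical helper: take the first `n` elements using `split_off`.
/// Panics if `n > v.len()`, like `Vec::split_off` itself.
fn take_front_split_off<T>(v: &mut Vec<T>, n: usize) -> Vec<T> {
    // `rest` is a newly allocated Vec holding the `len - n` trailing elements.
    let rest = v.split_off(n);
    // `v` now holds `[0, n)` but keeps its previous capacity.
    // Swap it out so the caller gets the front and `v` keeps the rest.
    std::mem::replace(v, rest)
}

fn main() {
    let mut v: Vec<u64> = (0..1_000_000).collect();
    let front = take_front_split_off(&mut v, 10);
    // The small front slice still owns the large original buffer.
    println!("front: len={} cap={}", front.len(), front.capacity());
    // The remainder was copied into a fresh allocation.
    println!("rest:  len={} cap={}", v.len(), v.capacity());
}
```

   Here the 10-element result carries the capacity of the original vector, and the remaining elements were copied into a second large allocation.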
   
   `split_vec_min_alloc` still has some issues (see https://github.com/apache/datafusion/issues/22548), but it uses `drain` + `collect` when `n` is small. That is better because it allocates only for the first `n` elements and does not inflate the capacity of the returned `Vec`.
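
   A minimal sketch of the `drain` + `collect` approach for small `n` follows. The helper `take_front_drain` is hypothetical and is not the actual `split_vec_min_alloc` implementation. The standard library does not promise an exact capacity for the collected `Vec`, so the sketch only checks that it is at least `n`.

```rust
/// Hypothetical helper: take the first `n` elements using `drain` + `collect`.
/// `n` is clamped to the vector length so the drain range is always valid.
fn take_front_drain<T>(v: &mut Vec<T>, n: usize) -> Vec<T> {
    let n = n.min(v.len());
    // `drain(..n)` yields the first `n` elements by value; `collect` builds
    // a new Vec sized from the iterator's length, not from `v`'s capacity.
    // When the drain is dropped, the remaining elements are shifted to the
    // front of `v` in place, and `v` keeps its existing allocation.
    v.drain(..n).collect()
}

fn main() {
    let mut v: Vec<u64> = (0..1_000_000).collect();
    let front = take_front_drain(&mut v, 10);
    assert_eq!(front.len(), 10);
    assert!(front.capacity() >= 10);
    println!("front: len={} cap={}", front.len(), front.capacity());
    println!("rest:  len={} cap={}", v.len(), v.capacity());
}
```

   The trade-off is that the remaining elements are moved within the original buffer rather than copied into a new one, so the only new allocation is the small one for the returned elements.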
   
   ## What changes are included in this PR?
   
   
   ## Are these changes tested?
   Yes
   
   ## Are there any user-facing changes?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

