HippoBaro commented on PR #9831:
URL: https://github.com/apache/arrow-rs/pull/9831#issuecomment-4385881175

   Thank you @etseidl and @alamb for pushing back! The cause of the regression
was fairly straightforward: the compact level representation added extra
branching/dispatch on a hot path. At first I thought this was the price to pay
for speeding up the `Uniform` and `Absent` level representations. The extra
dispatch was particularly expensive for list columns, because each non-empty
list row called back into child level generation, reaching `write_leaf` for
primitive children.
   
   It turns out I had overlooked an opportunity to batch writes there as well.
We now batch consecutive non-empty list rows into a single child level write,
then walk the appended repetition levels backwards to mark list-row
boundaries.
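
   To illustrate the batching idea, here is a minimal sketch. It is not the
code in this PR: `write_list_run` is a hypothetical helper, and it assumes a
run of non-null, non-empty list rows described by `offsets`, with the child
write modelled as appending `list_rep_level` once per child value.

```rust
/// Hypothetical helper, for illustration only (not the arrow-rs API).
///
/// `offsets` describes a run of consecutive non-empty list rows: row `i`
/// covers child values `offsets[i]..offsets[i + 1]`.
fn write_list_run(rep_levels: &mut Vec<i16>, offsets: &[usize], list_rep_level: i16) {
    let (start, end) = (offsets[0], offsets[offsets.len() - 1]);

    // One batched "child write" for the whole run. A primitive child is
    // assumed here, so it contributes exactly one level per value.
    rep_levels.extend(std::iter::repeat(list_rep_level).take(end - start));

    // Walk the appended levels backwards, one row at a time. Levels greater
    // than `list_rep_level` would belong to deeper nesting and are skipped,
    // so `remaining` only counts elements of this list.
    let mut cursor = rep_levels.len();
    for row in offsets.windows(2).rev() {
        let mut remaining = row[1] - row[0];
        debug_assert!(remaining > 0, "the run must only contain non-empty rows");
        while remaining > 0 {
            cursor -= 1;
            if rep_levels[cursor] <= list_rep_level {
                remaining -= 1;
            }
        }
        // `cursor` now points at the first element of the row: mark the
        // list-row boundary with the parent's repetition level.
        rep_levels[cursor] = list_rep_level - 1;
    }
}
```

   With `offsets = [0, 2, 3, 6]` and `list_rep_level = 1`, the six appended
levels become `[0, 1, 0, 0, 1, 1]`: a single child write for three rows,
followed by one backward pass that marks where each row starts.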
   
   On my laptop these benchmarks show low single-digit improvements for the 
`list_primitive` cases, but your mileage may vary. 

