RyanMarcus commented on PR #6752:
URL: https://github.com/apache/arrow-rs/pull/6752#issuecomment-2486929566
My benchmark is a worst-case scenario: every run length is 1, so the
average is 1 as well. It is not realistic, but it illustrates the worst
case.
If we increase all run lengths to 10 (which is the average in my
application, at least) while keeping the logical data size the same, the
results are:
```
With PrimitiveBuilder:
cast run end to flat time: [11.740 ms 11.778 ms 11.818 ms]
With MutableArrayData:
cast run end to flat time: [21.837 ms 21.917 ms 22.000 ms]
```
Both approaches get faster, but the relative gap is larger.
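To make the two benchmark setups concrete, here is a minimal std-only sketch of what "flattening" run-end encoded data means. It does not use the arrow-rs API: `flatten_run_ends` and `uniform_run_ends` are hypothetical helpers, and it assumes the run ends are strictly increasing and that the logical length is a multiple of the run length.

```rust
/// Expand run-end encoded data into a flat vector.
/// `run_ends[i]` is the exclusive logical end index of run `i`, and
/// `values[i]` is the value repeated over that run.
/// Assumption: `run_ends` is strictly increasing.
fn flatten_run_ends<T: Copy>(run_ends: &[usize], values: &[T]) -> Vec<T> {
    assert_eq!(run_ends.len(), values.len());
    let logical_len = run_ends.last().copied().unwrap_or(0);
    let mut out = Vec::with_capacity(logical_len);
    let mut start = 0;
    for (&end, &value) in run_ends.iter().zip(values) {
        // Each run contributes `end - start` copies of its value.
        out.extend(std::iter::repeat(value).take(end - start));
        start = end;
    }
    out
}

/// Hypothetical helper: run ends for a fixed run length.
/// Assumption: `logical_len` is a multiple of `run_len`.
fn uniform_run_ends(logical_len: usize, run_len: usize) -> Vec<usize> {
    (1..=logical_len / run_len).map(|i| i * run_len).collect()
}
```

With the logical length held fixed, run length 1 gives one run per output element, while run length 10 gives ten times fewer runs, so the per-run overhead of either approach is paid ten times less often.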
The current version hits the compromise you mentioned: the specialized
kernel is used for {i/u/f}{8/16/32/64}, and the interpretation-powered one is
used for the rest.
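As a rough illustration of that compromise, the dispatch could look something like the sketch below. `ValueType`, `Kernel`, and `choose_kernel` are hypothetical stand-ins, not the types or functions in the PR, and the exact set of types routed to the specialized kernel is an assumption based on the shorthand above.

```rust
/// Hypothetical stand-in for the value type of a run-end encoded array.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueType {
    Int8, Int16, Int32, Int64,
    UInt8, UInt16, UInt32, UInt64,
    Float32, Float64,
    Utf8,
    Other,
}

/// Which code path handles the cast (names are illustrative only).
#[derive(Debug, PartialEq)]
enum Kernel {
    /// Typed path that appends values through a primitive builder.
    Specialized,
    /// Generic path based on `MutableArrayData`.
    Generic,
}

/// Pick the specialized kernel for primitive numeric types and fall back
/// to the generic one otherwise. The list here is an assumption.
fn choose_kernel(value_type: ValueType) -> Kernel {
    use ValueType::*;
    match value_type {
        Int8 | Int16 | Int32 | Int64
        | UInt8 | UInt16 | UInt32 | UInt64
        | Float32 | Float64 => Kernel::Specialized,
        _ => Kernel::Generic,
    }
}
```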
I am not fully confident that I modified all of the `MutableArrayData`
internals correctly, but the tests at least pass. If you want, I can swap back
to the "dummy" version I proposed earlier, since the specialized kernel should
hit the most common cases.