Antoine Pitrou created ARROW-10026:
--------------------------------------
Summary: [C++] Improve kernel performance on small batches
Key: ARROW-10026
URL: https://issues.apache.org/jira/browse/ARROW-10026
Project: Apache Arrow
Issue Type: Task
Components: C++
Reporter: Antoine Pitrou
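It seems that invoking some kernels on smallish batches incurs significant per-call overhead. In the benchmark below, Add on Int32 reports bytes_per_second of about 11G/s on 32768-byte inputs, versus about 26G/s on 524288-byte inputs: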
{code}
ArrayArrayKernel<Add, Int32Type>/32768/100 2860 ns 2859 ns 245195 bytes_per_second=10.6727G/s items_per_second=2.86494G/s null_percent=1 size=32.768k
ArrayArrayKernel<Add, Int32Type>/32768/0 2752 ns 2751 ns 249316 bytes_per_second=11.093G/s items_per_second=2.97775G/s null_percent=0 size=32.768k
ArrayArrayKernel<Add, Int32Type>/524288/100 18633 ns 18630 ns 36548 bytes_per_second=26.2097G/s items_per_second=7.03561G/s null_percent=1 size=524.288k
ArrayArrayKernel<Add, Int32Type>/524288/0 18260 ns 18257 ns 38245 bytes_per_second=26.7451G/s items_per_second=7.17933G/s null_percent=0 size=524.288k
{code}
We should investigate and try to lighten the overhead.