[ https://issues.apache.org/jira/browse/ARROW-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198197#comment-17198197 ]
Frank Du commented on ARROW-10026:
----------------------------------
It seems that most batches for ArrayData are now exact copies of the current
array. Would it be worth adding a check before calling Slice?
According to perf, Slice no longer shows up for scalar-arithmetic-benchmark,
and the numbers improved slightly (3%).
{code:java}
diff --git a/cpp/src/arrow/compute/exec.cc b/cpp/src/arrow/compute/exec.cc
index 435d7dd74..7c33a7198 100644
--- a/cpp/src/arrow/compute/exec.cc
+++ b/cpp/src/arrow/compute/exec.cc
@@ -171,7 +171,12 @@ bool ExecBatchIterator::Next(ExecBatch* batch) {
     if (args_[i].is_scalar()) {
       batch->values[i] = args_[i].scalar();
     } else if (args_[i].is_array()) {
-      batch->values[i] = args_[i].array()->Slice(position_, iteration_size);
+      if (position_ || iteration_size != length_) {
+        batch->values[i] = args_[i].array()->Slice(position_, iteration_size);
+      } else {
+        batch->values[i] = args_[i].array();
+      }
     } else {
       const ChunkedArray& carr = *args_[i].chunked_array();
       const auto& chunk = carr.chunk(chunk_indexes_[i]);
{code}
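To illustrate the idea outside of Arrow, here is a minimal, self-contained sketch of the "skip Slice when the window covers the whole array" check. FakeArrayData, MakeFakeArray and SliceIfNeeded are hypothetical names made up for this example, not Arrow APIs, and the sketch compares against the array's own length, whereas the patch above compares against the iterator's length_.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for arrow::ArrayData (NOT the real class):
// a shared buffer plus an (offset, length) window onto it.
struct FakeArrayData {
  std::shared_ptr<const std::vector<int32_t>> buffer;
  int64_t offset = 0;
  int64_t length = 0;

  // Zero-copy slice: the buffer is shared, but a new descriptor object
  // is still allocated on every call.
  std::shared_ptr<FakeArrayData> Slice(int64_t off, int64_t len) const {
    assert(off >= 0 && len >= 0 && off + len <= length);
    auto out = std::make_shared<FakeArrayData>(*this);
    out->offset = offset + off;
    out->length = len;
    return out;
  }
};

// Hypothetical helper: builds a FakeArrayData that owns `values`.
inline std::shared_ptr<FakeArrayData> MakeFakeArray(std::vector<int32_t> values) {
  auto out = std::make_shared<FakeArrayData>();
  out->length = static_cast<int64_t>(values.size());
  out->buffer = std::make_shared<const std::vector<int32_t>>(std::move(values));
  return out;
}

// Hypothetical helper mirroring the proposed check: return the input itself
// when the requested window is the whole array, otherwise call Slice().
inline std::shared_ptr<FakeArrayData> SliceIfNeeded(
    const std::shared_ptr<FakeArrayData>& arr, int64_t position, int64_t size) {
  if (position == 0 && size == arr->length) {
    return arr;  // fast path: no new descriptor is allocated
  }
  return arr->Slice(position, size);
}
```

With this sketch, SliceIfNeeded(arr, 0, arr->length) hands back the same pointer as arr, while any narrower window still goes through Slice() and shares the underlying buffer.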
> [C++] Improve kernel performance on small batches
> -------------------------------------------------
>
> Key: ARROW-10026
> URL: https://issues.apache.org/jira/browse/ARROW-10026
> Project: Apache Arrow
> Issue Type: Task
> Components: C++
> Reporter: Antoine Pitrou
> Priority: Major
>
> It seems that invoking some kernels on smallish batches has quite an overhead:
> {code}
> ArrayArrayKernel<Add, Int32Type>/32768/100    2860 ns   2859 ns  245195 bytes_per_second=10.6727G/s items_per_second=2.86494G/s null_percent=1 size=32.768k
> ArrayArrayKernel<Add, Int32Type>/32768/0      2752 ns   2751 ns  249316 bytes_per_second=11.093G/s items_per_second=2.97775G/s null_percent=0 size=32.768k
> ArrayArrayKernel<Add, Int32Type>/524288/100  18633 ns  18630 ns   36548 bytes_per_second=26.2097G/s items_per_second=7.03561G/s null_percent=1 size=524.288k
> ArrayArrayKernel<Add, Int32Type>/524288/0    18260 ns  18257 ns   38245 bytes_per_second=26.7451G/s items_per_second=7.17933G/s null_percent=0 size=524.288k
> {code}
> We should investigate and try to lighten the overhead.