AlenkaF commented on code in PR #40358:
URL: https://github.com/apache/arrow/pull/40358#discussion_r1533872400
##########
cpp/src/arrow/tensor_benchmark.cc:
##########
@@ -0,0 +1,72 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "benchmark/benchmark.h"
+
+#include "arrow/record_batch.h"
+#include "arrow/testing/gtest_util.h"
+#include "arrow/testing/random.h"
+#include "arrow/type.h"
+#include "arrow/util/benchmark_util.h"
+
+namespace arrow {
+
+template <typename ValueType>
+static void BatchToTensorSimple(benchmark::State& state) {
+  RegressionArgs args(state);
+  std::shared_ptr<DataType> ty = TypeTraits<ValueType>::type_singleton();
+
+  const int64_t kNumRows = args.size;

Review Comment:
   I guess not. What I understand, at least, is that `items_per_second` should be approximately `bytes_per_second` divided by the size of the type. Joris suggested some things I could try to debug this, but I have not found anything I can grasp. I am also not sure whether it makes a difference if I only use `state.SetBytesProcessed` without `state.SetItemsProcessed`.
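   For context, here is a minimal standalone sketch (not the code in this PR; `CounterSketch`, `kNumColumns`, and the empty timing loop are invented for illustration) of how the two counters are usually reported together with Google Benchmark. Since both are computed from the same totals, the reported `items_per_second` should come out as `bytes_per_second` divided by the element size:

   ```cpp
   #include <cstdint>

   #include "arrow/type_traits.h"
   #include "benchmark/benchmark.h"

   // Hypothetical sketch, not the benchmark in this PR: CounterSketch, kNumColumns,
   // and the empty timing loop are placeholders; only the counter bookkeeping at
   // the end is the point.
   template <typename ValueType>
   static void CounterSketch(benchmark::State& state) {
     using CType = typename arrow::TypeTraits<ValueType>::CType;
     const int64_t kNumRows = state.range(0);
     const int64_t kNumColumns = 10;

     for (auto _ : state) {
       // Stand-in for the conversion under test (e.g. building the tensor).
       benchmark::DoNotOptimize(kNumRows);
     }

     // Both counters are derived from the same total element count, so the
     // reported items_per_second should equal bytes_per_second / sizeof(CType).
     const int64_t items = kNumRows * kNumColumns;
     state.SetItemsProcessed(state.iterations() * items);
     state.SetBytesProcessed(state.iterations() * items *
                             static_cast<int64_t>(sizeof(CType)));
   }

   BENCHMARK_TEMPLATE(CounterSketch, arrow::UInt8Type)->Arg(1 << 16);
   BENCHMARK_MAIN();  // standalone entry point for this sketch only
   ```

   As far as I understand, calling only `state.SetBytesProcessed` simply drops the `items_per_second` counter from the output; it should not change the bytes figure.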
   It also looks OK if I just leave both of them out:
   ```
   Running /var/folders/gw/q7wqd4tx18n_9t4kbkd0bj1m0000gn/T/arrow-archery-vd706e0e/WORKSPACE/build/release/arrow-tensor-benchmark
   Run on (8 X 24 MHz CPU s)
   CPU Caches:
     L1 Data 64 KiB
     L1 Instruction 128 KiB
     L2 Unified 4096 KiB (x8)
   Load Average: 15.15, 15.26, 13.19
   -----------------------------------------------------------------------------------------------------
   Benchmark                                         Time             CPU   Iterations UserCounters...
   -----------------------------------------------------------------------------------------------------
   BatchToTensorSimple<UInt8Type>/65536            429847 ns       429439 ns       1582 bytes_per_second=145.539Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<UInt8Type>/4194304        56283753 ns     44952231 ns         13 bytes_per_second=88.9833Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<UInt16Type>/65536           470726 ns       462170 ns       1607 bytes_per_second=135.232Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<UInt16Type>/4194304       44393589 ns     37141214 ns         14 bytes_per_second=107.697Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<UInt32Type>/65536           440997 ns       439951 ns       1260 bytes_per_second=142.061Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<UInt32Type>/4194304       43955912 ns     36447556 ns         18 bytes_per_second=109.747Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<UInt64Type>/65536           432952 ns       431213 ns       1369 bytes_per_second=144.94Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<UInt64Type>/4194304       40377762 ns     36827529 ns         17 bytes_per_second=108.614Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<Int8Type>/65536             583566 ns       561105 ns       1667 bytes_per_second=111.387Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<Int8Type>/4194304         69477871 ns     51189900 ns         10 bytes_per_second=78.1404Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<Int16Type>/65536            466828 ns       460938 ns       1379 bytes_per_second=135.593Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<Int16Type>/4194304        53699115 ns     43646833 ns         12 bytes_per_second=91.6447Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<Int32Type>/65536            510174 ns       489199 ns       1380 bytes_per_second=127.76Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<Int32Type>/4194304        59453215 ns     43936000 ns         13 bytes_per_second=91.0415Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<Int64Type>/65536            449931 ns       446273 ns       1581 bytes_per_second=140.049Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<Int64Type>/4194304        44797259 ns     38353000 ns         19 bytes_per_second=104.294Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<HalfFloatType>/65536        501073 ns       470337 ns       1660 bytes_per_second=132.884Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<HalfFloatType>/4194304    57234822 ns     40693467 ns         15 bytes_per_second=98.2959Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<FloatType>/65536            420881 ns       419577 ns       1389 bytes_per_second=148.96Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<FloatType>/4194304        41806079 ns     37133778 ns         18 bytes_per_second=107.719Mi/s null_percent=0 size=4.1943M
   BatchToTensorSimple<DoubleType>/65536           424610 ns       423430 ns       1346 bytes_per_second=147.604Mi/s null_percent=0 size=65.536k
   BatchToTensorSimple<DoubleType>/4194304       37983824 ns     35989222 ns         18 bytes_per_second=111.144Mi/s null_percent=0 size=4.1943M
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
