llama90 opened a new issue, #38569: URL: https://github.com/apache/arrow/issues/38569
### Describe the enhancement requested

Refactor random data generation to use [random.h](https://github.com/apache/arrow/blob/main/cpp/src/arrow/testing/random.h) instead of [generate_data.h](https://github.com/apache/arrow/blob/main/cpp/src/gandiva/tests/generate_data.h).

This addresses the following issue:
* #38525

**Improvement**
* Code reusability
* Facilitates additional tests for various data types.

**Remaining tasks**

The following issues still need to be resolved.
- [ ] Large Decimal for `DecimalAdd2Large` and `DecimalAdd3Large`

**Question**
* Some metric values (Time, CPU) in the benchmarks vary between the as-is and to-be runs below. I am not sure whether this is acceptable.

<details><summary>as-is</summary>

```
Unable to determine clock rate from sysctl: hw.cpufrequency: No such file or directory
This does not affect benchmark measurements, only the metadata output.
2023-11-03T20:27:46+09:00
Running /Users/lama/workspace/arrow-build-test/cpp/cmake-build-debug/debug/gandiva-micro-benchmarks
Run on (10 X 24.0942 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB
  L1 Instruction 128 KiB
  L2 Unified 4096 KiB (x10)
Load Average: 4.01, 3.35, 3.58
***WARNING*** Library was built as DEBUG. Timings may be affected.
/Users/lama/workspace/arrow-build-test/cpp/src/gandiva/cache.cc:50: Creating gandiva cache with capacity of 500
/Users/lama/workspace/arrow-build-test/cpp/src/gandiva/engine.cc:129: Detected CPU Name : apple-m1
/Users/lama/workspace/arrow-build-test/cpp/src/gandiva/engine.cc:130: Detected CPU Features:
-----------------------------------------------------------------------------------------
Benchmark                                              Time          CPU Iterations
-----------------------------------------------------------------------------------------
TimedTestAdd3/min_time:1.000                         2740 us      2730 us        508
TimedTestBigNested/min_time:1.000                    9750 us      9685 us        149
TimedTestExtractYear/min_time:1.000                  8294 us      8210 us        172
TimedTestFilterAdd2/min_time:1.000                   4181 us      4172 us        334
TimedTestFilterLike/min_time:1.000                  13706 us     13669 us        102
TimedTestCastFloatFromString/min_time:1.000         71072 us     70941 us         20
TimedTestCastIntFromString/min_time:1.000           48323 us     42640 us         35
TimedTestAllocs/min_time:1.000                     140487 us    137767 us         10
TimedTestOutputStringAllocs/min_time:1.000         228228 us    226211 us          6
TimedTestMultiOr/min_time:1.000                     12905 us     12853 us        102
TimedTestInExpr/min_time:1.000                      23907 us     23854 us         58
DecimalAdd2Fast/min_time:1.000                       3868 us      3848 us        370
DecimalAdd2LeadingZeroes/min_time:1.000              7332 us      7252 us        195
DecimalAdd2LeadingZeroesWithDiv/min_time:1.000      26231 us     26121 us         54
DecimalAdd2Large/min_time:1.000                    126812 us    126515 us         11
DecimalAdd3Fast/min_time:1.000                       4282 us      4266 us        334
DecimalAdd3LeadingZeroes/min_time:1.000             10651 us     10635 us        131
DecimalAdd3LeadingZeroesWithDiv/min_time:1.000      64148 us     63833 us         22
DecimalAdd3Large/min_time:1.000                    253900 us    251054 us          6
```

</details>

<details><summary>to-be</summary>

```
Unable to determine clock rate from sysctl: hw.cpufrequency: No such file or directory
This does not affect benchmark measurements, only the metadata output.
2023-11-03T20:28:41+09:00
Running /Users/lama/workspace/arrow-latest/cpp/cmake-build-debug/debug/gandiva-micro-benchmarks
Run on (10 X 24.0028 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB
  L1 Instruction 128 KiB
  L2 Unified 4096 KiB (x10)
Load Average: 4.57, 3.62, 3.67
***WARNING*** Library was built as DEBUG. Timings may be affected.
/Users/lama/workspace/arrow-latest/cpp/src/gandiva/cache.cc:50: Creating gandiva cache with capacity of 500
/Users/lama/workspace/arrow-latest/cpp/src/gandiva/engine.cc:129: Detected CPU Name : apple-m1
/Users/lama/workspace/arrow-latest/cpp/src/gandiva/engine.cc:130: Detected CPU Features:
-----------------------------------------------------------------------------------------
Benchmark                                              Time          CPU Iterations
-----------------------------------------------------------------------------------------
TimedTestAdd3/min_time:1.000                         3232 us      2958 us        487
TimedTestBigNested/min_time:1.000                    6359 us      6327 us        217
TimedTestExtractYear/min_time:1.000                  8252 us      8228 us        172
TimedTestFilterAdd2/min_time:1.000                   5819 us      5810 us        241
TimedTestFilterLike/min_time:1.000                  14109 us     14092 us         99
TimedTestCastFloatFromString/min_time:1.000         79837 us     79717 us         17
TimedTestCastIntFromString/min_time:1.000           45557 us     45439 us         31
TimedTestAllocs/min_time:1.000                     243130 us    242760 us          6
TimedTestOutputStringAllocs/min_time:1.000         332357 us    331799 us          4
TimedTestMultiOr/min_time:1.000                     11269 us     10963 us        118
TimedTestInExpr/min_time:1.000                      24069 us     23862 us         57
DecimalAdd2Fast/min_time:1.000                       3771 us      3757 us        371
DecimalAdd2LeadingZeroes/min_time:1.000             40692 us     40636 us         34
DecimalAdd2LeadingZeroesWithDiv/min_time:1.000     110656 us    110515 us         13
DecimalAdd2Large/min_time:1.000                    112098 us    111152 us         13
DecimalAdd3Fast/min_time:1.000                       4151 us      4137 us        324
DecimalAdd3LeadingZeroes/min_time:1.000             78732 us     78590 us         18
DecimalAdd3LeadingZeroesWithDiv/min_time:1.000     236894 us    236280 us          6
DecimalAdd3Large/min_time:1.000                    235912 us    235529 us          6
```

</details>

### Component(s)

C++ - Gandiva
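
To put numbers on the variation raised in the question, one option is to compute the to-be/as-is time ratio per benchmark and flag the rows that move by more than some factor. The following is only a minimal sketch: `BenchmarkPair`, `SlowdownRatio`, `FlagLargeChanges`, `SampleRuns`, and the 1.5 cutoff are hypothetical names and values chosen for illustration, not part of Arrow or Gandiva. The sample rows are copied from the Time column of the as-is and to-be runs in this issue.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One benchmark's wall-clock time (microseconds) before and after a change.
// Hypothetical helper type, not part of Arrow or Gandiva.
struct BenchmarkPair {
  std::string name;
  double as_is_us;
  double to_be_us;
};

// Ratio of the new time to the old time; values above 1.0 mean "slower".
double SlowdownRatio(const BenchmarkPair& b) {
  assert(b.as_is_us > 0.0);
  return b.to_be_us / b.as_is_us;
}

// Returns the names of benchmarks whose time changed by more than a factor of
// `threshold` in either direction (threshold is an assumed, arbitrary cutoff).
std::vector<std::string> FlagLargeChanges(const std::vector<BenchmarkPair>& runs,
                                          double threshold) {
  std::vector<std::string> flagged;
  for (const auto& b : runs) {
    const double ratio = SlowdownRatio(b);
    if (ratio > threshold || ratio < 1.0 / threshold) {
      flagged.push_back(b.name);
    }
  }
  return flagged;
}

// A few rows from the Time column of the as-is and to-be runs in this issue.
std::vector<BenchmarkPair> SampleRuns() {
  return {
      {"TimedTestExtractYear", 8294.0, 8252.0},
      {"DecimalAdd2Fast", 3868.0, 3771.0},
      {"DecimalAdd2LeadingZeroes", 7332.0, 40692.0},
      {"DecimalAdd3LeadingZeroesWithDiv", 64148.0, 236894.0},
  };
}
```

Calling `FlagLargeChanges(SampleRuns(), 1.5)` on these rows flags `DecimalAdd2LeadingZeroes` and `DecimalAdd3LeadingZeroesWithDiv`, while the other two stay within a few percent. Both runs carry the "Library was built as DEBUG" warning, so the absolute timings are likely noisy; the large ratios on the decimal benchmarks may instead reflect a difference in the generated input values.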
