niyue commented on PR #39098:
URL: https://github.com/apache/arrow/pull/39098#issuecomment-1857862643
I did some more experiments using LLVM's new `JITLink` linker [1], which was
initially introduced in LLVM 9.0, and can be used together with LLJIT:
* when enabled, it helps to address the ASAN issue on my Mac. However, it
doesn't address the `atexit` ASAN symbol issue in CI [2]
* I added its support to Gandiva in this PR:
  * it is turned off by default since it is less mature than the
    `RuntimeDyld` JIT linker, and can be turned on by setting an environment
    variable
  * to keep the change minimal (around 40 lines of code), the current
    implementation uses some APIs from LLVM 14, so for lower versions of LLVM,
    JITLink won't be available
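
As a rough illustration of the opt-in and version gate described above (not the code in this PR), the selection logic could be sketched as follows. The environment variable name `EXAMPLE_GANDIVA_USE_JIT_LINK`, the enum, and both helper functions are hypothetical names made up for this sketch; real code would take the LLVM major version from `LLVM_VERSION_MAJOR`.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical variable name, used only for this sketch. The variable that
// the PR actually reads is not stated in this comment.
constexpr const char* kUseJitLinkEnvVar = "EXAMPLE_GANDIVA_USE_JIT_LINK";

// Which JIT linker the engine should set up.
enum class JitLinker { kRuntimeDyld, kJitLink };

// Returns true when the named environment variable is set to a value other
// than "", "0", "false" or "OFF". The accepted spellings are an assumption.
inline bool EnvFlagEnabled(const char* name) {
  assert(name != nullptr);
  const char* value = std::getenv(name);
  if (value == nullptr) return false;
  const std::string v(value);
  return !v.empty() && v != "0" && v != "false" && v != "OFF";
}

// JITLink is used only when it was requested and the LLVM major version is
// at least 14; otherwise the default RuntimeDyld linker is kept.
inline JitLinker ChooseLinker(int llvm_major, bool jit_link_requested) {
  if (jit_link_requested && llvm_major >= 14) return JitLinker::kJitLink;
  return JitLinker::kRuntimeDyld;
}
```

A caller would combine the two, for example `ChooseLinker(LLVM_VERSION_MAJOR, EnvFlagEnabled(kUseJitLinkEnvVar))`, so that builds against older LLVM versions silently fall back to `RuntimeDyld`.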
I added a new micro benchmark to verify expression compilation performance.
Although `JITLink` is claimed to have some advantages [1][3], I don't observe
any performance improvement in my limited testing. Neither expression
compilation nor expression evaluation is faster when `JITLink` is used; both
run at roughly the same speed as with the `RuntimeDyld` JIT linker.
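
For anyone who wants to repeat this kind of comparison outside the Gandiva benchmark suite, a minimal timing helper could look like the sketch below. It is not the benchmark added in this PR (that one is part of `gandiva-micro-benchmarks`); `MeanMicrosPerCall` and `RelativeDifference` are hypothetical helpers, and the callable passed in stands for whatever compilation or evaluation step is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `work` `iterations` times and returns the mean wall-clock time per
// call in microseconds, measured with a monotonic clock.
inline double MeanMicrosPerCall(const std::function<void()>& work,
                                int iterations) {
  assert(iterations > 0);
  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();
  for (int i = 0; i < iterations; ++i) {
    work();
  }
  const auto elapsed = Clock::now() - start;
  return std::chrono::duration<double, std::micro>(elapsed).count() /
         iterations;
}

// Relative difference of `candidate` against `baseline`; for example, 0.10
// means the candidate took 10% longer than the baseline.
inline double RelativeDifference(double candidate, double baseline) {
  assert(baseline > 0.0);
  return (candidate - baseline) / baseline;
}
```

Running the same workload once per linker and feeding the two means into `RelativeDifference` gives a single number to compare; a value close to zero would match the "roughly the same speed" observation.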
# Links
[1] https://llvm.org/docs/JITLink.html
[2] https://github.com/apache/arrow/actions/runs/7221391997/job/19676260548?pr=39098
[3] https://www.phoronix.com/news/LLVM-Lands-JITLink
# Microbenchmark
Micro benchmark results with JITLink enabled. The first benchmark,
`TimedTestExprCompilation`, is the newly added one for measuring expression
compilation speed.
```
Running ./release/gandiva-micro-benchmarks
Run on (10 X 24.0505 MHz CPU s)
CPU Caches:
L1 Data 64 KiB
L1 Instruction 128 KiB
L2 Unified 4096 KiB (x10)
Load Average: 3.87, 3.56, 5.22
/Users/ss/dev/projects/opensource/arrow/cpp/src/gandiva/cache.cc:50: Creating gandiva cache with capacity of 500
/Users/ss/dev/projects/opensource/arrow/cpp/src/gandiva/engine.cc:247: Detected CPU Name : cyclone
/Users/ss/dev/projects/opensource/arrow/cpp/src/gandiva/engine.cc:248: Detected CPU Features: []
-----------------------------------------------------------------------------------------
Benchmark                                                 Time            CPU  Iterations
-----------------------------------------------------------------------------------------
TimedTestExprCompilation/min_time:1.000               20201 us       20197 us          65
TimedTestAdd3/min_time:1.000                           1120 us        1117 us        1247
TimedTestBigNested/min_time:1.000                      7872 us        7872 us         177
TimedTestExtractYear/min_time:1.000                    7249 us        7242 us         194
TimedTestFilterAdd2/min_time:1.000                     2836 us        2836 us         494
TimedTestFilterLike/min_time:1.000                    12290 us       12289 us         114
TimedTestCastFloatFromString/min_time:1.000           14320 us       14311 us          99
TimedTestCastIntFromString/min_time:1.000             14288 us       14282 us          99
TimedTestAllocs/min_time:1.000                        33997 us       33993 us          41
TimedTestOutputStringAllocs/min_time:1.000            51085 us       51028 us          28
TimedTestMultiOr/min_time:1.000                       12630 us       12590 us         113
TimedTestInExpr/min_time:1.000                         2515 us        2514 us         557
DecimalAdd2Fast/min_time:1.000                         2058 us        2057 us         695
DecimalAdd2LeadingZeroes/min_time:1.000                5158 us        5155 us         272
DecimalAdd2LeadingZeroesWithDiv/min_time:1.000        24319 us       24296 us          58
DecimalAdd2Large/min_time:1.000                      115057 us      115046 us          12
DecimalAdd3Fast/min_time:1.000                         2290 us        2289 us         612
DecimalAdd3LeadingZeroes/min_time:1.000                8781 us        8780 us         157
DecimalAdd3LeadingZeroesWithDiv/min_time:1.000        61239 us       61232 us          23
DecimalAdd3Large/min_time:1.000                      237387 us      237198 us           6
```