pitrou commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-637533180
Here are some benchmarks on my machine, with gcc 7.5:
* `ARROW_SIMD_LEVEL=AVX2`:
```
BM_DefinitionLevelsToBitmapRepeatedAllMissing    934 ns   934 ns  3004804 bytes_per_second=2.04274G/s
BM_DefinitionLevelsToBitmapRepeatedAllPresent   1327 ns  1327 ns  1908147 bytes_per_second=1.4377G/s
BM_DefinitionLevelsToBitmapRepeatedMostPresent  1725 ns  1725 ns  1649108 bytes_per_second=1.10569G/s
```
* `ARROW_SIMD_LEVEL=SSE4_2`:
```
BM_DefinitionLevelsToBitmapRepeatedAllMissing   1384 ns  1384 ns  2029778 bytes_per_second=1.37806G/s
BM_DefinitionLevelsToBitmapRepeatedAllPresent   2054 ns  2053 ns  1247469 bytes_per_second=951.163M/s
BM_DefinitionLevelsToBitmapRepeatedMostPresent  2124 ns  2124 ns  1303578 bytes_per_second=919.503M/s
```
* `ARROW_SIMD_LEVEL=NONE`:
```
BM_DefinitionLevelsToBitmapRepeatedAllMissing    925 ns   925 ns  3025393 bytes_per_second=2.06245G/s
BM_DefinitionLevelsToBitmapRepeatedAllPresent   1505 ns  1504 ns  1881938 bytes_per_second=1.26785G/s
BM_DefinitionLevelsToBitmapRepeatedMostPresent  1725 ns  1725 ns  1598820 bytes_per_second=1.10599G/s
```
So it seems that gcc's SSE4.2 auto-vectorization may lead to suboptimal
code (the `SSE4_2` build is slower than both the `AVX2` and `NONE` builds
on all three benchmarks), but that's not a problem for this PR.
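
To put rough numbers on that, here is a small sketch that turns the wall-clock times from the benchmark output above into ratios against the `NONE` build. The names `times_ns` and `relative_to_baseline` are made up for illustration and are not part of Arrow or the benchmark harness; the figures are single runs on one machine, so treat the ratios as indicative only.

```python
# Wall-clock times in ns, copied from the benchmark output above.
# Keys are the ARROW_SIMD_LEVEL values and the benchmark name suffixes.
times_ns = {
    "AVX2":   {"AllMissing": 934,  "AllPresent": 1327, "MostPresent": 1725},
    "SSE4_2": {"AllMissing": 1384, "AllPresent": 2054, "MostPresent": 2124},
    "NONE":   {"AllMissing": 925,  "AllPresent": 1505, "MostPresent": 1725},
}


def relative_to_baseline(times, baseline="NONE"):
    """Return per-benchmark time ratios versus the baseline level.

    A ratio above 1.0 means that level is slower than the baseline.
    (Hypothetical helper, written only for this comparison.)
    """
    base = times[baseline]
    return {
        level: {case: t / base[case] for case, t in cases.items()}
        for level, cases in times.items()
    }


ratios = relative_to_baseline(times_ns)
for level, cases in ratios.items():
    row = "  ".join(f"{case}={ratio:.2f}x" for case, ratio in cases.items())
    print(f"{level:7s} {row}")
```

With these inputs, every `SSE4_2` ratio comes out above 1.0, while `AVX2` is at or below roughly 1.01 on each benchmark.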