mikemccand commented on PR #15779:
URL: https://github.com/apache/lucene/pull/15779#issuecomment-3997216238
On the nightly benchy box (`beast3`, Ryzen Threadripper 3990X), before:
```
38092046 terms loaded
done shuffling
Inserted 38092046 terms in 20841.91 ms, unique term 38092046
Inserted 38092046 terms in 21000.09 ms, unique term 38092046
Inserted 38092046 terms in 21635.49 ms, unique term 38092046
Inserted 38092046 terms in 20560.37 ms, unique term 38092046
Performance counter stats for '/usr/lib/jvm/java-25-openjdk/bin/java -cp
.:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT
/lucenedata/enwiki/\
allterms-20110115.txt':
0 context-switches:u # 0.0 cs/sec
cs_per_second
0 cpu-migrations:u # 0.0
migrations/sec migrations_per_second
554,477 page-faults:u # 3849.9
faults/sec page_faults_per_second
144,022.41 msec task-clock:u # 1.6 CPUs
CPUs_utilized
4,099,191,298 L1-dcache-load-misses:u # 3.5 %
l1d_miss_rate (20.04%)
17,137,351 L1-icache-load-misses:u # 0.2 %
l1i_miss_rate (20.02%)
1,287,102,957 branch-misses:u # 3.0 %
branch_miss_rate (20.00%)
42,486,918,231 branches:u # 295.0 M/sec
branch_frequency (20.00%)
556,536,771,158 cpu-cycles:u # 3.9 GHz
cycles_frequency (30.03%)
246,271,809,438 instructions:u # 0.4
instructions insn_per_cycle (30.05%)
18,978,848,529 stalled-cycles-frontend:u # 0.03
frontend_cycles_idle (20.04%)
1,060,428,248 dTLB-loads:u # 27.1 %
dtlb_miss_rate (20.08%)
245,693 iTLB-loads:u # 132.8 %
itlb_miss_rate (20.06%)
90.699653506 seconds time elapsed
132.112774000 seconds user
12.021535000 seconds sys
```
After:
```
38092046 terms loaded
done shuffling
Inserted 38092046 terms in 11263.41 ms, unique term 38092046
Inserted 38092046 terms in 12925.52 ms, unique term 38092046
Inserted 38092046 terms in 12718.04 ms, unique term 38092046
Inserted 38092046 terms in 12635.16 ms, unique term 38092046
Performance counter stats for '/usr/lib/jvm/java-25-openjdk/bin/java -cp
.:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT
/lucenedata/enwiki/\
allterms-20110115.txt':
0 context-switches:u # 0.0 cs/sec
cs_per_second
0 cpu-migrations:u # 0.0
migrations/sec migrations_per_second
41,869 page-faults:u # 365.2
faults/sec page_faults_per_second
114,640.36 msec task-clock:u # 2.1 CPUs
CPUs_utilized
3,491,553,643 L1-dcache-load-misses:u # 2.7 %
l1d_miss_rate (20.06%)
15,855,892 L1-icache-load-misses:u # 0.2 %
l1i_miss_rate (20.08%)
1,271,522,632 branch-misses:u # 2.6 %
branch_miss_rate (20.09%)
48,708,599,021 branches:u # 424.9 M/sec
branch_frequency (20.08%)
430,787,025,878 cpu-cycles:u # 3.8 GHz
cycles_frequency (30.10%)
285,622,442,684 instructions:u # 0.7
instructions insn_per_cycle (30.06%)
18,208,623,146 stalled-cycles-frontend:u # 0.04
frontend_cycles_idle (20.05%)
621,501,564 dTLB-loads:u # 3.9 %
dtlb_miss_rate (20.04%)
272,769 iTLB-loads:u # 53.7 %
itlb_miss_rate (20.03%)
55.867613269 seconds time elapsed
102.551131000 seconds user
12.034597000 seconds sys
```
Nice! Note the amazing drop in `dtlb_miss_rate` (27.1% -> 3.9%). The dTLB is,
I think, a cache the CPU keeps close for mapping virtual -> physical
addresses. So the better locality pays off.
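
To put rough numbers on the two logs, here is a minimal sketch that averages the four insert timings from each run and compares the page-fault counts. The class and method names (`BenchSummary`, `mean`, `speedup`, `pageFaultRatio`) are made up for illustration and are not part of the PR; the values are copied straight from the output above.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the PR: summarizes the before/after logs.
public class BenchSummary {
    // Insert times (ms) copied from the four "Inserted ... terms" lines of each run.
    static final double[] BEFORE_MS = {20841.91, 21000.09, 21635.49, 20560.37};
    static final double[] AFTER_MS = {11263.41, 12925.52, 12718.04, 12635.16};

    // page-faults:u counts copied from the perf output of each run.
    static final long PAGE_FAULTS_BEFORE = 554_477L;
    static final long PAGE_FAULTS_AFTER = 41_869L;

    // Arithmetic mean of the given timings.
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElseThrow();
    }

    // How many times faster the "after" runs are, on average.
    static double speedup(double[] before, double[] after) {
        return mean(before) / mean(after);
    }

    // How many times fewer page faults the "after" run took.
    static double pageFaultRatio(long before, long after) {
        return (double) before / after;
    }

    public static void main(String[] args) {
        System.out.printf("mean insert before: %.2f ms%n", mean(BEFORE_MS));
        System.out.printf("mean insert after:  %.2f ms%n", mean(AFTER_MS));
        System.out.printf("insert speedup:     %.2fx%n", speedup(BEFORE_MS, AFTER_MS));
        System.out.printf("page-fault ratio:   %.1fx fewer%n",
            pageFaultRatio(PAGE_FAULTS_BEFORE, PAGE_FAULTS_AFTER));
    }
}
```

On these numbers the mean insert time goes from about 21.0 s to about 12.4 s, roughly a 1.7x speedup, with around 13x fewer page faults. Four runs each is a small sample, so treat the ratio as approximate.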