mikemccand commented on PR #15779:
URL: https://github.com/apache/lucene/pull/15779#issuecomment-3991869954
I tested the PR (pre-computed hashes) on a Raptor Lake i9-13900K with 192 GB RAM,
running Arch Linux.
I don't know what all the perf stats mean, but I see that `CPUs_utilized`
went from 1.4 to 1.7:
Before:
```
38092046 terms loaded
done shuffling
Inserted 38092046 terms in 12691.45 ms, unique term 38092046
Inserted 38092046 terms in 12688.31 ms, unique term 38092046
Inserted 38092046 terms in 12607.45 ms, unique term 38092046
Inserted 38092046 terms in 12537.87 ms, unique term 38092046
Performance counter stats for '/usr/lib/jvm/java-25-openjdk/bin/java -cp .:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT /lucenedata/enwiki/allterms-20110115.txt':
          8560      context-switches                    #  108.1 cs/sec         cs_per_second
           287      cpu-migrations                      #    3.6 migrations/sec migrations_per_second
         33899      page-faults                         #  428.0 faults/sec     page_faults_per_second
      79211.49 msec task-clock                          #    1.4 CPUs           CPUs_utilized
    2273016325      cpu_core/L1-dcache-load-misses/     #    nan %              l1d_miss_rate          (29.09%)
    1541526215      cpu_core/LLC-loads/                 #   73.2 %              llc_miss_rate          (13.84%)
    1405307756      cpu_core/branch-misses/             #    2.8 %              branch_miss_rate       (20.77%)
   49549979427      cpu_core/branches/                  #  625.5 M/sec          branch_frequency       (27.68%)
  441832760975      cpu_core/cpu-cycles/                #    5.6 GHz            cycles_frequency       (34.59%)
  283012369585      cpu_core/instructions/              #    0.6 instructions   insn_per_cycle         (41.48%)
   83593449949      cpu_core/dTLB-loads/                #    0.1 %              dtlb_miss_rate         (48.33%)
      88342760      cpu_atom/L1-icache-load-misses/     #    0.7 %              l1i_miss_rate          (17.36%)
     128880274      cpu_atom/LLC-loads/                 #    0.2 %              llc_miss_rate          (11.44%)
      85602625      cpu_atom/branch-misses/             #    1.0 %              branch_miss_rate       (8.54%)
    5128977191      cpu_atom/branches/                  #   64.8 M/sec          branch_frequency       (13.66%)
   61038440913      cpu_atom/cpu-cycles/                #    0.8 GHz            cycles_frequency       (18.21%)
   34060988052      cpu_atom/instructions/              #    0.6 instructions   insn_per_cycle         (22.70%)
   11407392575      cpu_atom/dTLB-loads/                #    0.0 %              dtlb_miss_rate         (27.22%)
                    TopdownL1 (cpu_core)                #    8.5 %              tma_bad_speculation
                                                        #   12.2 %              tma_frontend_bound     (58.25%)
                                                        #   33.2 %              tma_backend_bound
                                                        #   46.0 %              tma_retiring           (58.25%)
                    TopdownL1 (cpu_atom)                #   81.9 %              tma_backend_bound      (27.06%)
                                                        #    4.2 %              tma_frontend_bound     (19.02%)
                                                        #   -6.3 %              tma_bad_speculation
                                                        #   20.3 %              tma_retiring           (17.47%)
  55.165335221 seconds time elapsed
  76.435898000 seconds user
   2.268276000 seconds sys
```
After:
```
38092046 terms loaded
done shuffling
Inserted 38092046 terms in 7715.29 ms, unique term 38092046
Inserted 38092046 terms in 7696.81 ms, unique term 38092046
Inserted 38092046 terms in 7704.62 ms, unique term 38092046
Inserted 38092046 terms in 7586.43 ms, unique term 38092046
Performance counter stats for '/usr/lib/jvm/java-25-openjdk/bin/java -cp /l/trunk:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT /lucenedata/enwiki/allterms-20110115.txt':
          8710      context-switches                    #  147.3 cs/sec         cs_per_second
           334      cpu-migrations                      #    5.6 migrations/sec migrations_per_second
         34616      page-faults                         #  585.4 faults/sec     page_faults_per_second
      59128.21 msec task-clock                          #    1.7 CPUs           CPUs_utilized
    1561004563      cpu_core/L1-dcache-load-misses/     #    nan %              l1d_miss_rate          (27.12%)
     984009712      cpu_core/LLC-loads/                 #   73.1 %              llc_miss_rate          (14.29%)
    1341577909      cpu_core/branch-misses/             #    2.8 %              branch_miss_rate       (21.76%)
   47205893532      cpu_core/branches/                  #  798.4 M/sec          branch_frequency       (28.98%)
  299702122270      cpu_core/cpu-cycles/                #    5.1 GHz            cycles_frequency       (36.20%)
  274395763972      cpu_core/instructions/              #    0.9 instructions   insn_per_cycle         (43.41%)
   85722776251      cpu_core/dTLB-loads/                #    0.1 %              dtlb_miss_rate         (47.47%)
      61455978      cpu_atom/L1-icache-load-misses/     #    0.4 %              l1i_miss_rate          (11.69%)
     165263411      cpu_atom/LLC-loads/                 #    0.6 %              llc_miss_rate          (8.66%)
     104379304      cpu_atom/branch-misses/             #    0.9 %              branch_miss_rate       (6.72%)
   12291571297      cpu_atom/branches/                  #  207.9 M/sec          branch_frequency       (6.66%)
  123652422399      cpu_atom/cpu-cycles/                #    2.1 GHz            cycles_frequency       (8.85%)
   77079071643      cpu_atom/instructions/              #    0.6 instructions   insn_per_cycle         (11.01%)
   27617125715      cpu_atom/dTLB-loads/                #    0.0 %              dtlb_miss_rate         (11.78%)
                    TopdownL1 (cpu_core)                #    8.5 %              tma_bad_speculation
                                                        #   11.4 %              tma_frontend_bound     (54.24%)
                                                        #   36.5 %              tma_backend_bound
                                                        #   43.6 %              tma_retiring           (54.24%)
                    TopdownL1 (cpu_atom)                #   80.3 %              tma_backend_bound      (11.68%)
                                                        #    2.6 %              tma_frontend_bound     (11.73%)
                                                        #    4.5 %              tma_bad_speculation
                                                        #   12.5 %              tma_retiring           (11.75%)
  35.300626738 seconds time elapsed
  56.434061000 seconds user
   2.299231000 seconds sys
```
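As a rough aid for reading the two runs, the headline ratios can be derived directly from the numbers printed above. The sketch below is not part of the benchmark; the class name `PerfSummary` and its helper methods are made up for illustration. It relies on `perf stat` reporting `CPUs_utilized` as task-clock divided by elapsed wall time, and on IPC being instructions divided by cycles.

```java
import java.util.Locale;

// Hypothetical helper (not part of the PR or the benchmark): derives summary
// ratios from the perf stat numbers quoted in the two runs above.
public class PerfSummary {

  // Arithmetic mean of the per-round insert times (milliseconds).
  public static double mean(double[] xs) {
    double sum = 0;
    for (double x : xs) {
      sum += x;
    }
    return sum / xs.length;
  }

  // Instructions per cycle from the raw counter values.
  public static double ipc(long instructions, long cycles) {
    return (double) instructions / cycles;
  }

  // CPUs utilized = CPU time consumed (task-clock) / wall-clock time elapsed.
  public static double cpusUtilized(double taskClockMs, double elapsedSec) {
    return taskClockMs / (elapsedSec * 1000.0);
  }

  public static void main(String[] args) {
    double[] beforeMs = {12691.45, 12688.31, 12607.45, 12537.87};
    double[] afterMs = {7715.29, 7696.81, 7704.62, 7586.43};

    System.out.printf(Locale.ROOT, "insert speedup: %.2fx%n",
        mean(beforeMs) / mean(afterMs));
    System.out.printf(Locale.ROOT, "cpu_core IPC: %.2f -> %.2f%n",
        ipc(283012369585L, 441832760975L), ipc(274395763972L, 299702122270L));
    System.out.printf(Locale.ROOT, "CPUs utilized: %.2f -> %.2f%n",
        cpusUtilized(79211.49, 55.165335221), cpusUtilized(59128.21, 35.300626738));
    System.out.printf(Locale.ROOT, "cpu_core L1-dcache-load-misses ratio (after/before): %.2f%n",
        1561004563.0 / 2273016325.0);
  }
}
```

Read this way, the insert loop is roughly 1.6x faster and the P-core IPC rises from about 0.6 to about 0.9. `CPUs_utilized` goes up mainly because elapsed time shrank by a larger fraction than total CPU time, so the ratio of the two grows.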
This is on the latest Lucene `main` branch
(commit 182ee9c4cc3bc52ace12e699248b750377a3aa2f) using your benchy (I just added
code to load terms from a file, one per line). I tested on an export of terms
from English Wikipedia (`en`):
```
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.BytesRefHash;

// Compile:
//   /usr/lib/jvm/java-25-openjdk/bin/javac -cp lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT.java
// Run under perf:
//   perf stat -dd /usr/lib/jvm/java-25-openjdk/bin/java -cp .:lucene/core/build/classes/java/main25:lucene/core/build/classes/java/main BHT /lucenedata/enwiki/allterms-20110115.txt
public class BHT {

  public static void main(String[] args) throws IOException {
    BytesRef[] terms = loadTerms(Paths.get(args[0]));
    for (int iter = 0; iter < 1; iter++) {
      insert(terms, 4);
    }
  }

  // Loads one term per line, then shuffles so insertion order is random.
  private static BytesRef[] loadTerms(Path path) throws IOException {
    final List<BytesRef> terms = new ArrayList<>();
    try (Stream<String> lines = Files.lines(path)) {
      // Process each line as it is read
      lines.forEach(line -> terms.add(new BytesRef(line.trim())));
    }
    System.out.println(terms.size() + " terms loaded");
    Collections.shuffle(terms);
    System.out.println("done shuffling");
    return terms.toArray(new BytesRef[0]);
  }

  // Inserts all terms into a fresh BytesRefHash, `round` times, timing each round.
  private static void insert(BytesRef[] testData, int round) {
    for (int r = 0; r < round; r++) {
      BytesRefHash hash = new BytesRefHash();
      int uniqueCount = 0;
      long start = System.nanoTime();
      for (BytesRef ref : testData) {
        // add() returns a non-negative id for a new term, negative if already present
        int pos = hash.add(ref);
        if (pos >= 0) {
          uniqueCount += 1;
        }
      }
      long insertTimeNs = System.nanoTime() - start;
      System.out.printf(
          "Inserted %d terms in %.2f ms, unique term %d\n",
          testData.length, insertTimeNs / 1_000_000.0, uniqueCount);
      /*
      System.out.printf(
          "rehashTimes %d, rehashTimeMs %d, calcHashTimeMs %d\n",
          hash.rehashTimes, hash.rehashTimeMs, hash.calcHashTimeMs);
      */
    }
  }
}
```
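For context on what "pre-computed hashes" could mean here, the toy sketch below shows the general technique of caching each entry's hash code next to its slot, so that growing the table reuses the stored value instead of re-reading the key bytes. This is only an assumption about the idea under test: the class `CachedHashSet`, its fields, and the `hashCalls` counter are invented for illustration, and Lucene's real `BytesRefHash` is structured differently.

```java
import java.util.Arrays;

// Toy open-addressing set of byte[] keys that caches each key's hash code.
// Hypothetical illustration only; this is NOT Lucene's BytesRefHash.
public class CachedHashSet {
  private byte[][] keys = new byte[16][];
  private int[] hashes = new int[16]; // cached hash code for each occupied slot
  private int size = 0;
  public int hashCalls = 0; // instrumentation: how often key bytes were hashed

  private int hash(byte[] key) {
    hashCalls++;
    return Arrays.hashCode(key);
  }

  /** Returns true if the key was newly added, false if it was already present. */
  public boolean add(byte[] key) {
    int h = hash(key);
    int mask = keys.length - 1;
    int slot = h & mask;
    while (keys[slot] != null) {
      // Compare the cached hash first; only touch key bytes on a hash match.
      if (hashes[slot] == h && Arrays.equals(keys[slot], key)) {
        return false;
      }
      slot = (slot + 1) & mask; // linear probing
    }
    keys[slot] = key;
    hashes[slot] = h;
    if (++size > keys.length / 2) {
      grow();
    }
    return true;
  }

  // Doubles the table. Slots are recomputed from the cached hash codes,
  // so no key bytes are read and hash() is never called here.
  private void grow() {
    byte[][] oldKeys = keys;
    int[] oldHashes = hashes;
    keys = new byte[oldKeys.length * 2][];
    hashes = new int[oldHashes.length * 2];
    int mask = keys.length - 1;
    for (int i = 0; i < oldKeys.length; i++) {
      if (oldKeys[i] == null) {
        continue;
      }
      int slot = oldHashes[i] & mask;
      while (keys[slot] != null) {
        slot = (slot + 1) & mask;
      }
      keys[slot] = oldKeys[i];
      hashes[slot] = oldHashes[i];
    }
  }

  public int size() {
    return size;
  }
}
```

In this sketch the trade-off is one extra `int` per slot in exchange for resizes that never touch key bytes, which would be consistent with the lower `L1-dcache-load-misses` and `LLC-loads` counts in the "After" run, though the perf output alone does not prove that this is the cause.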