aruggero commented on code in PR #16034:
URL: https://github.com/apache/lucene/pull/16034#discussion_r3202353225
##########
lucene/core/src/java/org/apache/lucene/util/hnsw/AbstractHnswGraphSearcher.java:
##########
@@ -96,14 +96,89 @@ protected static void scoreEntryPoints(
assert scores != null && scores.length >= eps.length;
scorer.bulkScore(eps, scores, eps.length);
results.incVisitedCount(eps.length);
+ float[] siblingScores = null;
+ int[] siblingsOrd = new int[0];
for (int i = 0; i < eps.length; i++) {
float score = scores[i];
int ep = eps[i];
visited.set(ep);
candidates.add(ep, score);
if (acceptOrds == null || acceptOrds.get(ep)) {
+ // Fetch siblingsOrd BEFORE collect() so the parent is not yet in the
heap
+ int numSiblingsToVisit = 0;
+ // The instanceof check is needed: this method is also called with a
GraphBuilderKnnCollector
+ if (results instanceof OrdinalTranslatedKnnCollector collector) {
+ if (collector.isSiblingExpansionCollector()) {
+ siblingsOrd = collector.getSiblingOrdinals(ep, visited,
siblingsOrd);
+ if (siblingsOrd.length > 0) {
+ // how many siblingsOrd are actually scored to avoid exceeding
the visit budget.
+ // controls the early termination condition. early terminates
the search if we reach
+ // visitLimit nodes
+ // if this visit limit is high we just navigate the graph until
we do not have any node
+ // with a score higher than the ones already collected
+ // Current values it could assume:
+ // - No filter → Integer.MAX_VALUE (no constraint tighter than
the full segment)
+ // - With filter → cardinality (no constraint tighter than the
full accepted set)
+ numSiblingsToVisit =
+ (int) Math.min(siblingsOrd.length, results.visitLimit() -
results.visitedCount());
Review Comment:
Hi @benwtrent ,
In your opinion, does it make sense to you to cut off the number of siblings
to explore, looking at the `results.visitLimit()`?
From what @alessandrobenedetti and I have seen, this limit seems to be set
to:
- No filter → Integer.MAX_VALUE (no constraint tighter than the full
segment)
- With filter → cardinality (no constraint tighter than the full accepted
set)
As described in the comment above...
What does this represent? It doesn't seem to be set as a query parameter.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]