ML-dev-crypto commented on issue #15606: URL: https://github.com/apache/lucene/issues/15606#issuecomment-3849130759
Hi maintainers,

Please assign this issue to me. I’d like to work on expanding bulk scoring usage in the remaining places where we still rely on single-vector scoring.

From reviewing the code and the discussion here, I see several clear candidates where RandomVectorScorer.score() / VectorScorer.score() are still used and could benefit from the bulk scorer API.

Areas I plan to focus on

HNSW graph construction
- HnswGraphBuilder#diversityCheck: this looks like a strong candidate for bulk scoring during diversity checks, and I’m currently reviewing this path in detail.
- NeighborArray#isWorstNonDiverse: could potentially batch vector comparisons instead of scoring one by one.

Higher-level vector scorers
- DiversifyingChildrenVectorScorer#nextParent: investigate bulk-scoring children vectors rather than making individual scoring calls.
- VectorSimilarityScorerSupplier: explore whether it can implement or delegate to BulkScorer when available.

Value sources
- FullPrecisionFloatVectorSimilarityValuesSource
- VectorSimilarityValuesSource
These currently expose DoubleValues, which lacks bulk interfaces, but there may be opportunities to extend or adapt these paths to support bulk scoring where possible.

Direct VectorUtil usage
- KMeans
- BpVectorReorderer
These may be harder to refactor, but I’ll evaluate whether limited bulk scoring can still be applied safely.

Goal

The goal would be to:
- Reduce the remaining single-vector scoring hot paths
- Improve consistency of bulk scoring usage across ANN and higher-level search flows
- Deliver measurable performance improvements without changing scoring semantics

Thanks!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
