RKSPD commented on PR #15472: URL: https://github.com/apache/lucene/pull/15472#issuecomment-3671836533
> The `JVector` specific KNN query seems to have some interesting query-time hyper-parameters:
>
> ```
> private final int overQueryFactor;
> private final float threshold;
> private final float rerankFloor;
> private final boolean usePruning;
> ```
>
> Does Lucene's KNN query have corollaries for these?

In my benchmarking, setting `overQueryFactor`, `threshold`, and `rerankFloor` to 0 kept the performance metrics similar to Lucene HNSW in small-index speed tests (Cohere 768, 200k docs on luceneutil). Using the `knnPerfTest` run parameters, we can use the fanout/overSample levers to test apples-to-apples performance against Lucene.

Also, @abernardi597, for testing multi-threaded performance, maybe check whether running `knnPerfTest` with `numIndexThreads = 1` leads to better benchmarks?
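For readers unfamiliar with the over-query lever, here is a minimal sketch of the general technique: fetch `topK * overQueryFactor` candidates by approximate score, rerank them by exact score, and keep the best `topK`. It assumes that this is roughly what `overQueryFactor` controls and that a factor of 0 or 1 means "no over-querying"; neither assumption is confirmed by the PR. The class `OverQuerySketch`, the `Candidate` type, and the `search` method are hypothetical and are not part of Lucene or JVector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not Lucene or JVector code.
class OverQuerySketch {

    // A candidate with a cheap approximate score and a more expensive exact score.
    static final class Candidate {
        final int docId;
        final float approxScore;
        final float exactScore;

        Candidate(int docId, float approxScore, float exactScore) {
            this.docId = docId;
            this.approxScore = approxScore;
            this.exactScore = exactScore;
        }
    }

    // Returns the docIds of the topK candidates after over-querying and reranking.
    // Assumption: a factor below 1 is treated as 1 (no over-querying).
    static List<Integer> search(List<Candidate> all, int topK, int overQueryFactor) {
        int fetch = Math.min(all.size(), topK * Math.max(1, overQueryFactor));

        // Step 1: take the best `fetch` candidates by approximate score.
        List<Candidate> byApprox = new ArrayList<>(all);
        byApprox.sort(Comparator.comparingDouble((Candidate c) -> c.approxScore).reversed());
        List<Candidate> candidates = new ArrayList<>(byApprox.subList(0, fetch));

        // Step 2: rerank that candidate pool by exact score.
        candidates.sort(Comparator.comparingDouble((Candidate c) -> c.exactScore).reversed());

        // Step 3: keep only the topK docIds.
        List<Integer> result = new ArrayList<>();
        for (Candidate c : candidates.subList(0, Math.min(topK, candidates.size()))) {
            result.add(c.docId);
        }
        return result;
    }
}
```

Under these assumptions, a larger factor widens the candidate pool and can raise recall at the cost of more exact-score computations, which is why matching it against Lucene's fanout/overSample settings matters for a fair comparison.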
