alessandrobenedetti commented on PR #926: URL: https://github.com/apache/lucene/pull/926#issuecomment-1163269198
Hi @msokolov, @mayya-sharipova and @jtibshirani, I have finally finished my performance tests.

Initially the results were worse in this branch. I found that suspicious, because I expected the removal of the BoundChecker and of the reverse mechanism to outweigh the additional division in the distance measure during graph building and searching. After a deep investigation I found the culprit (see the latest commit). After that fix, the results are very encouraging: there are strong speedups for both angular and euclidean distances, in both indexing and searching.

If this is validated, we get a great cleanup of the code and also a nice performance boost. I'll ask my colleague @eliaporciani to repeat the tests on an Apple M1.

The following tests were executed in IntelliJ, running org.apache.lucene.util.hnsw.KnnGraphTester.
2.4 GHz 8-Core Intel Core i9 - 32 GB 2667 MHz DDR4

INDEXING EUCLIDEAN
-beamWidthIndex 100 -maxConn 16 -ndoc 80000 -reindex -docs /Users/sease/JavaProjects/ann-benchmarks/ann_benchmarks/datasets/sift-128-euclidean.hdf5 -metric euclidean

    ORIGINAL
    IW 0 [2022-06-22T14:00:12.647030Z; main]: 64335 msec to write vectors
    IW 0 [2022-06-22T14:01:57.425108Z; main]: 65710 msec to write vectors
    IW 0 [2022-06-22T14:03:18.052900Z; main]: 64817 msec to write vectors

    THIS BRANCH
    IW 0 [2022-06-22T14:04:50.683607Z; main]: 6597 msec to write vectors
    IW 0 [2022-06-22T14:05:34.090801Z; main]: 6687 msec to write vectors
    IW 0 [2022-06-22T14:06:00.268309Z; main]: 6564 msec to write vectors

INDEXING ANGULAR
-beamWidthIndex 100 -maxConn 16 -ndoc 80000 -reindex -docs /Users/sease/JavaProjects/ann-benchmarks/ann_benchmarks/datasets/lastfm-64-dot.hdf5 -metric angular

    ORIGINAL
    IW 0 [2022-06-22T13:55:45.401310Z; main]: 32897 msec to write vectors
    IW 0 [2022-06-22T13:56:39.737642Z; main]: 33255 msec to write vectors
    IW 0 [2022-06-22T13:57:31.172709Z; main]: 32576 msec to write vectors

    THIS BRANCH
    IW 0 [2022-06-22T13:52:06.085790Z; main]: 25261 msec to write vectors
    IW 0 [2022-06-22T13:52:51.022766Z; main]: 25775 msec to write vectors
    IW 0 [2022-06-22T13:53:47.565833Z; main]: 24523 msec to write vectors

SEARCH EUCLIDEAN
-niter 500 -beamWidthIndex 100 -maxConn 16 -ndoc 80000 -reindex -docs /Users/sease/JavaProjects/ann-benchmarks/ann_benchmarks/datasets/sift-128-euclidean.hdf5 -search /Users/sease/JavaProjects/ann-benchmarks/ann_benchmarks/datasets/sift-128-euclidean.hdf5 -metric euclidean

    ORIGINAL
    completed 500 searches in 1026 ms: 487 QPS CPU time=1025ms
    completed 500 searches in 1030 ms: 485 QPS CPU time=1029ms
    completed 500 searches in 1031 ms: 484 QPS CPU time=1030ms

    THIS BRANCH
    completed 500 searches in 46 ms: 10869 QPS CPU time=46ms
    completed 500 searches in 46 ms: 10869 QPS CPU time=46ms
    completed 500 searches in 47 ms: 10638 QPS CPU time=46ms

SEARCH ANGULAR
-niter 500 -beamWidthIndex 100 -maxConn 16 -ndoc 80000 -reindex -docs /Users/sease/JavaProjects/ann-benchmarks/ann_benchmarks/datasets/lastfm-64-dot.hdf5 -search /Users/sease/JavaProjects/ann-benchmarks/ann_benchmarks/datasets/lastfm-64-dot.hdf5 -metric angular

    ORIGINAL
    completed 500 searches in 154 ms: 3246 QPS CPU time=153ms
    completed 500 searches in 162 ms: 3086 QPS CPU time=162ms
    completed 500 searches in 166 ms: 3012 QPS CPU time=166ms

    THIS BRANCH
    completed 500 searches in 62 ms: 8064 QPS CPU time=62ms
    completed 500 searches in 65 ms: 7692 QPS CPU time=65ms
    completed 500 searches in 63 ms: 7936 QPS CPU time=62ms

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
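
To put a number on the speedups reported in the benchmark output above, the sketch below averages the three runs of each configuration and takes the ratio between the original code and the branch. It is not part of KnnGraphTester: the helper names (`write_times_msec`, `time_speedup`, `qps_speedup`) are made up for illustration, and the figures are copied by hand from the log lines in the comment.

```python
import re
from statistics import mean

# Log lines copied from the Euclidean indexing runs reported above.
ORIGINAL_LOG = """
IW 0 [2022-06-22T14:00:12.647030Z; main]: 64335 msec to write vectors
IW 0 [2022-06-22T14:01:57.425108Z; main]: 65710 msec to write vectors
IW 0 [2022-06-22T14:03:18.052900Z; main]: 64817 msec to write vectors
"""
BRANCH_LOG = """
IW 0 [2022-06-22T14:04:50.683607Z; main]: 6597 msec to write vectors
IW 0 [2022-06-22T14:05:34.090801Z; main]: 6687 msec to write vectors
IW 0 [2022-06-22T14:06:00.268309Z; main]: 6564 msec to write vectors
"""

# Remaining figures, transcribed from the same comment.
ANGULAR_INDEX_MSEC = {"original": [32897, 33255, 32576],
                      "branch": [25261, 25775, 24523]}
SEARCH_QPS = {
    "euclidean": {"original": [487, 485, 484],
                  "branch": [10869, 10869, 10638]},
    "angular": {"original": [3246, 3086, 3012],
                "branch": [8064, 7692, 7936]},
}


def write_times_msec(log):
    """Extract the 'N msec to write vectors' figures from IndexWriter log text."""
    return [int(n) for n in re.findall(r"(\d+) msec to write vectors", log)]


def time_speedup(original_msec, branch_msec):
    """Elapsed time: lower is better, so speedup = original / branch."""
    return mean(original_msec) / mean(branch_msec)


def qps_speedup(original_qps, branch_qps):
    """Queries per second: higher is better, so speedup = branch / original."""
    return mean(branch_qps) / mean(original_qps)


if __name__ == "__main__":
    euclid = time_speedup(write_times_msec(ORIGINAL_LOG),
                          write_times_msec(BRANCH_LOG))
    angular = time_speedup(ANGULAR_INDEX_MSEC["original"],
                           ANGULAR_INDEX_MSEC["branch"])
    print(f"indexing euclidean: {euclid:.1f}x")
    print(f"indexing angular:   {angular:.1f}x")
    for metric, runs in SEARCH_QPS.items():
        ratio = qps_speedup(runs["original"], runs["branch"])
        print(f"search {metric}: {ratio:.1f}x")
```

On the figures above this works out to roughly 9.8x (Euclidean) and 1.3x (angular) for indexing, and roughly 22x (Euclidean) and 2.5x (angular) for search. With only three runs per configuration on a single machine, these ratios are indicative rather than conclusive, which is why repeating the tests on other hardware is useful.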