Pulkitg64 commented on PR #15003:
URL: https://github.com/apache/lucene/pull/15003#issuecomment-3499486112
@benwtrent following your suggestion to use a different threshold for
reconnecting nodes, I re-ran the benchmark with 100k docs, maxConn=32, and 10%
random deletions to see the impact on time and recall. Here is the comparison:
* Baseline (current mainline code):
  * Recall: 0.926
  * forceMergeTime: 136.24 sec
* Candidate-0 (this PR; a node counts as disconnected when its number of
  connections is less than 50% of maxConn):
  * No. of disconnected nodes at level 0: 64571 (about 64k out of 90k)
  * Recall: 0.923
  * forceMergeTime: 99.66 sec
* Candidate-1 (a node counts as disconnected when its number of connections
  is less than 50% of its previous number of connections):
  * No. of disconnected nodes at level 0: 17
  * Recall: 0.876
  * forceMergeTime: 1.24 sec
* Candidate-2 (a node counts as disconnected when its number of connections
  is less than 75% of its previous number of connections):
  * No. of disconnected nodes at level 0: 855
  * Recall: 0.880
  * forceMergeTime: 2.48 sec
* Candidate-3 (a node counts as disconnected when its number of connections
  is less than 25% of maxConn):
  * No. of disconnected nodes at level 0: 35495
  * Recall: 0.907
  * forceMergeTime: 50.95 sec
* Without any reconnection (just dropping nodes; for comparison only):
  * Recall: 0.877
  * forceMergeTime: 1.12 sec
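
To make the two kinds of threshold above concrete, here is a minimal Python sketch of the "is this node disconnected?" check. It is not the code in this PR: the names `Rule` and `is_disconnected` and the toy node are hypothetical, and it assumes the check is a plain strict comparison of the surviving neighbor count against a fraction of either maxConn or the node's previous neighbor count.

```python
from enum import Enum


class Rule(Enum):
    """Hypothetical labels for the two threshold families compared above."""
    FRACTION_OF_MAX_CONN = "fraction_of_max_conn"   # Candidate-0, Candidate-3
    FRACTION_OF_PREVIOUS = "fraction_of_previous"   # Candidate-1, Candidate-2


def is_disconnected(remaining, previous, max_conn, rule, fraction):
    """Return True if a node should be treated as disconnected.

    remaining: neighbors left after links to deleted docs are dropped
    previous:  neighbors the node had before the merge
    max_conn:  the graph's maxConn setting
    fraction:  threshold fraction, e.g. 0.5 for "less than 50%"
    """
    if rule is Rule.FRACTION_OF_MAX_CONN:
        return remaining < fraction * max_conn
    return remaining < fraction * previous


# Toy node (made-up numbers): 20 neighbors before the merge, 14 survive, maxConn=32.
node = dict(remaining=14, previous=20, max_conn=32)
print(is_disconnected(**node, rule=Rule.FRACTION_OF_MAX_CONN, fraction=0.50))  # 14 < 16
print(is_disconnected(**node, rule=Rule.FRACTION_OF_PREVIOUS, fraction=0.50))  # 14 < 10
print(is_disconnected(**node, rule=Rule.FRACTION_OF_PREVIOUS, fraction=0.75))  # 14 < 15
print(is_disconnected(**node, rule=Rule.FRACTION_OF_MAX_CONN, fraction=0.25))  # 14 < 8
```

In this toy case the same node is flagged under the 50%-of-maxConn and 75%-of-previous rules but not under the other two, which shows how the choice of reference point and fraction changes which nodes are selected for reconnection.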
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]