Hi everyone,

I was looking into a scenario where PeerSync failed even though we had high values for maxNumLogsToKeep (200) and numRecordsToKeep (200000).

The log excerpt is at https://gist.github.com/vthacker/fb536c6f1146dd0d7513afb9960a10e3 and I am still trying to pinpoint the actual cause. It looks to me like the replica has more documents up to that version (numVersions) than the leader, and I can't tell why. Does this look like a bug?

While trying to reproduce it locally, here is one scenario that I ran into:

1. I set a very low numRecordsToKeep (5), indexed 3 or 4 docs while the replica was down, and then started it up. PeerSync failed because of https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/PeerSync.java#L655 .

Do we need to do a threshold check when we are verifying via fingerprinting whether the indexes are the same? From my understanding, we can avoid this check when fingerprinting is enabled, but I wanted to check before filing a Jira.
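
To make the low-numRecordsToKeep scenario concrete, here is a minimal sketch of the kind of percentile-based overlap check I believe the linked line performs. It is not the actual PeerSync code: the class name, the method names, the 0.2/0.8 percentiles, and the version numbers are all assumptions chosen for illustration.

```java
import java.util.List;

public class OverlapCheckSketch {

    // Hypothetical helper: pick the value at the given fraction of a list
    // of versions that is sorted newest-first.
    static long percentile(List<Long> sortedNewestFirst, float frac) {
        int idx = (int) (sortedNewestFirst.size() * frac);
        return sortedNewestFirst.get(Math.min(idx, sortedNewestFirst.size() - 1));
    }

    // Assumed rule: the sync is attempted only if our "high" threshold is
    // not older than the other side's "low" threshold, i.e. the two
    // recent-version windows overlap by more than a sliver.
    static boolean windowsOverlapEnough(List<Long> ours, List<Long> theirs) {
        long ourHigh = percentile(ours, 0.2f);
        long otherLow = percentile(theirs, 0.8f);
        return ourHigh >= otherLow;
    }

    public static void main(String[] args) {
        // Made-up versions; each side keeps only 5 records (numRecordsToKeep = 5).
        List<Long> replica = List.of(5L, 4L, 3L, 2L, 1L);
        List<Long> leaderAfter3Docs = List.of(8L, 7L, 6L, 5L, 4L);
        List<Long> leaderAfter4Docs = List.of(9L, 8L, 7L, 6L, 5L);

        System.out.println("3 new docs: " + windowsOverlapEnough(replica, leaderAfter3Docs));
        System.out.println("4 new docs: " + windowsOverlapEnough(replica, leaderAfter4Docs));
    }
}
```

Under these assumptions, a window of 5 records tolerates 3 missed docs but not 4, even though the replica is missing only a handful of updates that a fingerprint comparison could still verify after they are applied.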
