Erick, I was referring to the Achieved Replication Factor section of the Solr reference guide <https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance>. Maybe I'm misreading it. As I understand it, if an update succeeds on the leader but fails on the replica, the client still sees a success, and the replica should get updated when it's back online later. For whatever reason, one or more updates never got synced; certainly the root cause was the server being bounced.
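To make my reading of that section concrete, here is a minimal Python sketch of the client-side check I think it describes: the update is sent with min_rf, and the client compares the achieved replication factor in the response against it. It assumes the parsed response carries `rf` in `responseHeader`, which is how I read the guide; the helper name `is_under_replicated` and the sample responses are made up for illustration.

```python
def is_under_replicated(response, min_rf):
    """Return True when the achieved replication factor is below min_rf.

    Assumes `response` is the parsed JSON body of an update request that
    was sent with min_rf, and that responseHeader contains an "rf" entry.
    A missing "rf" is treated as unknown, and therefore under-replicated.
    """
    header = response.get("responseHeader", {})
    achieved = header.get("rf")
    if achieved is None:
        return True
    return int(achieved) < min_rf


# Hypothetical responses: leader plus replica acknowledged, then leader only.
healthy = {"responseHeader": {"status": 0, "rf": 2, "min_rf": 2}}
degraded = {"responseHeader": {"status": 0, "rf": 1, "min_rf": 2}}

print(is_under_replicated(healthy, 2))
print(is_under_replicated(degraded, 2))
```

Note that both sample responses have status 0: the update itself is still reported as a success, so it would be up to the client to notice the shortfall and record it.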
I don't retry failed attempts because the app is supposed to record a failure so it can be rerun later, though I may need to revisit my error handling. In my case, updates are made in batches of up to 5000 docs. An explicit commit is issued at logical completion points between batches, which may come after just one doc or after 20000. Auto commit is set to 10 seconds and auto soft commit to 1 second. But to fix the current situation, is deleting a replica the way to go? It happens to be very small - lucky me.
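In case it helps to see the update flow, this is roughly its shape as a Python sketch, not the actual app code. `send_batch` and `commit` are hypothetical callables standing in for the real client calls, and committing even after a failed batch is an assumption made for the sketch.

```python
from itertools import islice

BATCH_SIZE = 5000  # upper bound on docs per update request


def batches(docs, size=BATCH_SIZE):
    """Yield successive lists of at most `size` docs."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def index_unit(docs, send_batch, commit, failed):
    """Index one logical unit of work, then issue one explicit commit.

    send_batch and commit are hypothetical stand-ins for the real client.
    A batch that raises is appended to `failed` for a later rerun instead
    of being retried here. Returns True only if every batch went through.
    """
    ok = True
    for chunk in batches(docs):
        try:
            send_batch(chunk)
        except Exception as exc:
            failed.append((chunk, str(exc)))
            ok = False
    commit()  # assumption: commit at the logical boundary regardless
    return ok
```

A unit of 20000 docs would go out as four requests followed by one commit, and a unit of one doc as a single request followed by one commit.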