Hi everyone, there seems to be an intermittent issue in my SolrCloud deployment. I have a TLOG + PULL replica setup.
I noticed that a new pull replica got stuck in recovery, trying to replicate from the leader again and again, and we eventually drained it (it drains automatically after 10 minutes if it fails to become "up"). Here are the logs (newest entry first):

Dec 17 2024, 20:38:44 Recovery failed - trying again... (1)
Dec 17 2024, 20:38:44 Error while trying to recover
Dec 17 2024, 20:38:44 No files to download for index generation: 9122
Dec 17 2024, 20:38:44 Wait [8] seconds before trying to recover again (attempt=2)
Dec 17 2024, 20:38:44 Follower's version: 1734429265730
Dec 17 2024, 20:38:44 Follower's generation: 9120
Dec 17 2024, 20:38:44 Leader's version: 1734431017515
Dec 17 2024, 20:38:44 Leader's generation: 9122
Dec 17 2024, 20:38:44 Last replication failed, so I'll force replication
Dec 17 2024, 20:38:44 Attempting to replicate from core [collection_o2_shard1_replica_t15913] on node [http://10.7.172.240:8080/solr].
Dec 17 2024, 20:38:44 collection_o2_shard1_replica_p15981 stopping background replication from leader
Dec 17 2024, 20:38:44 Stopping background replicate from leader process
Dec 17 2024, 20:38:44 Starting Replication Recovery.

The weird thing is that indexing frequency on this cluster is decent, so there should have been files to download between generations 9120 and 9122. The pull replica's index is not very old either. Moreover, even if no files were found, why would replication fail?

Note: the replica was at generation 9120 because, before adding a replica to the cluster, we download a backup so that it does not need a full sync from the leader.

Help would be appreciated. Thanks in advance.
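P.S. In case it helps anyone look into this, below is a minimal sketch of how the leader's file list for generation 9122 could be inspected through the replication handler's `filelist` command. The helper names (`replication_url`, `fetch_json`, `leader_files_for_generation`) are hypothetical, and the JSON response shape (a `filelist` array of objects with a `name` key) is an assumption that may differ between Solr versions.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(base_url, core, command, **params):
    """Build a replication-handler URL for a core (hypothetical helper)."""
    query = urlencode({"command": command, "wt": "json", **params})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def leader_files_for_generation(base_url, core, generation):
    """Return the file names the leader reports for one index generation.

    Assumes the response holds a 'filelist' array of objects with a 'name'
    key; an empty list would match the "No files to download" log line.
    """
    url = replication_url(base_url, core, "filelist", generation=generation)
    body = fetch_json(url)
    return [entry.get("name") for entry in body.get("filelist", [])]
```

For example, `leader_files_for_generation("http://10.7.172.240:8080/solr", "collection_o2_shard1_replica_t15913", 9122)` would query the leader core named in the logs. Swapping the command for `indexversion` in `replication_url` gives the URL for checking the leader's current version and generation.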