Hi everyone,

There seems to be an intermittent issue in my SolrCloud deployment. I have
a TLOG + PULL replica setup.

I noticed that a new PULL replica got stuck in recovery, repeatedly trying
to replicate from the leader, and we eventually drained it (a replica is
drained automatically if it fails to become "up" within 10 minutes).

Here are the logs:

Dec 17 2024, 20:38:44 Recovery failed - trying again... (1)
Dec 17 2024, 20:38:44 Error while trying to recover
Dec 17 2024, 20:38:44 No files to download for index generation: 9122
Dec 17 2024, 20:38:44 Wait [8] seconds before trying to recover again (attempt=2)
Dec 17 2024, 20:38:44 Follower's version: 1734429265730
Dec 17 2024, 20:38:44 Follower's generation: 9120
Dec 17 2024, 20:38:44 Leader's version: 1734431017515
Dec 17 2024, 20:38:44 Leader's generation: 9122
Dec 17 2024, 20:38:44 Last replication failed, so I'll force replication
Dec 17 2024, 20:38:44 Attempting to replicate from core [collection_o2_shard1_replica_t15913] on node [http://10.7.172.240:8080/solr].
Dec 17 2024, 20:38:44 collection_o2_shard1_replica_p15981 stopping background replication from leader
Dec 17 2024, 20:38:44 Stopping background replicate from leader process
Dec 17 2024, 20:38:44 Starting Replication Recovery.

The weird thing is that indexing frequency on this cluster is decent, so
there should have been files to download between generations 9120 and 9122.
The PULL replica's index is not very old either. Moreover, even if no
files were found, why would replication fail?
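
In case it helps anyone check the same thing, the leader's view of a
generation can be queried through the replication handler. This is only a
minimal sketch, assuming the standard /replication handler and its
"indexversion" and "filelist" commands are reachable on the leader. The
helper name replication_url is hypothetical; the host and core name are the
ones from the logs above.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command, **params):
    """Build a URL for a core's /replication handler (hypothetical helper)."""
    # wt=json asks Solr for a JSON response; extra params follow in call order.
    query = urlencode({"command": command, "wt": "json", **params})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


# Values taken from the log lines above.
LEADER = "http://10.7.172.240:8080/solr"
LEADER_CORE = "collection_o2_shard1_replica_t15913"

# Index version and generation the leader currently advertises.
print(replication_url(LEADER, LEADER_CORE, "indexversion"))

# Files the leader reports for the generation the follower tried to fetch.
print(replication_url(LEADER, LEADER_CORE, "filelist", generation=9122))
```

Opening the second URL (or fetching it with curl) would show whether the
leader itself returns an empty file list for generation 9122.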

Note: the replica was at generation 9120 because, before adding a replica
to the cluster, we download a backup so that it does not need a full sync
from the leader.

Help would be appreciated.

Thanks in advance
