[ https://issues.apache.org/jira/browse/SOLR-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Noble Paul resolved SOLR-9446.
------------------------------
       Resolution: Fixed
    Fix Version/s: master (7.0)
                   6.3

> Leader failure after creating a freshly replicated index can send nodes into recovery even if index was not changed
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9446
>                 URL: https://issues.apache.org/jira/browse/SOLR-9446
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: replication (java)
>            Reporter: Pushkar Raste
>            Assignee: Noble Paul
>            Priority: Minor
>             Fix For: 6.3, master (7.0)
>
>         Attachments: SOLR-9446.patch
>
>
> We noticed this issue while migrating a Solr index from machines {{A1, A2 and A3}} to {{B1, B2, B3}}. We followed these steps (there were no updates during the migration process):
> * The index had replicas on machines {{A1, A2, A3}}. Let's say {{A1}} was the leader at the time.
> * We added 3 more replicas, {{B1, B2 and B3}}. These nodes synced with the leader by replication. These fresh nodes do not have tlogs.
> * We shut down one of the old nodes ({{A3}}).
> * We then shut down the leader ({{A1}}).
> * A new leader was elected (let's say {{A2}}).
> * The leader asked all the replicas to sync with it.
> * The fresh nodes (the ones without tlogs) first tried PeerSync, but since there was no frame of reference, PeerSync failed and the fresh nodes fell back to replication.
> Although replication would not copy all the segments again, it seems like we can short-circuit the sync to put nodes back in the active state as soon as possible.
> If a freshly replicated index becomes the leader for some reason, it can still send nodes (both other freshly replicated indexes and old replicas) into recovery. Here is the scenario:
> * A freshly replicated index becomes the leader.
> * The new leader, however, asks all the replicas to sync with it.
> * The replicas (including the old ones) ask the leader for versions, but the leader has no update logs, so the replicas cannot compute the missing versions and fall back to replication.
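
The short circuit suggested in the description can be sketched roughly as follows. This is only an illustration of the idea, not the attached patch and not Solr's PeerSync code: the class, the enum, and the notion of comparing a single `indexFingerprint` value are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these types are Solr classes.
class SyncShortCircuitSketch {

    enum SyncOutcome { ALREADY_IN_SYNC, PEER_SYNC, FULL_REPLICATION }

    /** Minimal stand-in for what a node knows about its own index. */
    static final class CoreState {
        final List<Long> recentVersions; // update versions from the tlog; empty on a fresh node
        final long indexFingerprint;     // assumed checksum over the index contents

        CoreState(List<Long> recentVersions, long indexFingerprint) {
            this.recentVersions = recentVersions;
            this.indexFingerprint = indexFingerprint;
        }
    }

    /** Decide how a replica should sync with the leader. */
    static SyncOutcome decide(CoreState leader, CoreState replica) {
        // Short circuit: identical index contents need no recovery at all,
        // even when one side has no tlog to compare versions against.
        if (leader.indexFingerprint == replica.indexFingerprint) {
            return SyncOutcome.ALREADY_IN_SYNC;
        }
        // A version-based sync needs a frame of reference on both sides.
        if (leader.recentVersions.isEmpty() || replica.recentVersions.isEmpty()) {
            return SyncOutcome.FULL_REPLICATION;
        }
        return SyncOutcome.PEER_SYNC;
    }

    public static void main(String[] args) {
        // Scenario from the description: old leader with a tlog, fresh replica
        // without one, and no updates since the replica was created.
        CoreState oldLeader = new CoreState(Arrays.asList(101L, 102L, 103L), 42L);
        CoreState freshReplica = new CoreState(Collections.<Long>emptyList(), 42L);
        System.out.println(decide(oldLeader, freshReplica));
    }
}
```

The same check covers the second scenario, where the freshly replicated node is the leader: the comparison is symmetric, so an empty version list on the leader side does not force replication as long as the index contents match.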