[
https://issues.apache.org/jira/browse/SOLR-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14995439#comment-14995439
]
Sandeep J commented on SOLR-8227:
---------------------------------
Very valid concerns/comments.
By design Solr is a CP system, so theoretically we should be able to recover
from any active replica. But making that happen without leaving the system in
an inconsistent state (i.e. without compromising 'C') is, in my opinion, an
implementation-level discussion.
The main problem we have seen in our production environment is that during peak
traffic, if a few nodes go into recovery and SnapPuller kicks in, we get an
'Oh Snap' moment :) What I am trying to say is that peer syncs are lightweight
compared to full replication.
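Just to make that contrast concrete, here is a rough sketch of how I picture
the current fallback. This is illustrative Java, not actual Solr code;
PeerSyncClient and ReplicationClient are made-up names standing in for the
real PeerSync / SnapPuller machinery.
{code:java}
// Rough sketch of the current recovery decision as I understand it.
// PeerSyncClient and ReplicationClient are hypothetical names, not Solr APIs.
public class RecoverySketch {

    interface PeerSyncClient {
        // returns true if the recovering node could catch up from the
        // peer's recent update-log entries alone
        boolean sync(String recoveringCoreUrl, String peerUrl);
    }

    interface ReplicationClient {
        // full index copy (SnapPuller-style): expensive for the source node
        void fetchIndex(String recoveringCoreUrl, String sourceUrl);
    }

    void recover(String coreUrl, String leaderUrl,
                 PeerSyncClient peerSync, ReplicationClient replication) {
        // cheap path first: exchange recent versions and replay the delta
        if (peerSync.sync(coreUrl, leaderUrl)) {
            return; // caught up without copying index files
        }
        // expensive path: copy hard-committed segment files from the leader.
        // This is the 'Oh Snap' moment when many replicas do it at once.
        replication.fetchIndex(coreUrl, leaderUrl);
    }
}
{code}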
I understand the coordination concept Yonik mentioned that goes on between the
leader and the recovering node, but that seems to apply to live updates,
doesn't it? Or am I missing something? In full replication, I believe index
files (i.e. hard-committed data) are copied from source to destination, so if
this heavy-duty operation can somehow be offloaded from the leader, it will
help.
Also, full replication can take minutes to complete depending on the size of
the index, traffic and network. In an environment where we have one leader and
20 replicas, all the recovering nodes go to the leader, which is also busy with
reads/writes. During this recovery window the leader can also change or go into
recovery itself, as mentioned by Tim. So even today, after full replication
from the leader, the recovered node should perform some kind of sanity check
with the latest leader, just to make sure that it has not missed any updates.
Piggybacking on Ishan's proposal:
If we add this sanity check or peer sync from the latest leader after the
replication, then even if we replicate from an active replica I think we should
be good. So here is the refined proposal (a rough sketch in code follows the
list):
1. The recovering node picks an active replica (it could be the leader).
2. After the peer sync, replication is started from the active replica found in
#1.
3. Once replication is complete, the recovering node gets the leader node from
the cluster state.
4. The recovering node performs a check with the leader.
5. If the number of missed updates is small enough that there is no need for
full recovery, the recovering node takes the updates from the leader.
6. If the number of missed updates is still large and it needs full recovery,
go back to step #1. The likelihood of this scenario is low, and it can be
reduced further by having a larger 'UpdateLog'. Ref:
https://issues.apache.org/jira/browse/SOLR-6359 .
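Here is the rough sketch of the flow above that I promised. All helper names
(ClusterStateView, RecoveryOps, countMissedUpdates, the PEER_SYNC_LIMIT
threshold, etc.) are hypothetical and only meant to show the control flow, not
real Solr APIs.
{code:java}
// Sketch of the proposed recovery flow (steps 1-6 above). Hypothetical names only.
public class ProposedRecoveryFlow {

    interface ClusterStateView {
        String findActiveReplica();  // step 1: any active replica (may be the leader)
        String currentLeader();      // step 3: leader as known right now
    }

    interface RecoveryOps {
        void peerSync(String fromUrl);               // lightweight catch-up
        void replicateIndex(String fromUrl);         // step 2: full index copy
        long countMissedUpdates(String leaderUrl);   // step 4: compare with the leader
        void fetchUpdates(String leaderUrl, long n); // step 5: replay a small delta
    }

    // assumed threshold; in practice it would depend on the updateLog size
    static final long PEER_SYNC_LIMIT = 100;

    void recover(ClusterStateView cluster, RecoveryOps ops) {
        while (true) {
            String source = cluster.findActiveReplica();   // step 1
            ops.peerSync(source);                          // lightweight catch-up first
            ops.replicateIndex(source);                    // step 2: copy from that replica
            String leader = cluster.currentLeader();       // step 3: leader may have changed
            long missed = ops.countMissedUpdates(leader);  // step 4: sanity check with leader
            if (missed <= PEER_SYNC_LIMIT) {
                if (missed > 0) {
                    ops.fetchUpdates(leader, missed);      // step 5: pull the small delta
                }
                return;                                    // caught up with the latest leader
            }
            // step 6: still too far behind -- retry from step 1 with another active replica
        }
    }
}
{code}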
Varun mentioned earlier that even during the recovery phase the node gets
updates from the leader, so the likelihood of running into #5 or #6 seems slim,
but it could happen with network partitions or long recovery times.
This proposal is based on my limited knowledge of how a Solr node identifies
that it has missed some updates; if someone can shed some light on that, it
would really help me understand how step #4 takes place.
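My guess is that it involves comparing recent update-log versions with the
leader, something roughly like the following. This is purely illustrative and
not the actual PeerSync logic; the method and parameter names are made up.
{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A guess at how step #4 might work: compare the versions of the leader's
// most recent updates against what the recovering node already has.
public class MissedUpdateCheck {

    static long countMissed(List<Long> leaderRecentVersions, List<Long> myVersions) {
        Set<Long> mine = new HashSet<>(myVersions);
        long missed = 0;
        for (long v : leaderRecentVersions) {
            if (!mine.contains(v)) {
                missed++; // an update applied on the leader that we never saw
            }
        }
        return missed;
    }
}
{code}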
> Recovering replicas should be able to recover from any active replica
> ---------------------------------------------------------------------
>
> Key: SOLR-8227
> URL: https://issues.apache.org/jira/browse/SOLR-8227
> Project: Solr
> Issue Type: Improvement
> Reporter: Varun Thacker
>
> Currently when a replica goes into recovery it uses the leader to recover. It
> first tries to do a PeerSync. If that's not successful it does a
> replication. Most of the time it ends up doing a full replication because
> segment merging and autoCommits cause segments to be formed differently on the
> replicas (we should explore improving that in another issue).
> But when many replicas are recovering and hitting the leader, the leader can
> become a bottleneck. Since Solr is a CP system, we should be able to recover
> from any of the 'active' replicas instead of just the leader.