[
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983016#comment-13983016
]
Timothy Potter commented on SOLR-5468:
--------------------------------------
Starting to work on this ...
First, I think "majority quorum" is too strong for what we really need at the
moment; for now it seems sufficient to let users decide how many replicas a
write must succeed on to be considered successful. In other words, we can
introduce a new, optional integer property when creating a new collection -
minActiveReplicas (needs a better name), which defaults to 1 (current behavior).
If >1, then an update won't succeed unless it is ack'd by at least that many
replicas. Activating this feature doesn't make much sense unless a collection
has RF > 2.
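As a rough sketch of that threshold check (all names here, including minActiveReplicas itself, are hypothetical working names from this comment, not existing Solr APIs):

```java
// Hypothetical sketch of the proposed threshold check; minActiveReplicas is
// just the working name used in this comment, not an existing Solr property.
public class QuorumCheck {

    /**
     * Returns true if an update acknowledged by ackCount replicas (counting
     * the leader) meets the configured threshold. A threshold of 1 is the
     * current behavior: success on the leader alone is enough.
     */
    public static boolean updateSucceeded(int ackCount, int minActiveReplicas) {
        int required = Math.max(1, minActiveReplicas); // default 1 = today's behavior
        return ackCount >= required;
    }

    public static void main(String[] args) {
        // RF=3 collection with minActiveReplicas=2: the leader alone is not enough.
        System.out.println(updateSucceeded(1, 2)); // false
        System.out.println(updateSucceeded(2, 2)); // true
    }
}
```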
The biggest hurdle to adding this behavior is the asynchronous, streaming-based
approach leaders use to forward updates on to replicas. The current
implementation uses a callback error handler to deal with failed update
requests (from leader to replica) and simply considers an update successful if
it works on the leader. Part of the complexity is that the leader processes the
update before even attempting to forward on to the replica so there would need
to be some "backing out" work to remove an update that succeeded on the leader
but failed on the replicas. This is starting to get messy ;-)
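To make the "backing out" problem concrete, here is a toy in-memory model (purely illustrative; none of these classes mirror actual Solr code, which uses async streaming with callback error handlers): the leader applies the update locally first, then counts replica acks, and has to remove its own copy if too few arrive.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the leader's forward-then-maybe-back-out flow. Illustrative
// only: real Solr forwards asynchronously and handles failures in callbacks.
public class LeaderForwarding {

    private final Map<String, String> leaderIndex = new HashMap<>();

    /**
     * replicaAcks stands in for the async callbacks: true = replica acked,
     * false = the forwarded update failed on that replica.
     */
    public boolean applyUpdate(String docId, String doc,
                               List<Boolean> replicaAcks, int minAcks) {
        leaderIndex.put(docId, doc); // leader processes the update first
        long acks = 1; // the leader's own success counts as one ack
        for (boolean acked : replicaAcks) {
            if (acked) acks++;
        }
        if (acks < minAcks) {
            leaderIndex.remove(docId); // "back out" the local update
            return false;
        }
        return true;
    }

    public boolean contains(String docId) {
        return leaderIndex.containsKey(docId);
    }
}
```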
Another key point here is this feature simply moves the problem from the Solr
server to the client application, i.e. it's a fail-faster approach where a
client indexing app gets notified that writes are not succeeding on enough
replicas to meet the desired threshold. The client application still has to
decide what to do when writes fail.
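From the client side, the fail-faster contract might be used like this (a sketch under the assumption that each attempt reports back how many replicas acked; the names are made up):

```java
import java.util.function.IntSupplier;

// Sketch of a client indexing app reacting to the fail-faster signal.
// attemptFn is a stand-in for "send the update, get back the ack count".
public class ClientRetry {

    /**
     * Retries an update until minReplicas replicas ack it or attempts run
     * out. A real client would back off between attempts and alert operators.
     */
    public static boolean indexWithRetry(IntSupplier attemptFn,
                                         int minReplicas, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (attemptFn.getAsInt() >= minReplicas) {
                return true; // enough replicas acknowledged the write
            }
        }
        return false; // still the application's call what to do now
    }
}
```

Even with retries, the application ultimately owns the failure policy: buffer, alert, or drop.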
Lastly, batches! What happens if half of a batch (sent by a client) succeeds
and the other half fails (due to losing a replica in the middle of processing
the batch)? Another idea I had: maybe this isn't a collection-level property;
maybe it is set on a per-request basis?
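One way to keep the batch case tractable would be per-document bookkeeping, so the client can resend only the documents that fell below the threshold; a toy sketch (hypothetical, not how Solr reports batch results):

```java
import java.util.ArrayList;
import java.util.List;

// Toy bookkeeping for the batch problem: if a replica dies mid-batch, early
// documents may meet the threshold while later ones do not. Hypothetical.
public class BatchResult {

    /** perDocAcks[i] = number of replicas (incl. leader) that acked document i. */
    public static List<Integer> failedDocs(int[] perDocAcks, int minAcks) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < perDocAcks.length; i++) {
            if (perDocAcks[i] < minAcks) {
                failed.add(i); // client should resend this document
            }
        }
        return failed;
    }
}
```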
> Option to enforce a majority quorum approach to accepting updates in SolrCloud
> ------------------------------------------------------------------------------
>
> Key: SOLR-5468
> URL: https://issues.apache.org/jira/browse/SOLR-5468
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Affects Versions: 4.5
> Environment: All
> Reporter: Timothy Potter
> Priority: Minor
>
> I've been thinking about how SolrCloud deals with write-availability using
> in-sync replica sets, in which writes will continue to be accepted so long as
> there is at least one healthy node per shard.
> For a little background (and to verify my understanding of the process is
> correct), SolrCloud only considers active/healthy replicas when acknowledging
> a write. Specifically, when a shard leader accepts an update request, it
> forwards the request to all active/healthy replicas and only considers the
> write successful if all active/healthy replicas ack the write. Any down /
> gone replicas are not considered and will sync up with the leader when they
> come back online using peer sync or snapshot replication. For instance, if a
> shard has 3 nodes, A, B, C with A being the current leader, then writes to
> the shard will continue to succeed even if B & C are down.
> The issue is that if a shard leader continues to accept updates even if it
> loses all of its replicas, then we have acknowledged updates on only 1 node.
> If that node, call it A, then fails and one of the previous replicas, call it
> B, comes back online before A does, then any writes that A accepted while the
> other replicas were offline are at risk of being lost.
> SolrCloud does provide a safeguard mechanism for this problem with the
> leaderVoteWait setting, which puts any replicas that come back online before
> node A into a temporary wait state. If A comes back online within the wait
> period, then all is well as it will become the leader again and no writes
> will be lost. As a side note, sys admins definitely need to be made more
> aware of this situation: when I first encountered it in my cluster, I had
> no idea what it meant.
> My question is whether we want to consider an approach where SolrCloud will
> not accept writes unless there is a majority of replicas available to accept
> the write? For my example, under this approach, we wouldn't accept writes if
> both B&C failed, but would if only C did, leaving A & B online. Admittedly,
> this lowers the write-availability of the system, so may be something that
> should be tunable?
> From Mark M: Yeah, this is kind of like one of many little features that we
> have just not gotten to yet. I’ve always planned for a param that lets you
> say how many replicas an update must be verified on before responding
> success. Seems to make sense to fail that type of request early if you notice
> there are not enough replicas up to satisfy the param to begin with.
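The fail-early idea in Mark's note reduces to a cheap pre-check before any work is done; a minimal sketch (illustrative only, not Solr code):

```java
// Minimal sketch of failing a request early: if the shard does not even have
// enough active replicas to satisfy the requested count, reject immediately
// instead of processing the update and then backing it out. Names are made up.
public class FailEarly {

    public static boolean acceptRequest(int activeReplicas, int requiredReplicas) {
        // No point applying the update if the threshold is already unreachable.
        return activeReplicas >= requiredReplicas;
    }
}
```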
--
This message was sent by Atlassian JIRA
(v6.2#6252)