[
https://issues.apache.org/jira/browse/SOLR-8619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122781#comment-15122781
]
Jason Gerlowski edited comment on SOLR-8619 at 1/29/16 2:22 AM:
----------------------------------------------------------------
Throwing in my 2 cents. New to SolrCloud, so feel free to ignore...
+1 for having a check to ensure that a replica isn't marked as a leader unless
it's had a chance to sync with a leader.
+1 for having ADDREPLICA calls fail if there are no active replicas. I'd be
fine with allowing API users to create not-ready-for-leadership replicas if
there was a great way of conveying that caveat to them. But short of adding a
replica-state option to CLUSTERSTATUS to convey this caveat, I can't think of a
good way to do this. IMO, it seems cleaner conceptually to prevent users up
front from getting into this state. Bit hand-wavy though, so take this
rationale with a grain of salt.
was (Author: gerlowskija):
Throwing in my 2 cents. New to SolrCloud, so feel free to ignore...
+1 for having a check to ensure that a replica isn't marked as a leader unless
it's had a chance to sync with a leader.
+1 for having ADDREPLICA calls fail if there are no active replicas. I'd be
fine with allowing API users to create not-ready-for-leadership replicas if
there was a great way of conveying that caveat to them. But short of adding a
replica-state option to CLUSTERSTATUS, I can't think of a good way to do this.
IMO, it seems cleaner conceptually to prevent users up front from getting into
this state. Bit hand-wavy though, so take this rationale with a grain of salt.
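For concreteness, here is a minimal sketch of the ADDREPLICA guard discussed in the comment above. It is not Solr's implementation. The dict layout mirrors the `replicas`/`state`/`node_name` fields and the `live_nodes` list that CLUSTERSTATUS reports; the function names and the error message are hypothetical. Because a replica on a dead node can still be listed as "active", the sketch also checks the node against the live nodes.

```python
from typing import Any, Dict, Set


def has_live_active_replica(shard: Dict[str, Any], live_nodes: Set[str]) -> bool:
    """Return True if at least one replica is 'active' and hosted on a live node."""
    for replica in shard.get("replicas", {}).values():
        if replica.get("state") == "active" and replica.get("node_name") in live_nodes:
            return True
    return False


def check_add_replica(shard: Dict[str, Any], live_nodes: Set[str]) -> None:
    """Hypothetical guard: reject ADDREPLICA when the new replica has nobody to sync from."""
    if not shard.get("replicas"):
        # No replicas recorded at all: nothing can be lost, so allow the first one.
        return
    if not has_live_active_replica(shard, live_nodes):
        raise RuntimeError("ADDREPLICA rejected: no active replica to sync from")


# Example: the only replica of the shard lives on host1.
shard = {"replicas": {"core_node1": {"state": "active", "node_name": "host1:8983_solr"}}}

# host1 is up, so adding a replica is allowed.
check_add_replica(shard, {"host1:8983_solr", "host2:8983_solr"})

# host1 has been shut down, so the guard refuses the request.
try:
    check_add_replica(shard, {"host2:8983_solr"})
except RuntimeError as err:
    print(err)
```

Failing up front like this means the user never ends up holding a replica that looks healthy but is missing the shard's existing data.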
> A new replica should not become leader when all current replicas are down as
> it leads to data loss
> --------------------------------------------------------------------------------------------------
>
> Key: SOLR-8619
> URL: https://issues.apache.org/jira/browse/SOLR-8619
> Project: Solr
> Issue Type: Bug
> Reporter: Anshum Gupta
>
> Here's what I'm talking about:
> * Start a 2 node solrcloud cluster
> * Create a 1 shard/1 replica collection
> * Add documents
> * Shut down the node that has the only active replica
> * ADDREPLICA for the shard/collection, so Solr would attempt to add a new
> replica on the other node
> * Solr waits for a while before this replica becomes an active leader.
> * Index a few new docs
> * Bring up the old node
> * The replica comes up, with its old index and then syncs to only contain
> the docs from the new leader.
> All old documents are lost in this case.
> Here are a few things that might work here:
> 1. Reject an ADDREPLICA call if all current replicas for the shard are down.
> Considering the new replica can not sync from anyone, it doesn't make sense
> for this replica to even come up
> 2. The replica shouldn't become active/leader unless it was either the last
> known leader or active before it went into the recovering state. The
> exception is when there are no other replicas in the clusterstate.
> This might very well be related to SOLR-8173 but we should add a check to
> ADDREPLICA as well.
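The Collections API calls behind the reproduction steps in the issue description can be sketched roughly as follows. This only builds the request URLs and does not contact a cluster. The base URL, the collection name `test`, and the shard name `shard1` are assumptions for illustration; the `collections_api_url` helper is hypothetical.

```python
from urllib.parse import urlencode


def collections_api_url(base_url: str, action: str, **params: str) -> str:
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


# Assumed address of the node that stays up for the whole experiment.
BASE = "http://localhost:8983/solr"

# Step 2: create a collection with 1 shard and 1 replica.
create_url = collections_api_url(
    BASE, "CREATE", name="test", numShards="1", replicationFactor="1"
)

# Step 5: after shutting down the node hosting the only replica,
# ask Solr to add a new replica for the same shard.
add_replica_url = collections_api_url(
    BASE, "ADDREPLICA", collection="test", shard="shard1"
)

print(create_url)
print(add_replica_url)
```

Indexing documents and stopping or starting the nodes happen outside the Collections API, so they are left out of the sketch.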
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)