[
https://issues.apache.org/jira/browse/SOLR-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877101#comment-14877101
]
Ishan Chattopadhyaya commented on SOLR-7569:
--------------------------------------------
I had the same dilemma while naming this. "Recover" suggests that it will fix
whatever is broken, which can be misleading, since at this time we aren't doing
anything other than helping fix the LIR (leader-initiated recovery) state to
bring the shard back up.
On the other hand, I am not sure about "force leader" either, because we aren't
really forcing a leader, just paving the way for an election to happen. I'm
not sure either way.
How about keeping this as "recover shard", documenting it as an advanced API
that can potentially cause data loss, and then later adding whatever else we
need for recovering the system to this same API?
> Create an API to force a leader election between nodes
> ------------------------------------------------------
>
> Key: SOLR-7569
> URL: https://issues.apache.org/jira/browse/SOLR-7569
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Reporter: Shalin Shekhar Mangar
> Assignee: Shalin Shekhar Mangar
> Labels: difficulty-medium, impact-high
> Attachments: SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch,
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch,
> SOLR-7569.patch, SOLR-7569.patch, SOLR-7569_lir_down_state_test.patch
>
>
> There are many reasons why Solr will not elect a leader for a shard, e.g. all
> replicas' last published state was recovery, or bugs cause a leader to be
> marked as 'down'. While the best solution is for a shard never to get into
> this state, we need a manual way to fix it when it does. Right now we can
> perform a dance that involves bouncing the node (since the recovery paths for
> bouncing and for REQUESTRECOVERY are different), but that is difficult when
> running a large cluster. Such a manual API may lead to some data loss, but in
> some cases it is the only possible option to restore availability.
> This issue proposes to build a new collection API which can be used to force
> replicas into recovering a leader while avoiding data loss on a best effort
> basis.