[ https://issues.apache.org/jira/browse/SOLR-12087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410578#comment-16410578 ]
Lucene/Solr QA commented on SOLR-12087:
---------------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} SOLR-12087 does not apply to master. Rebase required? Wrong Branch? See https://wiki.apache.org/solr/HowToContribute#Creating_the_patch_file for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12087 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12915636/SOLR-12087.patch |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/12/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
> Deleting replicas sometimes fails and causes the replicas to exist in the
> down state
> ------------------------------------------------------------------------------------
>
> Key: SOLR-12087
> URL: https://issues.apache.org/jira/browse/SOLR-12087
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 7.2
> Reporter: Jerry Bao
> Assignee: Cao Manh Dat
> Priority: Critical
> Attachments: SOLR-12087.patch, SOLR-12087.patch, SOLR-12087.patch,
> SOLR-12087.test.patch, Screen Shot 2018-03-16 at 11.50.32 AM.png
>
>
> Sometimes when deleting replicas, the replica fails to be removed from the
> cluster state. This occurs especially when deleting replicas en masse; the
> result is that the data is deleted but the replicas aren't removed from the
> cluster state. Attempting to delete the downed replicas then fails because
> the core no longer exists.
> This also occurs when trying to move replicas, since a move is an add plus a
> delete.
> Some more information regarding this issue: when the MOVEREPLICA command is
> issued, the new replica is created successfully, but the replica to be deleted
> fails to be removed from state.json (the core is deleted, though) and we see
> two log messages spammed:
> # The node containing the leader replica continually (read: every second)
> attempts to initiate recovery on the replica and fails to do so because the
> core does not exist. As a result it continually publishes a down state for
> the replica to ZooKeeper.
> # The deleted replica's node spams that it cannot locate the core because
> it's been deleted.
> During this period of time, we see an increase in ZK network connectivity
> overall, until the replica is finally deleted (by spamming DELETEREPLICA on
> the shard until it's removed from the state).
> My guess is there are two issues at hand here:
> # The leader continually attempts to recover a downed replica that is
> unrecoverable because the core does not exist.
> # The replica to be deleted is having trouble being deleted from state.json
> in ZK.
> This is mostly consistent for my use case. I'm running 7.2.1 with 66 nodes.
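
The workaround mentioned in the description (reissuing DELETEREPLICA until the
replica disappears from the cluster state) could be sketched roughly as below.
This is only an illustration under stated assumptions, not the fix in the
attached patch: the function names, the base URL, and the injected
fetch_status/send callables are hypothetical, and the nested dictionary layout
is assumed to match what the Collections API CLUSTERSTATUS action returns.

```python
import time
from urllib.parse import urlencode


def replica_states(cluster_status, collection):
    """Map (shard, replica) -> state from a CLUSTERSTATUS-shaped dict (assumed layout)."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    return {
        (shard_name, replica_name): replica.get("state")
        for shard_name, shard in shards.items()
        for replica_name, replica in shard["replicas"].items()
    }


def delete_replica_url(base_url, collection, shard, replica):
    """Build a Collections API DELETEREPLICA request URL."""
    params = {
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def delete_until_gone(fetch_status, send, base_url, collection, targets,
                      max_attempts=5, delay_s=1.0):
    """Reissue DELETEREPLICA for each target still listed in the cluster state.

    fetch_status() returns a CLUSTERSTATUS-shaped dict and send(url) performs
    the HTTP request; both are hypothetical hooks supplied by the caller.
    Returns the targets still present once attempts are exhausted.
    """
    for _ in range(max_attempts):
        present = replica_states(fetch_status(), collection)
        remaining = [t for t in targets if t in present]
        if not remaining:
            return []
        for shard, replica in remaining:
            send(delete_replica_url(base_url, collection, shard, replica))
        time.sleep(delay_s)  # give the cluster state time to update before re-checking
    present = replica_states(fetch_status(), collection)
    return [t for t in targets if t in present]
```

Passing the targets explicitly, rather than deleting every replica reported as
down, avoids removing replicas that are down for unrelated reasons.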