Houston Putman created SOLR-17821:
-------------------------------------

             Summary: InstallShardData and Recover do not handle failures gracefully
                 Key: SOLR-17821
                 URL: https://issues.apache.org/jira/browse/SOLR-17821
             Project: Solr
          Issue Type: Bug
          Components: Backup/Restore
            Reporter: Houston Putman


Whenever a ShardInstall or Recover command succeeds, the shard's zk terms are 
only updated to ensure that they are no longer zero. This is handled down in 
the InstallCoreData cmd, so if even one core's recover/install succeeds, the 
zk terms for all replicas are either left untouched (if the terms were 
non-zero to start) or all set to 1. As a result, errors are not handled 
gracefully: replicas that failed are not distinguished from replicas that 
succeeded.
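
To make the problem concrete, here is a minimal sketch of the behavior described 
above. It is an illustrative in-memory model, not Solr code: the class 
CurrentTermsSketch, the method afterAnySuccess, and the replica names are all 
hypothetical, and it assumes a per-replica reading in which any zero term is 
raised to 1 and any non-zero term is left alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the behavior described in this issue; not Solr's API.
final class CurrentTermsSketch {

    // Called once any core install/recover has succeeded. Every zero term is
    // raised to 1 and non-zero terms are left alone, regardless of which
    // replicas actually succeeded (assumption: per-replica treatment).
    static Map<String, Long> afterAnySuccess(Map<String, Long> terms) {
        Map<String, Long> updated = new HashMap<>(terms);
        for (Map.Entry<String, Long> e : updated.entrySet()) {
            if (e.getValue() == 0L) {
                e.setValue(1L);
            }
        }
        return updated;
    }
}
```

In this model, a replica whose install failed ends up with the same term as one 
whose install succeeded, so nothing in the terms marks it as out of date.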

What we actually want to do is increase the terms of the successful replicas, 
and then the non-successful replicas can start to recover from the successful 
ones. If the leader was unsuccessful, it should give up leadership because its 
shard term is no longer the highest.
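
The proposed behavior could look roughly like the sketch below. This is only an 
illustration under stated assumptions, not the actual fix: ShardTermsSketch, 
bumpSuccessful, and canBeLeader are hypothetical names, terms are modeled as a 
plain in-memory map rather than ZooKeeper state, and "increase" is assumed to 
mean "one above the previous maximum term".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of per-replica shard terms; not Solr's API.
final class ShardTermsSketch {

    // Returns a copy of `terms` in which every successful replica is set to
    // one above the previous maximum term (an assumed policy). Replicas that
    // failed keep their old term, so they are strictly behind and can recover
    // from a successful replica.
    static Map<String, Long> bumpSuccessful(Map<String, Long> terms,
                                            Set<String> successful) {
        long max = 0L;
        for (long t : terms.values()) {
            max = Math.max(max, t);
        }
        Map<String, Long> updated = new HashMap<>(terms);
        for (String replica : successful) {
            if (updated.containsKey(replica)) {
                updated.put(replica, max + 1);
            }
        }
        return updated;
    }

    // A replica may hold leadership only if no other replica has a higher term.
    static boolean canBeLeader(Map<String, Long> terms, String replica) {
        Long mine = terms.get(replica);
        if (mine == null) {
            return false;
        }
        for (long t : terms.values()) {
            if (t > mine) {
                return false;
            }
        }
        return true;
    }
}
```

For example, with three replicas all at term 0 where only the two non-leader 
replicas succeed, bumpSuccessful moves those two to term 1 and leaves the 
leader at 0. canBeLeader then returns false for the old leader, which 
corresponds to it giving up leadership.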

Since shardInstall requires the collection to be read-only, we also need to 
fix the issues with recovery while a collection is read-only.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
