[ 
https://issues.apache.org/jira/browse/SOLR-4933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686097#comment-13686097
 ] 

Mark Miller commented on SOLR-4933:
-----------------------------------

On a first pass the 500 error *looks* like it's coming from...

A sub-shard created by the split command has just become a leader, and it 
reports that it has no replicas during the sync phase.

At around the same time, a request to wait for a certain state fails because 
the node that receives it complains that it is not the leader.

{noformat}
oasc.OverseerCollectionProcessor.processResponse ERROR Error from shard: 127.0.0.1:41393/fo/l 
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: We are not the leader
{noformat}

The wait that fails here seems to be:
{noformat}
        // wait for parent leader to acknowledge the sub-shard core
        log.info("Asking parent leader to wait for: " + subShardName + " to be alive on: " + nodeName);
        CoreAdminRequest.WaitForState cmd = new CoreAdminRequest.WaitForState();
        cmd.setCoreName(subShardName);
        cmd.setNodeName(nodeName);
        cmd.setCoreNodeName(nodeName + "_" + subShardName);
        cmd.setState(ZkStateReader.ACTIVE);
        cmd.setCheckLive(true);
        cmd.setOnlyIfLeader(true);
        sendShardRequest(nodeName, new ModifiableSolrParams(cmd.getParams()));
{noformat}

There are a variety of reasons the leader might briefly change. There may be 
more to dig up here, but it also looks like a good idea to retry this request 
when it fails with this type of error.
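
As a rough illustration of the retry idea, here is a minimal, self-contained 
sketch. It is not the actual Solr code or a proposed patch: LeaderRetry, 
NotLeaderException and retryOnNotLeader are hypothetical names, and the 
exception only stands in for the remote "We are not the leader" failure shown 
above. The attempt count and pause are arbitrary assumptions.

```java
import java.util.concurrent.Callable;

public class LeaderRetry {

    /** Hypothetical stand-in for the remote "We are not the leader" error. */
    static class NotLeaderException extends RuntimeException {
        NotLeaderException(String message) {
            super(message);
        }
    }

    /**
     * Runs the request, retrying only when it fails with NotLeaderException.
     * Any other exception propagates immediately. After maxAttempts failed
     * attempts the last NotLeaderException is rethrown.
     */
    static <T> T retryOnNotLeader(Callable<T> request, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NotLeaderException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (NotLeaderException e) {
                last = e;
                // Give leader election a moment to settle before trying again.
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }
}
```

In the split code, the wait-for-state request would be the thing wrapped in the 
Callable. How the remote error would be recognized as a "not the leader" 
failure is left open here.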
                
> org.apache.solr.cloud.ShardSplitTest.testDistribSearch fails often with a 500 
> error.
> ------------------------------------------------------------------------------------
>
>                 Key: SOLR-4933
>                 URL: https://issues.apache.org/jira/browse/SOLR-4933
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Mark Miller
>             Fix For: 5.0, 4.4
>
>


