[ 
https://issues.apache.org/jira/browse/SOLR-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439176#comment-13439176
 ] 

Mark Miller commented on SOLR-3721:
-----------------------------------

connection-loss should not be involved here - the general move for 
connection-loss is to retry - not go into any recoveries - until you hit a 
session-timeout. There is an exception in the leader election agl where you 
don't want to retry, but that is another story).

(which you may know, but for clarification to those following along).
                
> Multiple concurrent recoveries of same shard?
> ---------------------------------------------
>
>                 Key: SOLR-3721
>                 URL: https://issues.apache.org/jira/browse/SOLR-3721
>             Project: Solr
>          Issue Type: Bug
>          Components: multicore, SolrCloud
>    Affects Versions: 4.0
>         Environment: Using our own Solr release based on Apache revision 
> 1355667 from 4.x branch. Our changes to the Solr version is our solutions to 
> TLT-3178 etc., and should have no effect on this issue.
>            Reporter: Per Steffensen
>              Labels: concurrency, multicore, recovery, solrcloud
>             Fix For: 4.0
>
>         Attachments: recovery_in_progress.png, recovery_start_finish.log
>
>
> We run a performance/endurance test on a 7 Solr instance SolrCloud setup and 
> eventually Solrs lose ZK connections and go into recovery. BTW the recovery 
> often does not ever succeed, but we are looking into that. While doing that I 
> noticed that, according to logs, multiple recoveries are in progress at the 
> same time for the same shard. That cannot be intended and I can certainly 
> imagine that it will cause some problems.
> It is just the logs that are wrong, did I make some mistake, or is this a 
> real bug?
> See attached grep from log, grepping only on "Finished recovery" and 
> "Starting recovery" logs.
> {code}
> grep -B 1 "Finished recovery\|Starting recovery" solr9.log solr8.log 
> solr7.log solr6.log solr5.log solr4.log solr3.log solr2.log solr1.log 
> solr0.log > recovery_start_finish.log
> {code}
> It can be hard to get an overview of the log, but I have generated a graph 
> showing (based alone on "Started recovery" and "Finished recovery" logs) how 
> many recoveries are in progress at any time for the different shards. See 
> attached recovery_in_progress.png. The graph is also a little hard to get an 
> overview of (due to the many shards) but it is clear that for several shards 
> there are multiple recoveries going on at the same time, and that several 
> recoveries never succeed.
> Regards, Per Steffensen

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to