[
https://issues.apache.org/jira/browse/SOLR-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-3721:
---------------------------
Assignee: Mark Miller
Assigning to Mark, since it sounds like he is actively working on this.
> Multiple concurrent recoveries of same shard?
> ---------------------------------------------
>
> Key: SOLR-3721
> URL: https://issues.apache.org/jira/browse/SOLR-3721
> Project: Solr
> Issue Type: Bug
> Components: multicore, SolrCloud
> Affects Versions: 4.0
> Environment: Using our own Solr release based on Apache revision
> 1355667 from the 4.x branch. Our changes to the Solr version are our solutions
> to TLT-3178 etc., and should have no effect on this issue.
> Reporter: Per Steffensen
> Assignee: Mark Miller
> Labels: concurrency, multicore, recovery, solrcloud
> Fix For: 4.0
>
> Attachments: recovery_in_progress.png, recovery_start_finish.log
>
>
> We run a performance/endurance test on a 7 Solr instance SolrCloud setup and
> eventually the Solr instances lose their ZK connections and go into recovery.
> BTW, the recovery often never succeeds, but we are looking into that. While doing that I
> noticed that, according to logs, multiple recoveries are in progress at the
> same time for the same shard. That cannot be intended and I can certainly
> imagine that it will cause some problems.
> Is it just the logs that are wrong, did I make some mistake, or is this a
> real bug?
> See the attached grep output, which matches only the "Finished recovery" and
> "Starting recovery" log lines.
> {code}
> grep -B 1 "Finished recovery\|Starting recovery" solr9.log solr8.log
> solr7.log solr6.log solr5.log solr4.log solr3.log solr2.log solr1.log
> solr0.log > recovery_start_finish.log
> {code}
> It can be hard to get an overview of the log, so I have generated a graph
> showing (based solely on the "Starting recovery" and "Finished recovery" log
> lines) how many recoveries are in progress at any time for the different
> shards. See attached recovery_in_progress.png. The graph is also a little hard
> to read (due to the many shards), but it is clear that for several shards
> there are multiple recoveries going on at the same time, and that several
> recoveries never succeed.
> Regards, Per Steffensen
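
The per-shard counting described in the report could be reproduced roughly as follows. This is only a sketch under stated assumptions: it is not the script used to produce recovery_in_progress.png, the function names `classify` and `recovery_overlap` are hypothetical, and because the layout of the Solr log lines is not shown in the issue, the caller is assumed to have already extracted a sortable timestamp and a shard name for each matching line. Only the two message strings come from the grep command above.

```python
from collections import defaultdict


def classify(line):
    """Map a log line to 'start', 'finish', or None.

    Uses only the two messages the grep command matches on.
    """
    if "Starting recovery" in line:
        return "start"
    if "Finished recovery" in line:
        return "finish"
    return None


def recovery_overlap(events):
    """Summarize recoveries per shard.

    events: iterable of (timestamp, shard, kind) tuples, where kind is
    'start' or 'finish'. How timestamp and shard are parsed out of a log
    line is an assumption left to the caller.

    Returns {shard: (peak_concurrent, still_open)}.
    """
    in_progress = defaultdict(int)
    peak = defaultdict(int)
    for _, shard, kind in sorted(events, key=lambda e: e[0]):
        if kind == "start":
            in_progress[shard] += 1
            peak[shard] = max(peak[shard], in_progress[shard])
        elif kind == "finish" and in_progress[shard] > 0:
            in_progress[shard] -= 1
    return {shard: (peak[shard], in_progress[shard]) for shard in peak}
```

A shard with a peak above 1 had overlapping recoveries according to the logs, and a non-zero second value means at least one recovery was started but never logged as finished.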