[
https://issues.apache.org/jira/browse/SOLR-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388541#comment-15388541
]
Erick Erickson commented on SOLR-7280:
--------------------------------------
Well, I'll still test it for fun; if _you're_ worried I should be more so. But
it sounds like you're getting more comfortable with the approach.
Anyway, a couple of things:
bq: If you simply do that and walk away and come back in the morning, it will
work....
These aren't JUnit tests but scripts, because of similar concerns I had. They:
1> bring real Solr JVMs up and down via shell scripts, so the default 3 minute
timeout is in place. That said, making it longer (say 20 minutes?) would
emphasize the issue....and...
2> run a monitor process that records how long it took for all the replicas
to come up, so I can see any anomalies.
3> bring JVMs up and down in different orders to avoid happening to have all
the leaders on the first node that comes up.
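For anyone who wants to reproduce the monitor in 2>, here is a minimal sketch
of the idea, not the actual script. It assumes the cluster state is available
as a dict shaped like the Collections API CLUSTERSTATUS JSON response
(cluster -> collections -> shards -> replicas -> state). The names
count_not_active, time_until_all_active and fetch_status are hypothetical, and
the sketch ignores live_nodes, so a replica on a dead node may still be
reported as "active".

```python
import time
from typing import Callable, Dict, Optional


def count_not_active(cluster_status: Dict) -> int:
    """Count replicas whose recorded state is anything other than 'active'."""
    pending = 0
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for collection in collections.values():
        for shard in collection.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                if replica.get("state") != "active":
                    pending += 1
    return pending


def time_until_all_active(
    fetch_status: Callable[[], Dict],  # hypothetical: returns CLUSTERSTATUS-like dict
    timeout_s: float = 1200.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until every replica is 'active'.

    Returns the elapsed seconds, or None if timeout_s passes first.
    """
    start = clock()
    while True:
        pending = count_not_active(fetch_status())
        elapsed = clock() - start
        if pending == 0:
            return elapsed
        if elapsed >= timeout_s:
            return None
        sleep(poll_s)
```

In a real run, fetch_status would wrap an HTTP call to
/solr/admin/collections?action=CLUSTERSTATUS&wt=json on any live node, and the
returned time would be logged per restart so anomalies stand out. Note that an
empty cluster state counts as "all active" here, so start polling only after
the nodes are back in ZK.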
I've been nervous that the way I'm testing certainly isn't foolproof.
bq: registering in ZK got spun off into its own thread
Hmm, interesting, since the symptom I'm seeing is being unable to spawn native
threads and a large spike in threads during startup (steady-state 1,200
threads, startup started crapping out around 2,600). Upping the Xmx or Xss
(or both) doesn't matter, and bumping the ulimit didn't either....
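To put numbers on a spike like that, the thread count of a Solr JVM can be
sampled during startup. This is a rough sketch that assumes Linux, where
/proc/<pid>/status carries a "Threads:" line. The function names are
hypothetical and this is not how the figures above were collected.

```python
import re
from pathlib import Path
from typing import Optional


def thread_count_from_status(status_text: str) -> Optional[int]:
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    match = re.search(r"^Threads:\s+(\d+)\s*$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def jvm_thread_count(pid: int) -> Optional[int]:
    """Read the current thread count of a running process (Linux only)."""
    return thread_count_from_status(Path(f"/proc/{pid}/status").read_text())
```

Calling jvm_thread_count every second or so while the node starts, and logging
the result with a timestamp, gives a curve that can be compared between
patched and unpatched builds.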
> Load cores in sorted order and tweak coreLoadThread counts to improve cluster
> stability on restarts
> ---------------------------------------------------------------------------------------------------
>
> Key: SOLR-7280
> URL: https://issues.apache.org/jira/browse/SOLR-7280
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Reporter: Shalin Shekhar Mangar
> Assignee: Noble Paul
> Fix For: 6.2, 5.5.3
>
> Attachments: SOLR-7280-5x.patch, SOLR-7280-5x.patch,
> SOLR-7280-5x.patch, SOLR-7280-test.patch, SOLR-7280.patch, SOLR-7280.patch
>
>
> In SOLR-7191, Damien mentioned that by loading solr cores in a sorted order
> and tweaking some of the coreLoadThread counts, he was able to improve the
> stability of a cluster with thousands of collections. We should explore some
> of these changes and fold them into Solr.