[
https://issues.apache.org/jira/browse/SOLR-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ben DeMott updated SOLR-10284:
------------------------------
Affects Version/s: 7.0, 7.1, 7.2
> Solr connection to Standalone node in Ensemble causes cluster failure
> ---------------------------------------------------------------------
>
> Key: SOLR-10284
> URL: https://issues.apache.org/jira/browse/SOLR-10284
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 6.3, 6.4, 7.0, 7.1, 7.2
> Environment: Solrcloud, with Zookeeper <any version>
> Reporter: Ben DeMott
> Priority: Major
>
> I posted this issue on the Dev mailing list and was encouraged to create a
> Jira ticket. This isn't a bug per se.
> Solr connects / reconnects to "Standalone" Zookeeper nodes, within an
> ensemble cluster, which causes absolute havoc.
> I work for Dice.com, as one of the core search developers.
> I'm happy to write a patch, as we'll probably do that internally anyway. I
> just want to get consensus from the community about how to provide the best
> solution.
> My original email describing the issue:
> http://mail-archives.apache.org/mod_mbox/lucene-dev/201703.mbox/raw/%3CCACbtCQ2cSPA8NbnqCbXZE9nZdT40xFHjpUhAOqUnd%3DqZaRMEsA%40mail.gmail.com%3E/2
> Proposed Solution:
> My thought was an explicit setting in solr.in.sh, "ZK_STANDALONE" (which would
> default to TRUE for the solr.in.sh file found next to bin/solr). Upon
> connection or reconnection, the Zookeeper client would ask the server
> "are you standalone?". If the server is standalone and ZK_STANDALONE=false,
> the client would disconnect and try the next host. If all hosts are
> standalone, an error would be shown:
> "No zookeeper hosts available, that aren't in standalone operation - The
> setting ZK_STANDALONE=false prevents connecting to a standalone Zookeeper"
> To urge users to use the setting, I would possibly also log a warning
> if ZK_HOST is set, has multiple hosts in the connection string, and
> ZK_STANDALONE is not false.
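The warning condition described above could be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical, ZK_STANDALONE is the setting proposed in this ticket (not an existing Solr option), and the parsing assumes the usual ZooKeeper connection string form of comma-separated hosts with an optional trailing chroot (for example "zk1:2181,zk2:2181/solr").

```java
import java.util.Arrays;

/** Hypothetical helper: decides whether the proposed startup warning applies. */
final class ZkStandaloneWarning {

    /** Counts host entries in a connection string such as "zk1:2181,zk2:2181/solr". */
    static int hostCount(String zkHost) {
        if (zkHost == null || zkHost.trim().isEmpty()) {
            return 0;
        }
        // Drop the optional chroot suffix (everything from the first '/').
        int slash = zkHost.indexOf('/');
        String hosts = slash >= 0 ? zkHost.substring(0, slash) : zkHost;
        return (int) Arrays.stream(hosts.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .count();
    }

    /**
     * True when several hosts are configured but ZK_STANDALONE (the proposed,
     * not yet existing, setting) has not been explicitly set to false.
     */
    static boolean shouldWarn(String zkHost, String zkStandalone) {
        String value = zkStandalone == null ? "" : zkStandalone.trim();
        boolean standaloneAllowed = !value.equalsIgnoreCase("false");
        return hostCount(zkHost) > 1 && standaloneAllowed;
    }
}
```

With this sketch, a single-host connection string never triggers the warning, whatever ZK_STANDALONE is set to.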
> The only implicit alternative I can think of: if the ZK_HOST connection
> string lists multiple hosts, there should be no scenario in which any node is
> standalone, so Solr could assume that standalone servers are never valid.
> But maybe an explicit setting is preferable.
> This solution should:
> 1.) be backwards compatible
> 2.) have very little performance impact (1 extra call upon connection to ZK)
> 3.) be isolated to one part of the code.
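A minimal sketch of the "are you standalone" check, assuming it is done with ZooKeeper's four-letter command "srvr", whose reply contains a "Mode:" line (standalone, leader, follower, ...). The class and method names are hypothetical and this is not existing Solr or SolrJ code. Newer ZooKeeper versions may require the command to be enabled through 4lw.commands.whitelist, and skipping hosts whose mode cannot be determined is an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper illustrating the proposed check. */
final class ZkModeCheck {

    /** Extracts the value of the "Mode:" line from a srvr reply, e.g. "standalone". */
    static Optional<String> parseMode(String srvrOutput) {
        for (String line : srvrOutput.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Mode:")) {
                return Optional.of(trimmed.substring("Mode:".length()).trim());
            }
        }
        return Optional.empty();
    }

    /** Sends "srvr" to one ZooKeeper server and returns the raw text reply. */
    static String fetchSrvr(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            socket.getOutputStream().write("srvr".getBytes(StandardCharsets.US_ASCII));
            socket.getOutputStream().flush();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[1024];
            int n;
            // The server closes the connection after answering, so read until EOF.
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            return reply.toString("UTF-8");
        }
    }

    /**
     * Returns the first host whose reported mode is known and not "standalone".
     * Throws if every host is standalone or its mode is unknown.
     */
    static String firstEnsembleHost(List<String> hosts,
                                    Function<String, Optional<String>> modeOf) {
        for (String host : hosts) {
            Optional<String> mode = modeOf.apply(host);
            if (mode.isPresent() && !mode.get().equalsIgnoreCase("standalone")) {
                return host;
            }
        }
        throw new IllegalStateException(
            "No zookeeper hosts available, that aren't in standalone operation");
    }
}
```

Passing the mode lookup in as a function keeps the selection logic independent of the network call, so it can be exercised without a running ZooKeeper.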
> *Update 6/26/2017:*
> I started working on this, and it occurred to me the same issue exists for
> *SolrJ* clients. So SolrJ might be the place to make this change. I'm not
> sure yet.
> A SolrJ client with a multi-node ZK connection string that connects (even
> temporarily) to a standalone ZK host will believe there are no Solr
> hosts that can answer the query, and you'll get the following error:
> {{CloudSolrClient - Request to collection efc-profiles-match-col failed due
> to (510) org.apache.solr.common.SolrException: Could not find a healthy node
> to handle the request.}}
> I am not as familiar with the SolrJ codebase, so I'll have to do some
> digging.
> Instead of moving on to a different Zookeeper host, the SolrJ client will
> think everything is fully working, just with no Solr hosts or collections.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)