[ 
https://issues.apache.org/jira/browse/SOLR-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben DeMott updated SOLR-10284:
------------------------------
    Affects Version/s: 7.0
                       7.1
                       7.2

> Solr connection to Standalone node in Ensemble causes cluster failure
> ---------------------------------------------------------------------
>
>                 Key: SOLR-10284
>                 URL: https://issues.apache.org/jira/browse/SOLR-10284
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 6.3, 6.4, 7.0, 7.1, 7.2
>         Environment: Solrcloud, with Zookeeper <any version>
>            Reporter: Ben DeMott
>            Priority: Major
>
> I posted this issue on the Dev mailing list and was encouraged to create a 
> Jira ticket.  This isn't a bug per se.
> Solr connects and reconnects to "standalone" Zookeeper nodes within an 
> ensemble, which causes absolute havoc. 
> I work for Dice.com, as one of the core search developers.
> I'm happy to write a patch, as we'll probably do that internally anyway.  I 
> just want to get consensus from the community about how to provide the best 
> solution.
> My original email describing the issue: 
> http://mail-archives.apache.org/mod_mbox/lucene-dev/201703.mbox/raw/%3CCACbtCQ2cSPA8NbnqCbXZE9nZdT40xFHjpUhAOqUnd%3DqZaRMEsA%40mail.gmail.com%3E/2
> Proposed Solution:
> My thought was an explicit setting in solr.in.sh, "ZK_STANDALONE" (which 
> would default to TRUE in the solr.in.sh file found next to bin/solr).  Upon 
> connection or reconnection, the Zookeeper client would ask the server 
> "are you standalone?".  If the server is standalone and ZK_STANDALONE=false, 
> the client would disconnect and try the next host.  If all hosts are 
> standalone, an error would be shown: "No zookeeper hosts available, that 
> aren't in standalone operation - The setting ZK_STANDALONE=false prevents 
> connecting to a standalone Zookeeper"
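
The connect-time check described above could look roughly like the following Python sketch. Python is used only for brevity; an actual patch would live in Solr's Java Zookeeper client code. The sketch assumes the Zookeeper `srvr` four-letter command is reachable on the client port and that its reply contains a `Mode:` line (for example `standalone`, `leader`, or `follower`). The names `probe_srvr`, `parse_mode`, `select_host`, and `NoEnsembleHostError` are hypothetical and are not existing Solr or Zookeeper APIs.

```python
import socket

# Error text taken from the proposal above.
NO_HOST_MESSAGE = (
    "No zookeeper hosts available, that aren't in standalone operation - "
    "The setting ZK_STANDALONE=false prevents connecting to a standalone Zookeeper"
)


class NoEnsembleHostError(Exception):
    """Hypothetical error: every candidate host was rejected or unreachable."""


def probe_srvr(host_port, timeout=2.0):
    """Send the 'srvr' four-letter command to 'host[:port]' and return the reply text."""
    host, _, port = host_port.partition(":")
    with socket.create_connection((host, int(port or 2181)), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(srvr_reply):
    """Return the lowercased value of the 'Mode:' line, or None if it is absent."""
    for line in srvr_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip().lower()
    return None


def select_host(hosts, zk_standalone, probe=probe_srvr):
    """Return the first acceptable host, skipping standalone servers
    when zk_standalone is False."""
    for host in hosts:
        if zk_standalone:
            return host  # standalone servers are acceptable, so no probe is needed
        try:
            reply = probe(host)
        except OSError:
            continue  # unreachable host: try the next one
        if parse_mode(reply) != "standalone":
            return host
    raise NoEnsembleHostError(NO_HOST_MESSAGE)
```

The `probe` parameter is injectable so the selection logic can be exercised without a running Zookeeper. A reply with no `Mode:` line is treated as acceptable here, which is a design assumption and not something the proposal specifies.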
> To urge users to use the setting, I would possibly also log a warning if 
> ZK_HOSTS is set, has multiple hosts in the connection string, and 
> ZK_STANDALONE is not false.
> I can't think of an implicit way to infer the setting, other than this: if 
> the ZK_HOSTS connection string has multiple hosts, there should be no 
> scenario in which any node is standalone, so you could assume there should be 
> no standalone servers.  But maybe an explicit setting is preferable.
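
The two alternatives above (an explicit setting with a log warning, or inferring the setting from the connection string) can be compared in a small Python sketch. It assumes the connection string is a comma-separated host list with an optional trailing chroot path, and it models an unset ZK_STANDALONE as `None`. The function names `split_hosts`, `should_warn`, and `infer_allow_standalone` are hypothetical.

```python
def split_hosts(zk_hosts):
    """Split 'zk1:2181,zk2:2181/solr' into its hosts, dropping any chroot suffix."""
    hosts_part = zk_hosts.split("/", 1)[0]
    return [h.strip() for h in hosts_part.split(",") if h.strip()]


def should_warn(zk_hosts, zk_standalone):
    """Explicit-setting variant: warn when several hosts are configured
    but ZK_STANDALONE has not been set to false (unset counts as not false)."""
    return len(split_hosts(zk_hosts)) > 1 and zk_standalone is not False


def infer_allow_standalone(zk_hosts, zk_standalone=None):
    """Implicit variant: an explicit value wins; otherwise a multi-host
    connection string is taken to mean standalone servers are not expected."""
    if zk_standalone is not None:
        return zk_standalone
    return len(split_hosts(zk_hosts)) <= 1
```

With the implicit variant, a single-host connection string keeps today's behaviour, while a multi-host string rejects standalone servers unless the user explicitly opts back in.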
> This solution should:
> 1.) be backwards compatible
> 2.) have very little performance impact (1 extra call upon connection to ZK)
> 3.) be isolated to one part of the code.
> *Update 6/26/2017:*
> I started working on this, and it occurred to me the same issue exists for 
> *SolrJ* clients.  So SolrJ might be the place to make this change. I'm not 
> sure yet.
> A SolrJ client with a multi-node ZK connection string that connects (even 
> temporarily) to a standalone ZK host will believe there are no Solr hosts 
> that can answer the query, and you'll get the following error:
> {{CloudSolrClient - Request to collection efc-profiles-match-col failed due 
> to (510) org.apache.solr.common.SolrException: Could not find a healthy node 
> to handle the request.}}
> I am not as familiar with the SolrJ codebase ... so I'll have to do some 
> digging.
> Instead of moving on to a different Zookeeper host, the SolrJ client will 
> think everything is fully working, just with no Solr hosts or collections.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
