Keith Laban created SOLR-8599:
---------------------------------

             Summary: Errors in construction of SolrZooKeeper cause Solr to go into inconsistent state
                 Key: SOLR-8599
                 URL: https://issues.apache.org/jira/browse/SOLR-8599
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
            Reporter: Keith Laban


We originally saw this happen due to a DNS exception (see stack trace below). 
However, any exception thrown in the constructor of SolrZooKeeper, or of its 
parent class ZooKeeper, will cause DefaultConnectionStrategy to fail to update 
the ZooKeeper client. Once the node gets into this state, it will not try to 
connect again until the process is restarted. In the meantime the node still 
responds successfully to query requests, but not to update requests.

Two things should be addressed here:
1) Fix the error handling and issue some number of retries.
2) If we are stuck in a state like this, stop responding to all requests.
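
As a rough illustration of point 1, here is a minimal sketch of bounded retries with backoff around a client factory that may throw during construction. Everything in it is hypothetical: ReconnectRetry, withRetries, RetriesExhaustedException and the 30 second delay cap are made-up names and values, not Solr or ZooKeeper APIs, and the "client" in main is just a string standing in for a real connection object.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not part of Solr or ZooKeeper. */
final class ReconnectRetry {

    /** Thrown once every attempt has failed, so the caller can mark the node unhealthy. */
    static final class RetriesExhaustedException extends Exception {
        RetriesExhaustedException(int attempts, Throwable lastFailure) {
            super("gave up after " + attempts + " attempts", lastFailure);
        }
    }

    /**
     * Calls factory up to maxAttempts times. After each failure except the last,
     * sleeps for a delay that doubles every time (capped at an assumed 30 seconds).
     */
    static <T> T withRetries(Callable<T> factory, int maxAttempts, long initialDelayMs)
            throws RetriesExhaustedException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return factory.call();
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption
            } catch (Exception e) {
                last = e; // e.g. UnknownHostException from a DNS lookup
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay = Math.min(delay * 2, 30_000L);
                }
            }
        }
        throw new RetriesExhaustedException(maxAttempts, last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated factory: fails twice with a DNS-style error, then succeeds.
        String client = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new UnknownHostException("HOSTNAME: unknown error");
            }
            return "connected-client";
        }, 5, 10L);
        System.out.println(client + " after " + calls[0] + " attempts");
    }
}
```

For point 2, a caller could catch the hypothetical RetriesExhaustedException and flip a health flag that request handlers check before serving queries or updates. How that would hook into Solr's request path is not shown here.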

{code}
2016-01-23 13:49:20.222 ERROR ConnectionManager [main-EventThread] - :java.net.UnknownHostException: HOSTNAME: unknown error
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:41)
at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:53)
at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:132)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2016-01-23 13:49:20.222 INFO ConnectionManager [main-EventThread] - Connected:false
2016-01-23 13:49:20.222 INFO ClientCnxn [main-EventThread] - EventThread shut down
{code}



