- We have a Solr 6.1.0 cluster running in production with 1 shard and 5
replicas.
- ZooKeeper quorum on 3 nodes.
- Using a chroot in ZooKeeper to segregate the configs from other
collections.
- Using SolrJ 5.1.0 as our client to query Solr.
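
To make the chroot setup concrete, here is a minimal sketch of how a ZooKeeper connect string with a chroot is usually assembled: comma-separated host:port pairs, with the chroot path appended once at the very end. The host addresses and client port are taken from the zoo.cfg further down (assuming all three nodes use the same clientPort); the chroot name "/solr" and the class and method names are placeholders, not our real values.

```java
import java.util.Arrays;
import java.util.List;

public class ZkHostString {
    // Joins host:port pairs with commas and appends the chroot path once, at the end.
    static String build(List<String> hosts, int clientPort, String chroot) {
        StringBuilder sb = new StringBuilder();
        for (String host : hosts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(host).append(':').append(clientPort);
        }
        return sb.append(chroot).toString();
    }

    public static void main(String[] args) {
        // Hosts from zoo.cfg; "/solr" is a hypothetical chroot used only for illustration.
        List<String> hosts = Arrays.asList("192.168.70.27", "192.168.70.64", "192.168.70.26");
        System.out.println(build(hosts, 2181, "/solr"));
    }
}
```

The resulting string is what would be handed to the SolrJ cloud client as its zkHost argument.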



Things usually work fine, but intermittently we see this exception:
=============================================================
org.apache.solr.common.SolrException: Could not load collection from ZK:sprod
    at org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:815)
    at org.apache.solr.common.cloud.ZkStateReader$5.get(ZkStateReader.java:477)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.getDocCollection(CloudSolrClient.java:1174)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:807)
    at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:782)
--
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/sprod/state.json
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:311)
    at org.apache.solr.common.cloud.SolrZkClient$5.execute(SolrZkClient.java:308)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
    at org.apache.solr.common.cloud.SolrZkClient.exists(SolrZkClient.java:308)
=============================================================





This is our zoo.cfg:
======================================
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.70.27:2888:3888
server.2=192.168.70.64:2889:3889
server.3=192.168.70.26:2889:3889
maxClientCnxns=300
maxSessionTimeout=90000
=======================================
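
As a rough illustration of how these settings interact: a ZooKeeper server clamps the session timeout a client requests to the range [minSessionTimeout, maxSessionTimeout], which default to 2 x tickTime and 20 x tickTime when not set. The sketch below applies that rule to the values in this zoo.cfg. The class and method names are made up for the example, and the 120000 ms request is a hypothetical value, not something from our setup.

```java
public class ZkSessionTimeout {
    // Returns the session timeout the server would grant for a requested value.
    // A null min/max means "not set in zoo.cfg", so the tickTime-based default applies.
    static int negotiate(int requestedMs, int tickTimeMs, Integer minMs, Integer maxMs) {
        int min = (minMs != null) ? minMs : 2 * tickTimeMs;
        int max = (maxMs != null) ? maxMs : 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        int tickTime = 2000;   // tickTime from zoo.cfg
        Integer min = null;    // minSessionTimeout is not set, so 2 * tickTime = 4000
        Integer max = 90000;   // maxSessionTimeout from zoo.cfg

        // 30000 is the zkClientTimeout default in the solr.xml below.
        System.out.println(negotiate(30000, tickTime, min, max));
        // Hypothetical larger request, which gets capped at maxSessionTimeout.
        System.out.println(negotiate(120000, tickTime, min, max));
    }
}
```

With these values the allowed range is 4000 to 90000 ms, so a 30000 ms request is granted unchanged and anything above 90000 ms is capped.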





This is our solr.xml on the server side:
=======================================

<solr>

  <solrcloud>

    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>

    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>

    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
    <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
    <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
    <str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
    <str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>

  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory"
    class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:600000}</int>
    <int name="connTimeout">${connTimeout:60000}</int>
  </shardHandlerFactory>
</solr>

=======================================




Any help appreciated.

Regards,
Piyush
