Re: SolrCloud: CloudSolrServer Zookeeper disconnects and re-connects with heavy memory usage consumption.

Luis Cappa Banda Wed, 12 Dec 2012 10:03:05 -0800

I've read the following in the SolrCloud FAQ:

*"Q:* I'm seeing lot's of session timeout exceptions - what to do?

*A:* Try raising the ZooKeeper <http://wiki.apache.org/solr/ZooKeeper>
session timeout by editing solr.xml - see the zkClientTimeout attribute. The
minimum session timeout is 2 times your ZooKeeper
<http://wiki.apache.org/solr/ZooKeeper> defined tickTime. The maximum is 20
times the tickTime. The default tickTime is 2 seconds. You should avoiding
raising this for no good reason, but it should be high enough that you don't
see a lot of false session timeouts due to load, network lag, or garbage
collection pauses. Some environments might need to go as high as 30-60
seconds."


Any suggestions or recommendations? What about increasing tickTime to 10
seconds while keeping zkClientTimeout = 30 seconds?
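
For reference, here is a minimal Java sketch of how I understand those bounds. It assumes the ZooKeeper server keeps its default limits (minimum 2 x tickTime, maximum 20 x tickTime) and clamps whatever timeout the client requests into that range. The class and method names are hypothetical and only illustrate the arithmetic; they are not part of Solr or ZooKeeper.

```java
/** Sketch of ZooKeeper's default session-timeout bounds (2x and 20x tickTime). */
final class SessionTimeoutBounds {
    private SessionTimeoutBounds() {}

    /** Smallest session timeout granted by default: 2 * tickTime. */
    static int minTimeoutMs(int tickTimeMs) {
        return 2 * tickTimeMs;
    }

    /** Largest session timeout granted by default: 20 * tickTime. */
    static int maxTimeoutMs(int tickTimeMs) {
        return 20 * tickTimeMs;
    }

    /** Clamps the requested timeout into [min, max] (assumed server behavior). */
    static int effectiveTimeoutMs(int requestedMs, int tickTimeMs) {
        int lower = minTimeoutMs(tickTimeMs);
        int upper = maxTimeoutMs(tickTimeMs);
        return Math.max(lower, Math.min(requestedMs, upper));
    }
}
```

Under that assumption, effectiveTimeoutMs(30000, 2000) and effectiveTimeoutMs(30000, 10000) both give 30000. Raising tickTime to 10 seconds would then only move the floor to 20 seconds and the ceiling to 200 seconds, while a 60000 ms request against the default 2-second tickTime would be cut to 40000.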


2012/12/12 Luis Cappa Banda <luisca...@gmail.com>

> Hello everyone.
>
> I have developed a standalone web app with a custom API that dispatches
> queries to SolrCloud using the CloudSolrServer implementation. I'm
> testing with a single ZooKeeper instance installed on an Amazon instance.
> The Solr servers are deployed on two Amazon instances, and one more
> instance hosts the custom search API engine mentioned above. I'm using
> *30000 ms* for both the ZooKeeper *zkConnectTimeout* and *zkClientTimeout*.
>
>
> With that scenario, everything works fine with CloudSolrServer, but I
> frequently see log traces like the following:
>
>
> *2012-12-12 17:35:41,932 30486688
> [http-bio-8080-exec-7-SendThread(amazon-dns:9000)] INFO
>  org.apache.zookeeper.ClientCnxn  - Client session timed out, have not
> heard from server in 67044ms for sessionid 0x13b8a4218720055, closing
> socket connection and attempting reconnect*
>
> *2012-12-12 17:35:41,996 30486752
> [http-bio-8080-exec-8-SendThread(amazon-dns:9000)] INFO
>  org.apache.zookeeper.ClientCnxn  - Client session timed out, have not
> heard from server in 67301ms for sessionid 0x13b8a4218720052, closing
> socket connection and attempting reconnect*
>
> *2012-12-12 17:35:42,077 30522458
> [pool-1-thread-1-SendThread(amazon-dns:9000)] INFO
>  org.apache.zookeeper.ClientCnxn  - Client session timed out, have not
> heard from server in 67299ms for sessionid 0x13b8a4218720053, closing
> socket connection and attempting reconnect*
>
> *2012-12-12 17:35:42,286 30487042 [http-bio-8080-exec-7-EventThread] INFO
>  org.apache.solr.common.cloud.ConnectionManager  - Watcher
> org.apache.solr.common.cloud.ConnectionManager@20c5f562
> name:ZooKeeperConnection Watcher:amazon-dns:9000 got event WatchedEvent
> state:Disconnected type:None path:null path:null type:None*
>
>
>
> The message is clear: nothing has been heard from the server in 67
> seconds. That is strange, because the ZooKeeper Amazon instance is up and
> the ZooKeeper service is running. A connection problem would also be
> extremely strange, because communication between Amazon instances is
> assumed to be always available.
>
> After that, I start seeing log traces such as:
>
>
> *2012-12-12 17:37:15,501 30580257 [http-bio-8080-exec-7-EventThread] INFO
>  org.apache.solr.common.cloud.ZkStateReader  - Updating cluster state from
> ZooKeeper...*
>
> *2012-12-12 17:37:15,510 30615891 [pool-1-thread-2-EventThread] INFO
>  org.apache.solr.common.cloud.ConnectionManager  - Waiting for client to
> connect to ZooKeeper*
>
> *2012-12-12 17:37:15,512 30580268 [http-bio-8080-exec-7-EventThread] INFO
>  org.apache.solr.common.cloud.DefaultConnectionStrategy  - Reconnected to
> ZooKeeper*
>
> *2012-12-12 17:37:15,541 30580297 [http-bio-8080-exec-7-EventThread] INFO
>  org.apache.solr.common.cloud.ConnectionManager  - Connected:true*
>
> *2012-12-12 17:37:15,541 30580297 [http-bio-8080-exec-7-EventThread] INFO
>  org.apache.zookeeper.ClientCnxn  - EventThread shut down*
>
>
>
> But the *big problem* is that when this kind of
> disconnect-reconnect-disconnect-reconnect behavior happens, the web app
> seems to block (it looks like the CloudSolrServer ZooKeeper status update
> is blocking) while I continue receiving search queries. The result is that
> memory usage keeps growing and the search engine web app module becomes
> almost unresponsive. It seems that this kind of ZooKeeper status update is
> both memory-hungry and blocking.
>
> Has anyone experienced behavior like that? Any tips or suggestions?
>
>
> Thank you very much in advance for your help.
>
> Regards,
>
> --
>
> - Luis Cappa
>
>
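
To compare the silent gaps reported in the traces above with the configured 30000 ms timeout, something like the following Java sketch could be run over the log. It assumes each log entry sits on one unwrapped line, as in the original log file rather than in this email. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: extract the silent gap reported by ClientCnxn session-timeout lines. */
final class SessionTimeoutLogScan {
    // Matches the "have not heard from server in <N>ms" fragment of a log line.
    private static final Pattern GAP =
            Pattern.compile("have not heard from server in (\\d+)ms");

    private SessionTimeoutLogScan() {}

    /** Returns every reported gap in milliseconds, in order of appearance. */
    static List<Long> gapsMs(String logText) {
        List<Long> gaps = new ArrayList<>();
        Matcher m = GAP.matcher(logText);
        while (m.find()) {
            gaps.add(Long.parseLong(m.group(1)));
        }
        return gaps;
    }

    /** Counts the reported gaps that exceed the given timeout. */
    static long countExceeding(String logText, long timeoutMs) {
        return gapsMs(logText).stream().filter(g -> g > timeoutMs).count();
    }
}
```

For the three ClientCnxn entries above, gapsMs would return 67044, 67301 and 67299, all of which exceed 30000.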


-- 

- Luis Cappa
