[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918063#action_12918063
 ] 

Patrick Hunt commented on ZOOKEEPER-885:
----------------------------------------

A question on the user list this morning jogged my memory: have you tuned 
"maxClientCnxns" in your server configuration?

See this configuration param in the docs "maxClientCnxns":
http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_advancedConfiguration

If all your clients connect from a single IP, and this server config is not 
changed from the default (10), then each server will accept at most 10 
connections from that IP. With your three servers you will only be able to 
maintain 30 sessions, which is short of the 45 clients in your test.
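
To make the arithmetic concrete, here is a minimal sketch in Python. It assumes 
every client shares one IP and uses the figures from this issue (3 servers, the 
default limit of 10, 45 clients). The function name is made up for illustration 
and is not part of ZooKeeper.

```python
def max_sessions_from_one_ip(num_servers: int, max_client_cnxns: int) -> int:
    """Upper bound on concurrent sessions when every client shares one IP."""
    # Assumption: each server enforces maxClientCnxns independently, per source IP.
    return num_servers * max_client_cnxns


servers = 3    # ensemble size from the issue description
limit = 10     # default maxClientCnxns mentioned in this comment
clients = 45   # clients in the reporter's test

capacity = max_sessions_from_one_ip(servers, limit)
rejected = max(0, clients - capacity)
print(capacity, rejected)
```

With these numbers the ensemble can hold 30 sessions from that IP, so 15 of the 
45 clients would be unable to keep a connection.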

You can check for something like the following in your server log files:

Too many connections from /########## - max is ######

(### replaced with ipaddr and max setting respectively)
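
If the logs are large, a small script can do the search. The following is only a 
sketch: the pattern is taken from the message quoted above, the exact wording 
may differ in your server version, and the sample lines are invented for 
illustration rather than copied from a real log.

```python
import re
from collections import Counter

# Pattern based on the message quoted in this comment (an assumption about
# the exact log format of your server version).
REJECT_RE = re.compile(r"Too many connections from /(\S+) - max is (\d+)")


def count_rejections(lines):
    """Count rejected connections, keyed by (ip address, configured max)."""
    hits = Counter()
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            hits[(match.group(1), int(match.group(2)))] += 1
    return hits


# Made-up sample lines, for illustration only.
sample = [
    "WARN Too many connections from /10.0.0.5 - max is 10",
    "INFO some unrelated message",
    "WARN Too many connections from /10.0.0.5 - max is 10",
]
print(count_rejections(sample))
```

To scan a real log, pass an open file object instead of the sample list.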


Also, your comments included the latency figures "(0/10/91)" from the server, 
which means that the max session response time to the client you are seeing is 
91 milliseconds. This rules out the server being responsible for causing the 
clients to expire. If the max were above your timeout (10sec) then it might be 
possible, but the server is having no problems replying to the client in this 
case, even under the load you are applying.
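
The same check can be written down as a short sketch. It assumes the three 
numbers are min/avg/max in milliseconds, which matches the reading of 91 as the 
max above. Both function names are hypothetical.

```python
def parse_latency(triple: str):
    """Parse a '(min/avg/max)' latency string into three ints (milliseconds)."""
    lo, avg, hi = (int(part) for part in triple.strip("()").split("/"))
    return lo, avg, hi


def server_could_cause_expiry(triple: str, session_timeout_ms: int) -> bool:
    """True only if the worst observed response time exceeds the session timeout."""
    _, _, worst = parse_latency(triple)
    return worst > session_timeout_ms


# Figures from this issue: latency "(0/10/91)" and a 10 second timeout.
print(server_could_cause_expiry("(0/10/91)", 10_000))
```

For the reported figures this prints False, since 91 ms is far below the 
10 second timeout.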

Can you attach one or more of your server logs that show the problem (i.e. 
covering the time when the clients are getting expired)? That will help to 
track this down.


> Zookeeper drops connections under moderate IO load
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-885
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.2
>         Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>            Reporter: Alexandre Hardy
>            Priority: Critical
>         Attachments: WatcherTest.java
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
