[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Mohammad updated ZOOKEEPER-2570:
---------------------------------------
    Description: 
ZooKeeper clients time out when ZooKeeper servers are very busy. Clients 
throw the exception below and fail all pending operations:
{code}
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
{code}
Clients log the following information:
{noformat}
2016-09-22 01:49:08,001 [myid:127.0.0.1:11228] - WARN  
[main-SendThread(127.0.0.1:11228):ClientCnxn$SendThread@1181] - Client session 
timed out, have not heard from server in 13908ms for sessionid 0x20000d21b280000
2016-09-22 01:49:08,001 [myid:127.0.0.1:11228] - INFO  
[main-SendThread(127.0.0.1:11228):ClientCnxn$SendThread@1229] - Client session 
timed out, have not heard from server in 13908ms for sessionid 
0x20000d21b280000, closing socket connection and attempting reconnect
{noformat}
*STEPS TO REPRODUCE:*
# Create a multi operation
{code}
List<Op> ops = new ArrayList<Op>();
for (int i = 0; i < N; i++) {
    Op create = Op.create(rootNode + "/" + i, "".getBytes(),
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    ops.add(create);
}
{code}
Choose N such that the total size of the multi operation request is less than, 
but close to, 1 MB. For a larger request size, increase jute.maxbuffer on the servers
# Submit the multi operation request
{code} zooKeeper.multi(ops);{code} 
# After repeating the above steps a few times, the issue is reproduced
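
As a rough aid for picking N in the first step, the sketch below estimates the serialized size of the multi request and returns the largest N that stays under a given limit. It is only an approximation: the per-op and per-request overhead constants are assumptions based on a reading of the jute record layout, not values stated in this issue, and the class name, method names and the {{/multi-test}} root path are hypothetical. The limit used is the default jute.maxbuffer (0xfffff bytes).

```java
import java.nio.charset.StandardCharsets;

public class MultiSizeEstimate {
    // Default jute.maxbuffer: 0xfffff bytes, just under 1 MB.
    static final int DEFAULT_MAX_BUFFER = 0xfffff;
    // ASSUMPTION: approximate fixed bytes per create op (op header, length
    // prefixes, one world:anyone ACL entry, create flags).
    static final int ASSUMED_PER_OP_OVERHEAD = 48;
    // ASSUMPTION: approximate fixed bytes per request (length prefix,
    // request header, terminating multi header), rounded up.
    static final int ASSUMED_REQUEST_OVERHEAD = 32;

    // Estimated bytes contributed by the i-th create op under rootNode.
    static long opBytes(String rootNode, int i, int dataLength) {
        String path = rootNode + "/" + i;
        return ASSUMED_PER_OP_OVERHEAD
                + path.getBytes(StandardCharsets.UTF_8).length
                + dataLength;
    }

    // Estimated total request size for n create ops.
    static long estimateBytes(String rootNode, int n, int dataLength) {
        long total = ASSUMED_REQUEST_OVERHEAD;
        for (int i = 0; i < n; i++) {
            total += opBytes(rootNode, i, dataLength);
        }
        return total;
    }

    // Largest n whose estimated request size does not exceed limit.
    static int largestN(String rootNode, int dataLength, int limit) {
        long total = ASSUMED_REQUEST_OVERHEAD;
        int n = 0;
        while (total + opBytes(rootNode, n, dataLength) <= limit) {
            total += opBytes(rootNode, n, dataLength);
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String rootNode = "/multi-test"; // hypothetical root path
        int n = largestN(rootNode, 0, DEFAULT_MAX_BUFFER);
        System.out.println("N ~ " + n + ", estimated bytes = "
                + estimateBytes(rootNode, n, 0));
    }
}
```

Because the constants are estimates, it is probably safer to start slightly below the returned N and increase it until the server rejects the request.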


  was:
The ZooKeeper server expires the client session when the server is continuously 
under high load. The steps below reproduce the issue:
# Create a multi operation
{code}
List<Op> ops = new ArrayList<Op>();
for (int i = 0; i < N; i++) {
    Op create = Op.create(rootNode + "/" + i, "".getBytes(),
            ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    ops.add(create);
}
{code}
Choose N such that the total size of the multi operation request is less than, 
but close to, 1 MB. For a larger request size, increase jute.maxbuffer on the servers
# Submit the multi operation request
{code} zooKeeper.multi(ops);{code} 
# After repeating the above steps a few times, the client throws 
{{ConnectionLossException}} and the server logs "Expiring session 
0x100b0ff5ecc0003, timeout of xxxxms exceeded"

Normally the server expires a session when it has not received a ping from the 
client for longer than the client's session timeout. But in this case the client 
is continuously performing operations with the server, so the server should not 
expire the session.
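
The expectation in the paragraph above can be illustrated with a deliberately simplified model. This is not ZooKeeper's actual session tracking code; the class and method names are hypothetical, and the model only assumes that any ping or request from a client refreshes the session's last-heard time.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionExpiryModel {
    // Last time (ms) the server heard from each session.
    private final Map<Long, Long> lastHeardMs = new HashMap<>();
    private final long timeoutMs;

    SessionExpiryModel(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called for any ping or request received from the session.
    void touch(long sessionId, long nowMs) {
        lastHeardMs.put(sessionId, nowMs);
    }

    // A session is expired if it is unknown or silent for longer than the timeout.
    boolean isExpired(long sessionId, long nowMs) {
        Long last = lastHeardMs.get(sessionId);
        return last == null || nowMs - last > timeoutMs;
    }

    public static void main(String[] args) {
        SessionExpiryModel model = new SessionExpiryModel(20_000);
        long session = 0x100b0ff5ecc0003L;
        model.touch(session, 0);       // client connects
        model.touch(session, 15_000);  // client submits a request
        // Only 15 s have passed since the last request, so the session is still alive.
        System.out.println(model.isExpired(session, 30_000));
        // 21 s of silence exceeds the 20 s timeout.
        System.out.println(model.isExpired(session, 36_000));
    }
}
```

Under this model a client that keeps submitting requests is never expired, which is the behaviour the description argues for.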



> ZooKeeper clients are timed out when ZooKeeper servers are very busy
> --------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2570
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2570
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Arshad Mohammad
>            Assignee: Arshad Mohammad
>            Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
