[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093167#comment-13093167
 ] 

Eric Hwang commented on ZOOKEEPER-706:
--------------------------------------

I am seeing this issue coming up quite a bit right now. Some clients are 
getting continually disconnected/reconnected with that error message from what 
I assume is the large number of watches. For example:

2011-08-29 13:26:43,750 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32866
2011-08-29 13:26:43,775 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client 
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32866
2011-08-29 13:26:43,775 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established 
session 0x1319819fcd1000b with negotiated timeout 10000 for client 
/10.141.241.188:32866
2011-08-29 13:26:43,775 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception 
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len 
error 1150247
2011-08-29 13:26:43,775 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket 
connection for client /10.141.241.188:32866 which had sessionid 
0x1319819fcd1000b
2011-08-29 13:26:47,276 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32871
2011-08-29 13:26:47,298 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client 
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32871
2011-08-29 13:26:47,298 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established 
session 0x1319819fcd1000b with negotiated timeout 10000 for client 
/10.141.241.188:32871
2011-08-29 13:26:47,298 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception 
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len 
error 1150247
2011-08-29 13:26:47,300 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket 
connection for client /10.141.241.188:32871 which had sessionid 
0x1319819fcd1000b
2011-08-29 13:26:51,124 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32879
2011-08-29 13:26:51,142 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client 
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32879
2011-08-29 13:26:51,143 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established 
session 0x1319819fcd1000b with negotiated timeout 10000 for client 
/10.141.241.188:32879
2011-08-29 13:26:51,143 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception 
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len 
error 1150247
2011-08-29 13:26:51,143 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket 
connection for client /10.141.241.188:32879 which had sessionid 
0x1319819fcd1000b
2011-08-29 13:26:53,985 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32885
2011-08-29 13:26:54,006 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client 
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32885
2011-08-29 13:26:54,007 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established 
session 0x1319819fcd1000b with negotiated timeout 10000 for client 
/10.141.241.188:32885
2011-08-29 13:26:54,007 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception 
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len 
error 1150247
2011-08-29 13:26:54,007 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket 
connection for client /10.141.241.188:32885 which had sessionid 
0x1319819fcd1000b
2011-08-29 13:26:57,495 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32892
2011-08-29 13:26:57,513 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client 
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32892
2011-08-29 13:26:57,513 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established 
session 0x1319819fcd1000b with negotiated timeout 10000 for client 
/10.141.241.188:32892
2011-08-29 13:26:57,514 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception 
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len 
error 1150247
2011-08-29 13:26:57,514 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket 
connection for client /10.141.241.188:32892 which had sessionid 
0x1319819fcd1000b
2011-08-29 13:26:59,937 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32897
2011-08-29 13:26:59,958 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client 
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32897
2011-08-29 13:26:59,958 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established 
session 0x1319819fcd1000b with negotiated timeout 10000 for client 
/10.141.241.188:32897
2011-08-29 13:26:59,958 - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception 
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len 
error 1150247
2011-08-29 13:26:59,958 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket 
connection for client /10.141.241.188:32897 which had sessionid 
0x1319819fcd1000b
2011-08-29 13:27:03,711 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /10.141.241.188:32904

It would be great to get this fixed in an upcoming release since it is 
impacting us quite a bit.


> large numbers of watches can cause session re-establishment to fail
> -------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-706
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-706
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client, java client
>    Affects Versions: 3.1.2, 3.2.2, 3.3.0
>            Reporter: Patrick Hunt
>            Priority: Critical
>             Fix For: 3.5.0
>
>
> If a client sets a large number of watches the "set watches" operation during 
> session re-establishment can fail.
> for example:
>  WARN  [NIOServerCxn.Factory:22801:NIOServerCnxn@417] - Exception causing 
> close of session 0xe727001201a4ee7c due to java.io.IOException: Len error 
> 4348380
> in this case the client was a web monitoring app and had set both data and 
> child watches on > 32k znodes.
> there are two issues I see here we need to fix:
> 1) handle this case properly (split up the set watches into multiple calls I 
> guess...)
> 2) the session should have expired after the "timeout". however we seem to 
> consider any message from the client as re-setting the expiration on the 
> server side. Probably we should only consider messages from the client that 
> are sent during an established session, otherwise we can see this situation 
> where the session is not established however the session is not expired 
> either. Perhaps we should create another JIRA for this particular issue.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to