[
https://issues.apache.org/jira/browse/ZOOKEEPER-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093167#comment-13093167
]
Eric Hwang commented on ZOOKEEPER-706:
--------------------------------------
I am seeing this issue coming up quite a bit right now. Some clients are
getting continually disconnected/reconnected with that error message from what
I assume is the large number of watches. For example:
2011-08-29 13:26:43,750 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32866
2011-08-29 13:26:43,775 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32866
2011-08-29 13:26:43,775 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
session 0x1319819fcd1000b with negotiated timeout 10000 for client
/10.141.241.188:32866
2011-08-29 13:26:43,775 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len
error 1150247
2011-08-29 13:26:43,775 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket
connection for client /10.141.241.188:32866 which had sessionid
0x1319819fcd1000b
2011-08-29 13:26:47,276 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32871
2011-08-29 13:26:47,298 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32871
2011-08-29 13:26:47,298 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
session 0x1319819fcd1000b with negotiated timeout 10000 for client
/10.141.241.188:32871
2011-08-29 13:26:47,298 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len
error 1150247
2011-08-29 13:26:47,300 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket
connection for client /10.141.241.188:32871 which had sessionid
0x1319819fcd1000b
2011-08-29 13:26:51,124 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32879
2011-08-29 13:26:51,142 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32879
2011-08-29 13:26:51,143 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
session 0x1319819fcd1000b with negotiated timeout 10000 for client
/10.141.241.188:32879
2011-08-29 13:26:51,143 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len
error 1150247
2011-08-29 13:26:51,143 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket
connection for client /10.141.241.188:32879 which had sessionid
0x1319819fcd1000b
2011-08-29 13:26:53,985 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32885
2011-08-29 13:26:54,006 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32885
2011-08-29 13:26:54,007 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
session 0x1319819fcd1000b with negotiated timeout 10000 for client
/10.141.241.188:32885
2011-08-29 13:26:54,007 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len
error 1150247
2011-08-29 13:26:54,007 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket
connection for client /10.141.241.188:32885 which had sessionid
0x1319819fcd1000b
2011-08-29 13:26:57,495 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32892
2011-08-29 13:26:57,513 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32892
2011-08-29 13:26:57,513 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
session 0x1319819fcd1000b with negotiated timeout 10000 for client
/10.141.241.188:32892
2011-08-29 13:26:57,514 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len
error 1150247
2011-08-29 13:26:57,514 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket
connection for client /10.141.241.188:32892 which had sessionid
0x1319819fcd1000b
2011-08-29 13:26:59,937 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32897
2011-08-29 13:26:59,958 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@770] - Client
attempting to renew session 0x1319819fcd1000b at /10.141.241.188:32897
2011-08-29 13:26:59,958 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580] - Established
session 0x1319819fcd1000b with negotiated timeout 10000 for client
/10.141.241.188:32897
2011-08-29 13:26:59,958 - WARN
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639] - Exception
causing close of session 0x1319819fcd1000b due to java.io.IOException: Len
error 1150247
2011-08-29 13:26:59,958 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket
connection for client /10.141.241.188:32897 which had sessionid
0x1319819fcd1000b
2011-08-29 13:27:03,711 - INFO
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] -
Accepted socket connection from /10.141.241.188:32904
It would be great to get this fixed in an upcoming release since it is
impacting us quite a bit.
> large numbers of watches can cause session re-establishment to fail
> -------------------------------------------------------------------
>
> Key: ZOOKEEPER-706
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-706
> Project: ZooKeeper
> Issue Type: Bug
> Components: c client, java client
> Affects Versions: 3.1.2, 3.2.2, 3.3.0
> Reporter: Patrick Hunt
> Priority: Critical
> Fix For: 3.5.0
>
>
> If a client sets a large number of watches the "set watches" operation during
> session re-establishment can fail.
> for example:
> WARN [NIOServerCxn.Factory:22801:NIOServerCnxn@417] - Exception causing
> close of session 0xe727001201a4ee7c due to java.io.IOException: Len error
> 4348380
> in this case the client was a web monitoring app and had set both data and
> child watches on > 32k znodes.
> there are two issues I see here we need to fix:
> 1) handle this case properly (split up the set watches into multiple calls I
> guess...)
> 2) the session should have expired after the "timeout". however we seem to
> consider any message from the client as re-setting the expiration on the
> server side. Probably we should only consider messages from the client that
> are sent during an established session, otherwise we can see this situation
> where the session is not established however the session is not expired
> either. Perhaps we should create another JIRA for this particular issue.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira