[jira] Commented: (ZOOKEEPER-542) c-client can spin when server unresponsive

2009-10-06 Thread Christian Wiedmann (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762713#action_12762713
 ] 

Christian Wiedmann commented on ZOOKEEPER-542:
--

I don't know of a good way to write an automated test for this, since the 
spinning is not visible outside of the API.  The manual test I used is to 
kill -STOP the server and then wait until the client tries to reconnect, while 
running strace on the I/O thread (I'm using the Python bindings, btw).  
Pre-patch, strace shows repeated calls to poll with POLLOUT set on the server 
fd.  Post-patch, POLLOUT is not set and there is no spinning.
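
The poll behaviour seen in strace can be reproduced without ZooKeeper at all. The following is only a sketch under stated assumptions: a local socketpair stands in for the client/server connection, and the helper names count_immediate_wakeups() and spin_demo() are made up for this illustration. It shows that poll() returns immediately on every call when POLLOUT is requested on a writable socket, and waits out the timeout when it is not.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: call poll() max_iters times on one fd and count how
 * often it returns with an event instead of timing out.
 * Returns -1 if poll() itself fails. */
static int count_immediate_wakeups(int fd, short events, int timeout_ms,
                                   int max_iters)
{
    int wakeups = 0;

    assert(max_iters > 0);
    for (int i = 0; i < max_iters; i++) {
        struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc < 0)
            return -1;
        if (rc > 0)      /* an event was ready, so no waiting happened */
            wakeups++;
    }
    return wakeups;
}

/* Hypothetical demo: a socketpair stands in for the connection to a stopped
 * server. The peer never sends anything, so POLLIN never fires, but the
 * socket stays writable, so POLLOUT fires on every call. */
static int spin_demo(int with_pollout)
{
    int sv[2];
    short events = POLLIN;
    int n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (with_pollout)
        events |= POLLOUT;

    n = count_immediate_wakeups(sv[0], events, 10 /* ms */, 5);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling spin_demo(1) should report a wakeup on every iteration, which corresponds to the pre-patch strace output, while spin_demo(0) should report none because each poll() call times out.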

 c-client can spin when server unresponsive
 --

 Key: ZOOKEEPER-542
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.2.0, 3.2.1
Reporter: Christian Wiedmann
Assignee: Christian Wiedmann
 Fix For: 3.3.0

 Attachments: ZOOKEEPER-542.patch, ZOOKEEPER-542.patch


 Due to a mismatch between zookeeper_interest() and zookeeper_process(), when 
 the zookeeper server is unresponsive the client can spin when reconnecting to 
 the server.
 In particular, zookeeper_interest() adds ZOOKEEPER_WRITE whenever there is 
 data to be sent, but flush_send_queue() only writes the data if the state is 
 ZOO_CONNECTED_STATE.  When in ZOO_ASSOCIATING_STATE, this results in spinning.
 This probably doesn't affect production, but I had a runaway process in a 
 development deployment that caused performance issues on the node.  This is 
 easy to reproduce in a single node environment by doing a kill -STOP on the 
 server and waiting for the session timeout.
 Patch to be added.
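
The mismatch described above can be illustrated with a small stand-alone model. This is a sketch only, not the client source and not the attached patch: the model_* names, the struct, and the guarded variant are all hypothetical stand-ins for zookeeper_interest(), flush_send_queue(), ZOOKEEPER_WRITE and the two connection states named in the report.

```c
#include <assert.h>

/* Hypothetical stand-ins for the states and event flags named in the report.
 * The values are arbitrary and not taken from zookeeper.h. */
enum model_state { MODEL_ASSOCIATING, MODEL_CONNECTED };
enum { MODEL_READ = 1, MODEL_WRITE = 2 };

struct model_client {
    enum model_state state;
    int queued;              /* buffers waiting in the send queue */
};

/* Behaviour as described in the report: ask for write interest whenever
 * there is data to send, regardless of state. */
static int interest_unguarded(const struct model_client *c)
{
    return MODEL_READ | (c->queued > 0 ? MODEL_WRITE : 0);
}

/* One possible way to remove the mismatch (an assumption, not the patch):
 * only ask for write interest when the flush would actually run. */
static int interest_guarded(const struct model_client *c)
{
    int ev = MODEL_READ;
    if (c->queued > 0 && c->state == MODEL_CONNECTED)
        ev |= MODEL_WRITE;
    return ev;
}

/* Mirrors the described flush behaviour: data is only written when the
 * client is connected. Returns the number of buffers sent. */
static int flush_model(struct model_client *c)
{
    int sent;
    if (c->state != MODEL_CONNECTED)
        return 0;
    sent = c->queued;
    c->queued = 0;
    return sent;
}

/* Count event-loop iterations that wake up for write but send nothing.
 * A writable socket makes poll() return at once, so each such iteration
 * is a busy spin. The client is passed by value so callers keep their copy. */
static int count_wasted_wakeups(struct model_client c,
                                int (*interest)(const struct model_client *),
                                int iters)
{
    int wasted = 0;

    assert(iters >= 0);
    for (int i = 0; i < iters; i++) {
        if ((interest(&c) & MODEL_WRITE) && flush_model(&c) == 0)
            wasted++;
    }
    return wasted;
}
```

With a client in the associating state and a non-empty queue, the unguarded interest function wastes every iteration, while the guarded one wastes none. A connected client makes progress on the first iteration in either case.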

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (ZOOKEEPER-542) c-client can spin when server unresponsive

2009-10-05 Thread Christian Wiedmann (JIRA)
c-client can spin when server unresponsive
--

 Key: ZOOKEEPER-542
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.2.0
Reporter: Christian Wiedmann


Due to a mismatch between zookeeper_interest() and zookeeper_process(), when 
the zookeeper server is unresponsive the client can spin when reconnecting to 
the server.

In particular, zookeeper_interest() adds ZOOKEEPER_WRITE whenever there is data 
to be sent, but flush_send_queue() only writes the data if the state is 
ZOO_CONNECTED_STATE.  When in ZOO_ASSOCIATING_STATE, this results in spinning.

This probably doesn't affect production, but I had a runaway process in a 
development deployment that caused performance issues on the node.  This is 
easy to reproduce in a single node environment by doing a kill -STOP on the 
server and waiting for the session timeout.

Patch to be added.




[jira] Updated: (ZOOKEEPER-542) c-client can spin when server unresponsive

2009-10-05 Thread Christian Wiedmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Wiedmann updated ZOOKEEPER-542:
-

Attachment: ZOOKEEPER-542.patch

 c-client can spin when server unresponsive
 --

 Key: ZOOKEEPER-542
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-542
 Project: Zookeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.2.0
Reporter: Christian Wiedmann
 Attachments: ZOOKEEPER-542.patch


