[jira] Commented: (ZOOKEEPER-344) doIO in NioServerCnxn: Exception causing close of session : cause is read error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683447#action_12683447 ]

bryan thompson commented on ZOOKEEPER-344:

Patrick,

I did not try to coordinate the client and server logs but rather drew representative samples from each. As far as I can tell it is more of the same in both logs. However, I will correlate the events and the zxids and see if I can get that debug trace you suggested.

-bryan

doIO in NioServerCnxn: Exception causing close of session : cause is read error

Key: ZOOKEEPER-344
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-344
Project: Zookeeper
Issue Type: Bug
Components: java client, server
Affects Versions: 3.1.0
Environment: jdk1.6.0_07
    Linux blade2 2.6.27.7-134.fc10.x86_64 #1 SMP Mon Dec 1 22:21:35 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
Reporter: bryan thompson
Fix For: 3.2.0

I have been having a problem with zookeeper 3.0.1 and now with 3.1.0 where I see a lot of expired sessions. I am using a 16 node cluster which is all on the same local network. There is a single zookeeper instance (these are benchmarking runs). The problem appears to be correlated with either run time or system load.

Personally I think that it is system load because I have session expired events under a Windows platform running zookeeper and the application (i.e., everything is local) when the application load suddenly spikes. To me this suggests that the client is not able to renew (ping) the zookeeper service in a timely manner and is expired. But the log messages below with the read error suggest that maybe there is something else going on?

Zookeeper Configuration

#Wed Mar 18 12:41:05 GMT-05:00 2009
clientPort=2181
dataDir=/var/bigdata/benchmark/zookeeper/1
syncLimit=2
dataLogDir=/var/bigdata/benchmark/zookeeper/1
tickTime=2000

Some representative log messages are below.

Client side messages (from our app)

ERROR [main-EventThread] com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. New state: Expired : zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1160/locknode
ERROR [main-EventThread] com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. New state: Expired : zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1356/locknode

Server side messages:

WARN [NIOServerCxn.Factory:2181] org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 2009-03-18 13:06:57,252 - Exception causing close of session 0x1201aac14300022 due to java.io.IOException: Read error
WARN [NIOServerCxn.Factory:2181] org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 2009-03-18 13:06:58,198 - Exception causing close of session 0x1201aac143f due to java.io.IOException: Read error

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
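The tickTime=2000 setting in the reported configuration bears on the "client cannot ping in time" theory. The following is a minimal sketch, assuming the rule described in the ZooKeeper programmer's guide that a server bounds each session timeout to between 2 and 20 ticks; that rule is an assumption here and should be checked against the version in use. The report does not say which timeout the clients request, so the requested values below are hypothetical, as are all function names.

```python
# Sketch only: the 2x/20x tick bounds are an assumption taken from the
# ZooKeeper documentation, not from this report. All names are hypothetical.

CONFIG_TEXT = """\
#Wed Mar 18 12:41:05 GMT-05:00 2009
clientPort=2181
dataDir=/var/bigdata/benchmark/zookeeper/1
syncLimit=2
dataLogDir=/var/bigdata/benchmark/zookeeper/1
tickTime=2000
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def session_timeout_bounds_ms(tick_time_ms, min_ticks=2, max_ticks=20):
    """Return the (lowest, highest) session timeout under the assumed rule."""
    return tick_time_ms * min_ticks, tick_time_ms * max_ticks


def negotiated_timeout_ms(requested_ms, tick_time_ms):
    """Clamp a client's requested timeout into the assumed server bounds."""
    low, high = session_timeout_bounds_ms(tick_time_ms)
    return max(low, min(requested_ms, high))


if __name__ == "__main__":
    tick = int(parse_properties(CONFIG_TEXT)["tickTime"])
    print("bounds (ms):", session_timeout_bounds_ms(tick))
    for requested in (1000, 10000, 60000):  # hypothetical client requests
        print(requested, "->", negotiated_timeout_ms(requested, tick))
```

Under these assumptions, a client that stalls for longer than its negotiated timeout (for example during a load spike or a long garbage-collection pause) would be expired by the server regardless of network health.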
[jira] Commented: (ZOOKEEPER-341) regression in QuorumPeerMain, tickTime from config is lost, cannot start quorum
[ https://issues.apache.org/jira/browse/ZOOKEEPER-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683485#action_12683485 ]

Hudson commented on ZOOKEEPER-341:

Integrated in ZooKeeper-trunk #258 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/258/])
regression in QuorumPeerMain, tickTime from config is lost, cannot start quorum (phunt via mahadev)

regression in QuorumPeerMain, tickTime from config is lost, cannot start quorum

Key: ZOOKEEPER-341
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-341
Project: Zookeeper
Issue Type: Bug
Components: quorum, server
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Blocker
Fix For: 3.1.1, 3.2.0
Attachments: ZOOKEEPER-341.patch

ZOOKEEPER 330/336 caused a regression in QuorumPeerMain -- cannot reliably start a cluster due to missing tickTime.
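A regression of this kind, where a parsed config value never reaches the object that starts the peer, is typically pinned down by a check that the value survives the hand-off. The sketch below illustrates that idea only; it is not the ZOOKEEPER-341 patch, and `PeerSettings`, `build_peer_settings`, and the list of required keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Illustrative list of keys; an assumption, not ZooKeeper's own validation.
REQUIRED_KEYS = ("tickTime", "initLimit", "syncLimit")


@dataclass(frozen=True)
class PeerSettings:
    """Hypothetical stand-in for the values a quorum peer needs at startup."""
    tick_time_ms: int
    init_limit: int
    sync_limit: int


def build_peer_settings(props):
    """Copy parsed config values into the startup object, failing fast on gaps."""
    missing = [key for key in REQUIRED_KEYS if key not in props]
    if missing:
        raise ValueError("missing required config keys: " + ", ".join(missing))
    return PeerSettings(
        tick_time_ms=int(props["tickTime"]),
        init_limit=int(props["initLimit"]),
        sync_limit=int(props["syncLimit"]),
    )


if __name__ == "__main__":
    settings = build_peer_settings(
        {"tickTime": "2000", "initLimit": "5", "syncLimit": "2"}
    )
    print(settings)
```

Failing at startup with a message that names the missing key is easier to diagnose than a peer that starts with a zero or default tick and then cannot form a quorum.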
[jira] Updated: (ZOOKEEPER-316) configure option --without-cppunit does not work
[ https://issues.apache.org/jira/browse/ZOOKEEPER-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-316:

Component/s: tests
             c client

configure option --without-cppunit does not work

Key: ZOOKEEPER-316
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-316
Project: Zookeeper
Issue Type: Bug
Components: c client, tests
Affects Versions: 3.1.0
Reporter: Mahadev konar

configure option --without-cppunit does not work.
[jira] Updated: (ZOOKEEPER-314) add wiki docs for bookeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-314:

Component/s: contrib-bookkeeper

add wiki docs for bookeeper.

Key: ZOOKEEPER-314
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-314
Project: Zookeeper
Issue Type: Improvement
Components: contrib-bookkeeper
Affects Versions: 3.1.0
Reporter: Mahadev konar
Fix For: 3.2.0

We should have a wiki page for BookKeeper so that users can take a cursory look at what it is.
[jira] Updated: (ZOOKEEPER-338) zk hosts should be resolved periodically for loadbalancing amongst zk servers.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-338:

Component/s: c client

Updated as c client component - is this an issue for either the java server/client?

zk hosts should be resolved periodically for loadbalancing amongst zk servers.

Key: ZOOKEEPER-338
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-338
Project: Zookeeper
Issue Type: New Feature
Components: c client
Affects Versions: 3.0.0, 3.0.1, 3.1.0
Reporter: Mahadev konar

The list of host names passed to the ZK init method is resolved only once. If a corresponding DNS entry is changed later, the ZK library does not pick up the change, which effectively prevents proper load balancing.
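The behaviour requested in ZOOKEEPER-338 can be sketched as a host list that re-resolves its names once a refresh interval has elapsed. This is an illustration in Python rather than the C client's code; `RefreshingHostList`, `resolve_all`, and the 60-second default interval are assumptions made for the example, and the resolver and clock are injectable so the logic can be exercised without real DNS.

```python
import socket
import time


def resolve_all(host, port):
    """Return the sorted, de-duplicated addresses the name maps to right now."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


class RefreshingHostList:
    """Hypothetical helper: caches resolved addresses and refreshes them on a TTL."""

    def __init__(self, hosts, ttl_seconds=60.0, resolver=resolve_all,
                 clock=time.monotonic):
        self._hosts = list(hosts)      # [(hostname, port), ...]
        self._ttl = ttl_seconds
        self._resolver = resolver
        self._clock = clock
        self._cached = None
        self._resolved_at = None

    def addresses(self):
        """Return (ip, port) pairs, re-resolving every name if the cache is stale."""
        now = self._clock()
        if self._cached is None or now - self._resolved_at >= self._ttl:
            self._cached = [
                (ip, port)
                for host, port in self._hosts
                for ip in self._resolver(host, port)
            ]
            self._resolved_at = now
        return list(self._cached)


if __name__ == "__main__":
    hosts = RefreshingHostList([("localhost", 2181)], ttl_seconds=30.0)
    print(hosts.addresses())
```

A client built this way would call `addresses()` each time it picks a server to connect to, so a changed DNS entry takes effect within one refresh interval instead of never.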
[jira] Updated: (ZOOKEEPER-315) add forrest docs for bookkeeper.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-315:

Component/s: contrib-bookkeeper

add forrest docs for bookkeeper.

Key: ZOOKEEPER-315
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-315
Project: Zookeeper
Issue Type: Improvement
Components: contrib-bookkeeper
Affects Versions: 3.1.0
Reporter: Mahadev konar
Fix For: 3.2.0

we should have forrest docs for bookkeeper for
- how to install bookkeeper
- usage model
- programming examples for users
- FAQ
[jira] Commented: (ZOOKEEPER-344) doIO in NioServerCnxn: Exception causing close of session : cause is read error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683596#action_12683596 ]

Patrick Hunt commented on ZOOKEEPER-344:

That's fine. I guess what I mean is that it would be interesting to see the debug logs for both the server and client at the time that the issue(s) start. We might get more insight into what is happening if we can do that.

doIO in NioServerCnxn: Exception causing close of session : cause is read error

Key: ZOOKEEPER-344
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-344
Project: Zookeeper
Issue Type: Bug
Components: java client, server
Affects Versions: 3.1.0
Environment: jdk1.6.0_07
    Linux blade2 2.6.27.7-134.fc10.x86_64 #1 SMP Mon Dec 1 22:21:35 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
Reporter: bryan thompson
Fix For: 3.2.0
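Lining up server and client logs around the moment the problem starts can be partly automated. The sketch below is an assumption-laden illustration: it parses only the server-side "Exception causing close of session" lines in the format quoted in the report, groups them by session id, and lists the closes that fall within a time window of a client-side event. The client lines in the report carry no session id, so time is the only join key used here. Function names are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# The two server-side WARN lines quoted in the report.
SERVER_LOG = """\
WARN [NIOServerCxn.Factory:2181] org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 2009-03-18 13:06:57,252 - Exception causing close of session 0x1201aac14300022 due to java.io.IOException: Read error
WARN [NIOServerCxn.Factory:2181] org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 2009-03-18 13:06:58,198 - Exception causing close of session 0x1201aac143f due to java.io.IOException: Read error
"""

CLOSE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"Exception causing close of session (?P<sid>0x[0-9a-fA-F]+) "
    r"due to (?P<cause>.+)"
)


def parse_ts(text):
    """Parse a log4j-style timestamp such as '2009-03-18 13:06:57,252'."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def closes_by_session(log_text):
    """Map session id -> list of (timestamp, cause) for each close event."""
    closes = defaultdict(list)
    for match in CLOSE_RE.finditer(log_text):
        closes[match.group("sid")].append(
            (parse_ts(match.group("ts")), match.group("cause").strip())
        )
    return dict(closes)


def closes_near(closes, when, window_seconds):
    """Return (timestamp, session id, cause) for closes within the window of `when`."""
    window = timedelta(seconds=window_seconds)
    return sorted(
        (ts, sid, cause)
        for sid, events in closes.items()
        for ts, cause in events
        if abs(ts - when) <= window
    )


if __name__ == "__main__":
    closes = closes_by_session(SERVER_LOG)
    client_expiry = parse_ts("2009-03-18 13:35:40,335")  # from the client log
    print(closes_near(closes, client_expiry, window_seconds=60))
```

Run against the quoted samples, the 60-second window around the client-side expiry at 13:35:40 matches neither server-side close (both are at 13:06), which is consistent with the samples having been drawn independently rather than from the same incident.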
Re: [VOTE] Release ZooKeeper 3.1.1 (candidate 1)
+1, compiles and tests pass for me.

-Flavio

On Mar 19, 2009, at 10:51 PM, Patrick Hunt wrote:

+1, c/java tests all pass and I also specifically verified that the regression seen in rc0 has been addressed.

Patrick

Patrick Hunt wrote:

I've created a new candidate (rc1) that fixes a regression found during review:
https://issues.apache.org/jira/browse/ZOOKEEPER-341

The release notes were also updated to reflect this change. Otherwise there are no other changes.

*** Please download, test and VOTE before the
*** vote closes EOD on Monday March 23.***

http://people.apache.org/~phunt/zookeeper-3.1.1-candidate-1/

Should we release this?

Patrick