[jira] Commented: (ZOOKEEPER-344) doIO in NioServerCnxn: Exception causing close of session : cause is read error

2009-03-19 Thread bryan thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683447#action_12683447
 ] 

bryan thompson commented on ZOOKEEPER-344:
--

Patrick, I did not try to coordinate the client and server logs but rather drew 
representative samples from each.  As far as I can tell it is more of the same 
in both logs.  However, I will correlate the events and the zxids and see if I 
can get that debug trace you suggested. -bryan

 doIO in NioServerCnxn: Exception causing close of session : cause is read 
 error
 -

 Key: ZOOKEEPER-344
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-344
 Project: Zookeeper
  Issue Type: Bug
  Components: java client, server
Affects Versions: 3.1.0
 Environment: jdk1.6.0_07
 Linux blade2 2.6.27.7-134.fc10.x86_64 #1 SMP Mon Dec 1 22:21:35 EST 2008 
 x86_64 x86_64 x86_64 GNU/Linux
Reporter: bryan thompson
 Fix For: 3.2.0


 I have been having a problem with zookeeper 3.0.1 and now with 3.1.0 where I 
 see a lot of expired sessions.  I am using a 16 node cluster which is all on 
 the same local network.  There is a single zookeeper instance (these are 
 benchmarking runs).
 The problem appears to be correlated with either run time or system load.
 Personally I think that it is system load because I have session 
 expired events under a Windows platform running zookeeper and the application 
 (i.e., everything is local) when the application load suddenly spikes.  To me 
 this suggests that the client is not able to renew (ping) the zookeeper 
 service in a timely manner and is expired.  But the log messages below with 
 the read error suggest that maybe there is something else going on?
 Zookeeper Configuration
 #Wed Mar 18 12:41:05 GMT-05:00 2009
 clientPort=2181
 dataDir=/var/bigdata/benchmark/zookeeper/1
 syncLimit=2
 dataLogDir=/var/bigdata/benchmark/zookeeper/1
 tickTime=2000
 Some representative log messages are below.
 Client side messages (from our app)
 ERROR [main-EventThread] 
 com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 
 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. 
 New state: Expired : 
 zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1160/locknode
 ERROR [main-EventThread] 
 com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) 
 2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state change. 
 New state: Expired : 
 zpath=/benchmark/jobs/com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/client1356/locknode
 Server side messages:
  WARN [NIOServerCxn.Factory:2181] 
 org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 
 2009-03-18 13:06:57,252 - Exception causing close of session 
 0x1201aac14300022 due to java.io.IOException: Read error
  WARN [NIOServerCxn.Factory:2181] 
 org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) 
 2009-03-18 13:06:58,198 - Exception causing close of session 
 0x1201aac143f due to java.io.IOException: Read error
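
To relate the configuration above to session expiry, here is a minimal sketch in Python (not the reporter's code) that reads the posted properties and derives the range of session timeouts the server would accept. It assumes the bounds described in the ZooKeeper programmer's guide, a minimum of 2 times tickTime and a maximum of 20 times tickTime, apply to the version in use. The function names are hypothetical and not part of any ZooKeeper API.

```python
# Sketch only: derive the accepted session-timeout range from tickTime.
# Assumption: the server clamps a requested timeout to [2, 20] * tickTime,
# as described in the ZooKeeper programmer's guide.

CONFIG_TEXT = """\
#Wed Mar 18 12:41:05 GMT-05:00 2009
clientPort=2181
dataDir=/var/bigdata/benchmark/zookeeper/1
syncLimit=2
dataLogDir=/var/bigdata/benchmark/zookeeper/1
tickTime=2000
"""


def parse_properties(text):
    """Parse key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def session_timeout_bounds_ms(tick_time_ms):
    """Return the (minimum, maximum) session timeout in milliseconds."""
    return 2 * tick_time_ms, 20 * tick_time_ms


def negotiated_timeout_ms(requested_ms, tick_time_ms):
    """Clamp the timeout a client asks for into the accepted range."""
    low, high = session_timeout_bounds_ms(tick_time_ms)
    return max(low, min(requested_ms, high))


props = parse_properties(CONFIG_TEXT)
tick = int(props["tickTime"])
print("accepted range (ms):", session_timeout_bounds_ms(tick))
print("client asking for 1000 ms gets:", negotiated_timeout_ms(1000, tick))
```

With tickTime=2000, a client that requests a very short timeout would still get at least 4 seconds, so a client that is stalled for longer than its negotiated timeout under load would be a candidate explanation for the expirations. The report does not state what timeout the clients requested.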

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-341) regression in QuorumPeerMain, tickTime from config is lost, cannot start quorum

2009-03-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683485#action_12683485
 ] 

Hudson commented on ZOOKEEPER-341:
--

Integrated in ZooKeeper-trunk #258 (See 
[http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/258/])
regression in QuorumPeerMain, tickTime from config is lost, cannot start 
quorum (phunt via mahadev)


 regression in QuorumPeerMain, tickTime from config is lost, cannot start 
 quorum
 ---

 Key: ZOOKEEPER-341
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-341
 Project: Zookeeper
  Issue Type: Bug
  Components: quorum, server
Reporter: Patrick Hunt
Assignee: Patrick Hunt
Priority: Blocker
 Fix For: 3.1.1, 3.2.0

 Attachments: ZOOKEEPER-341.patch


 ZOOKEEPER 330/336 caused a regression in QuorumPeerMain -- cannot reliably 
 start a cluster due to missing tickTime.




[jira] Updated: (ZOOKEEPER-316) configure option --without-cppunit does not work

2009-03-19 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-316:
---

Component/s: tests
 c client

 configure option --without-cppunit does not work
 

 Key: ZOOKEEPER-316
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-316
 Project: Zookeeper
  Issue Type: Bug
  Components: c client, tests
Affects Versions: 3.1.0
Reporter: Mahadev konar

 configure option --without-cppunit does not work.




[jira] Updated: (ZOOKEEPER-314) add wiki docs for bookeeper.

2009-03-19 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-314:
---

Component/s: contrib-bookkeeper

 add wiki docs for bookeeper.
 

 Key: ZOOKEEPER-314
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-314
 Project: Zookeeper
  Issue Type: Improvement
  Components: contrib-bookkeeper
Affects Versions: 3.1.0
Reporter: Mahadev konar
 Fix For: 3.2.0


 We should have a wiki page for bookkeeper so that users can take a cursory 
 look at what it is.




[jira] Updated: (ZOOKEEPER-338) zk hosts should be resolved periodically for loadbalancing amongst zk servers.

2009-03-19 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-338:
---

Component/s: c client

Updated as c client component - is this also an issue for the java 
server or client?

 zk hosts should be resolved periodically for loadbalancing amongst zk servers.
 --

 Key: ZOOKEEPER-338
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-338
 Project: Zookeeper
  Issue Type: New Feature
  Components: c client
Affects Versions: 3.0.0, 3.0.1, 3.1.0
Reporter: Mahadev konar

 The list of host names passed to the ZK init method is resolved only once. If 
 a corresponding DNS entry is changed, it is not refreshed by the ZK library, 
 which effectively prevents proper load balancing.
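
The requested behaviour can be illustrated with a minimal Python sketch. It is not the c client's implementation: the class name, the refresh interval, and the injectable resolver and clock are all hypothetical, chosen so the idea can be exercised without touching real DNS. The only library call relied on is the standard `socket.getaddrinfo`.

```python
import socket
import time


class HostListResolver:
    """Hypothetical helper: re-resolves a host list once it is stale."""

    def __init__(self, host_ports, refresh_seconds=60.0,
                 resolve=None, clock=time.monotonic):
        self._host_ports = list(host_ports)
        self._refresh_seconds = refresh_seconds
        self._resolve = resolve or self._dns_lookup
        self._clock = clock
        self._addresses = []
        self._resolved_at = None

    @staticmethod
    def _dns_lookup(host, port):
        # Real lookup; returns the distinct IP addresses for the host.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return sorted({info[4][0] for info in infos})

    def addresses(self):
        """Return (ip, port) pairs, resolving again after the interval."""
        now = self._clock()
        stale = (self._resolved_at is None
                 or now - self._resolved_at >= self._refresh_seconds)
        if stale:
            fresh = []
            for host, port in self._host_ports:
                for ip in self._resolve(host, port):
                    fresh.append((ip, port))
            self._addresses = fresh
            self._resolved_at = now
        return list(self._addresses)


# Demonstration with a fake DNS table and a fake clock (no network access).
fake_dns = {"zk.example": ["10.0.0.1"]}
fake_now = [0.0]
resolver = HostListResolver(
    [("zk.example", 2181)],
    refresh_seconds=30.0,
    resolve=lambda host, port: fake_dns[host],
    clock=lambda: fake_now[0],
)
first = resolver.addresses()
fake_dns["zk.example"] = ["10.0.0.2"]   # the DNS entry changes
cached = resolver.addresses()           # still inside the refresh interval
fake_now[0] = 31.0
refreshed = resolver.addresses()        # interval elapsed, resolved again
print(first, cached, refreshed)
```

Resolving lazily on access keeps the sketch free of background threads; a real client would also need to decide what to do with connections to addresses that have dropped out of the list.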




[jira] Updated: (ZOOKEEPER-315) add forrest docs for bookkeeper.

2009-03-19 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-315:
---

Component/s: contrib-bookkeeper

 add forrest docs for bookkeeper.
 

 Key: ZOOKEEPER-315
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-315
 Project: Zookeeper
  Issue Type: Improvement
  Components: contrib-bookkeeper
Affects Versions: 3.1.0
Reporter: Mahadev konar
 Fix For: 3.2.0


 We should have forrest docs for bookkeeper covering:
 - how to install bookkeeper
 - usage model
 - programming examples for users
 - FAQ




[jira] Commented: (ZOOKEEPER-344) doIO in NioServerCnxn: Exception causing close of session : cause is read error

2009-03-19 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683596#action_12683596
 ] 

Patrick Hunt commented on ZOOKEEPER-344:


That's fine. I guess what I mean is that it would be interesting to see the 
debug logs for both the server and client at the time that the issue(s) start. 
We might get more insight into what is happening if we can do that.
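
One way to line up the two logs is to merge them into a single timeline. The Python sketch below is only an illustration: it assumes each log record has been joined onto one line (the records quoted in this issue are wrapped by the mailer), that both machines' clocks are comparable, and that the "timestamp - message" layout seen in the report holds for all records. The function names are hypothetical, and the sample lines are taken from the issue report.

```python
import re
from datetime import datetime

# Matches the "2009-03-18 13:06:57,252 - message" part of a log record.
RECORD = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*)")
# Session ids appear in the server warnings as "session 0x...".
SESSION_ID = re.compile(r"session (0x[0-9a-fA-F]+)")


def parse_event(source, line):
    """Return (timestamp, source, session_id_or_None, message) or None."""
    match = RECORD.search(line)
    if match is None:
        return None
    when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    message = match.group(2)
    sid = SESSION_ID.search(message)
    return (when, source, sid.group(1) if sid else None, message)


def merge_timeline(client_lines, server_lines):
    """Parse both logs and return all events ordered by timestamp."""
    events = []
    for source, lines in (("client", client_lines), ("server", server_lines)):
        for line in lines:
            event = parse_event(source, line)
            if event is not None:
                events.append(event)
    return sorted(events, key=lambda event: event[0])


SERVER_LINES = [
    "WARN [NIOServerCxn.Factory:2181] "
    "org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) "
    "2009-03-18 13:06:57,252 - Exception causing close of session "
    "0x1201aac14300022 due to java.io.IOException: Read error",
    "WARN [NIOServerCxn.Factory:2181] "
    "org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:417) "
    "2009-03-18 13:06:58,198 - Exception causing close of session "
    "0x1201aac143f due to java.io.IOException: Read error",
]
CLIENT_LINES = [
    "ERROR [main-EventThread] "
    "com.bigdata.zookeeper.ZLockImpl$ZLockWatcher.process(ZLockImpl.java:400) "
    "2009-03-18 13:35:40,335 - Session expired: WatchedEvent: Server state "
    "change. New state: Expired : zpath=/benchmark/jobs/"
    "com.bigdata.service.jini.benchmark.ThroughputMaster/test_1/"
    "client1160/locknode",
]

timeline = merge_timeline(CLIENT_LINES, SERVER_LINES)
for when, source, session_id, message in timeline:
    print(when.isoformat(), source, session_id or "-", message[:40])
```

The client-side messages in the report do not carry a session id, so matching a client expiry to a specific server-side close would need extra information, such as the session id logged by the client at debug level.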






Re: [VOTE] Release ZooKeeper 3.1.1 (candidate 1)

2009-03-19 Thread Flavio Junqueira

+1, compiles and tests pass for me.

-Flavio

On Mar 19, 2009, at 10:51 PM, Patrick Hunt wrote:

+1, c/java tests all pass and I also specifically verified that the  
regression seen in rc0 has been addressed.


Patrick

Patrick Hunt wrote:
I've created a new candidate (rc1) that fixes a regression found  
during review:

https://issues.apache.org/jira/browse/ZOOKEEPER-341
The release notes were also updated to reflect this change.
Otherwise there are no other changes.
*** Please download, test and VOTE before the
*** vote closes EOD on Monday March 23.***
http://people.apache.org/~phunt/zookeeper-3.1.1-candidate-1/
Should we release this?
Patrick