ZooKeeper-trunk-solaris - Build # 751 - Still Failing
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/751/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 209234 lines...] [junit] 2013-12-05 09:00:38,546 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@219] - accept thread exitted run method [junit] 2013-12-05 09:00:38,547 [myid:] - INFO [main:ZooKeeperServer@428] - shutting down [junit] 2013-12-05 09:00:38,547 [myid:] - INFO [main:SessionTrackerImpl@183] - Shutting down [junit] 2013-12-05 09:00:38,547 [myid:] - INFO [main:PrepRequestProcessor@972] - Shutting down [junit] 2013-12-05 09:00:38,547 [myid:] - INFO [main:SyncRequestProcessor@190] - Shutting down [junit] 2013-12-05 09:00:38,547 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@156] - PrepRequestProcessor exited loop! [junit] 2013-12-05 09:00:38,548 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@168] - SyncRequestProcessor exited! [junit] 2013-12-05 09:00:38,548 [myid:] - INFO [main:FinalRequestProcessor@442] - shutdown of request processor complete [junit] 2013-12-05 09:00:38,548 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-05 09:00:38,549 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[] [junit] 2013-12-05 09:00:38,550 [myid:] - INFO [main:ClientBase@414] - STARTING server [junit] 2013-12-05 09:00:38,550 [myid:] - INFO [main:ZooKeeperServer@149] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test8506972050192409857.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test8506972050192409857.junit.dir/version-2 [junit] 2013-12-05 09:00:38,551 [myid:] - INFO [main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 16 
worker threads, and 64 kB direct buffers. [junit] 2013-12-05 09:00:38,551 [myid:] - INFO [main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221 [junit] 2013-12-05 09:00:38,552 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test8506972050192409857.junit.dir/version-2/snapshot.b [junit] 2013-12-05 09:00:38,554 [myid:] - INFO [main:FileTxnSnapLog@297] - Snapshotting: 0xb to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test8506972050192409857.junit.dir/version-2/snapshot.b [junit] 2013-12-05 09:00:38,556 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-05 09:00:38,556 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:41400 [junit] 2013-12-05 09:00:38,557 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@828] - Processing stat command from /127.0.0.1:41400 [junit] 2013-12-05 09:00:38,557 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn$StatCommand@677] - Stat command output [junit] 2013-12-05 09:00:38,558 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@999] - Closed socket connection for client /127.0.0.1:41400 (no session established for client) [junit] 2013-12-05 09:00:38,558 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port] [junit] 2013-12-05 09:00:38,560 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree [junit] 2013-12-05 09:00:38,560 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree [junit] 2013-12-05 09:00:38,560 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port [junit] 2013-12-05 09:00:38,560 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1 [junit] 2013-12-05 09:00:38,560 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota [junit] 2013-12-05 09:00:38,561 [myid:] - INFO [main:ClientBase@451] - tearDown starting [junit] 2013-12-05 09:00:38,634 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down [junit] 2013-12-05 09:00:38,634 [myid:] - INFO [main:ZooKeeper@777] - Session: 0x142c1fcc864 closed [junit] 2013-12-05 09:00:38,635 [myid:] - INFO [main:ClientBase@421] - STOPPING server [junit] 2013-12-05 09:00:38,635 [myid:] - INFO
[jira] [Created] (ZOOKEEPER-1831) Document remove watches details to the guide
Rakesh R created ZOOKEEPER-1831:

Summary: Document remove watches details to the guide
Key: ZOOKEEPER-1831
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1831
Project: ZooKeeper
Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R

This JIRA is for documenting the details of removing the watches.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (ZOOKEEPER-1829) Umbrella jira for removing watches that are no longer of interest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated ZOOKEEPER-1829:

Fix Version/s: 3.5.0

Umbrella jira for removing watches that are no longer of interest

Key: ZOOKEEPER-1829
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1829
Project: ZooKeeper
Issue Type: New Feature
Components: java client, server
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Critical
Fix For: 3.5.0

--
This message was sent by Atlassian JIRA (v6.1#6144)
ZooKeeper-3.4-WinVS2008_java - Build # 367 - Still Failing
See https://builds.apache.org/job/ZooKeeper-3.4-WinVS2008_java/367/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 186942 lines...] [junit] 2013-12-05 10:01:54,757 [myid:] - INFO [main:FinalRequestProcessor@415] - shutdown of request processor complete [junit] 2013-12-05 10:01:54,758 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-05 10:01:55,035 [myid:] - INFO [main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@968] - Opening socket connection to server 127.0.0.1/127.0.0.1:11221. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration) [junit] 2013-12-05 10:01:55,756 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[] [junit] 2013-12-05 10:01:55,757 [myid:] - INFO [main:ClientBase@414] - STARTING server [junit] 2013-12-05 10:01:55,758 [myid:] - INFO [main:ZooKeeperServer@162] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir f:\hudson\hudson-slave\workspace\ZooKeeper-3.4-WinVS2008_java\branch-3.4\build\test\tmp\test5572040804545770871.junit.dir\version-2 snapdir f:\hudson\hudson-slave\workspace\ZooKeeper-3.4-WinVS2008_java\branch-3.4\build\test\tmp\test5572040804545770871.junit.dir\version-2 [junit] 2013-12-05 10:01:55,764 [myid:] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:11221 [junit] 2013-12-05 10:01:55,768 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-05 10:01:55,769 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:54752 [junit] 2013-12-05 10:01:55,770 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@817] - Processing stat command from /127.0.0.1:54752 [junit] 2013-12-05 10:01:55,864 [myid:] - INFO [Thread-5:NIOServerCnxn$StatCommand@653] - Stat command output [junit] 2013-12-05 10:01:55,866 [myid:] - INFO [Thread-5:NIOServerCnxn@997] - 
Closed socket connection for client /127.0.0.1:54752 (no session established for client) [junit] 2013-12-05 10:01:55,866 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port] [junit] 2013-12-05 10:01:55,868 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree [junit] 2013-12-05 10:01:55,868 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree [junit] 2013-12-05 10:01:55,965 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port [junit] 2013-12-05 10:01:55,965 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1 [junit] 2013-12-05 10:01:55,965 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota [junit] 2013-12-05 10:01:55,966 [myid:] - INFO [main:ClientBase@451] - tearDown starting [junit] 2013-12-05 10:01:56,025 [myid:] - INFO [main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@849] - Socket connection established to 127.0.0.1/127.0.0.1:11221, initiating session [junit] 2013-12-05 10:01:56,025 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:54745 [junit] 2013-12-05 10:01:56,026 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:ZooKeeperServer@861] - Client attempting to renew session 0x142c234d7f1 at /127.0.0.1:54745 [junit] 2013-12-05 10:01:56,068 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:ZooKeeperServer@617] - Established session 0x142c234d7f1 with negotiated timeout 3 for client /127.0.0.1:54745 [junit] 2013-12-05 10:01:56,069 [myid:] - INFO [main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@1228] - Session establishment complete on server 127.0.0.1/127.0.0.1:11221, sessionid = 0x142c234d7f1, negotiated timeout = 3 [junit] 2013-12-05 10:01:56,069 [myid:] - INFO [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x142c234d7f1 [junit] 2013-12-05 10:01:56,069 [myid:] - INFO [SyncThread:0:FileTxnLog@199] - Creating new log file: log.c [junit] 2013-12-05 10:01:56,094 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x142c234d7f1 closed [junit] 2013-12-05 10:01:56,094 [myid:] - INFO [main:ClientBase@421] - STOPPING server [junit] 2013-12-05 10:01:56,095 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down [junit] 2013-12-05 10:01:56,096 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@347] - caught end of stream exception
[jira] [Updated] (ZOOKEEPER-1830) Support command line shell for removing watches
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated ZOOKEEPER-1830:

Attachment: ZOOKEEPER-1830.patch

Support command line shell for removing watches

Key: ZOOKEEPER-1830
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1830
Project: ZooKeeper
Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Critical
Fix For: 3.5.0
Attachments: ZOOKEEPER-1830.patch

This JIRA is to discuss the command line shell for removing watches. It makes ad-hoc testing easier.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (ZOOKEEPER-1830) Support command line shell for removing watches
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated ZOOKEEPER-1830:

Fix Version/s: 3.5.0

Support command line shell for removing watches

Key: ZOOKEEPER-1830
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1830
Project: ZooKeeper
Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Critical
Fix For: 3.5.0
Attachments: ZOOKEEPER-1830.patch

This JIRA is to discuss the command line shell for removing watches. It makes ad-hoc testing easier.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (ZOOKEEPER-1830) Support command line shell for removing watches
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840150#comment-13840150 ]

Rakesh R commented on ZOOKEEPER-1830:

Could you please have a look at the attached patch? It should be applied on top of ZOOKEEPER-442, as that contains the actual client API implementations.

Delete watches command syntax:
{code}
deletewatches path [-c|-d|-a] [-l]
c - represents child watchers
d - represents data watchers
a - represents both child and data watchers
l - represents removing watchers locally when there is no server connection
{code}

Support command line shell for removing watches

Key: ZOOKEEPER-1830
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1830
Project: ZooKeeper
Issue Type: Sub-task
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Critical
Fix For: 3.5.0
Attachments: ZOOKEEPER-1830.patch

This JIRA is to discuss the command line shell for removing watches. It makes ad-hoc testing easier.

--
This message was sent by Atlassian JIRA (v6.1#6144)
Re: Behavior when client disconnects
I think this is https://issues.apache.org/jira/browse/ZOOKEEPER-22

On Wed, Dec 4, 2013 at 8:50 PM, kishore g g.kish...@gmail.com wrote:

Thanks Camille,

Doesn't this violate the assumption that a client reads its own writes? (It is probably OK in this case, because the client never got the ack for the write from the server.)

Consider the following simple code, where one wants to know whether something was successfully written:

    boolean success;
    try {
        zk.write(p);
        success = true;
    } catch (Exception e) {
        // cannot assume write did not go through
        // read the value and see if you really wrote it
        success = zk.exists(p) && zk.readStat(p).owner == me;
    }

It looks like if the connection breaks at zk.write(p), success can be either true or false. Probably the only way to make sure the write was successful is to try writing again when there is an exception. Does this make sense?

thanks,
Kishore G

On Wed, Dec 4, 2013 at 6:12 PM, Camille Fournier cami...@apache.org wrote:

As far as I can tell from the code: c1 will send its last seen zxid to the server that it is trying to connect to. If that zxid is greater than the zxid of the server, the server will refuse the connection. In this case, if the client has not seen an ack, it is certainly possible that the last zxid seen will be the same as the zxid of the server it is connected to, so it will not see the result of w1 yet.

C

On Wed, Dec 4, 2013 at 3:37 PM, kishore g g.kish...@gmail.com wrote:

Hi,

Consider the following case:

1. Client c1 sends a write (w1) to zk1.
2. w1 gets an ack from zk2 but not yet from zk3, but quorum is reached.
3. By the time zk1 sends the response back to c1, the connection breaks.
4. c1 did not get the zxid for the latest transaction.

Now c1, depending on whether it connects to zk2 or zk3, might see that w1 succeeded or failed. Is this analysis correct, or will c1 automatically invoke a sync under the hood when it gets disconnected and connects to another server? If not, how should one handle this scenario?

Thanks,
Kishore G
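The "read back, then retry" idea from this thread can be sketched in isolation. This is only a toy model under stated assumptions: `FakeStore`, `ConnectionLost`, and `writeOwned` are hypothetical names invented for illustration and are not part of the ZooKeeper client API. The store is a single in-memory map, so it does not model a client reconnecting to a server that lags behind.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteVerifySketch {
    /** Hypothetical stand-in for a connection loss; not a ZooKeeper class. */
    static class ConnectionLost extends Exception {}

    /** Toy in-memory store that can "lose" the ack of a write. */
    static class FakeStore {
        private final Map<String, String> owners = new HashMap<>();
        boolean applyBeforeDrop; // whether a dropped write is applied before the connection breaks
        int dropsRemaining;      // how many upcoming writes lose their connection

        void write(String path, String owner) throws ConnectionLost {
            if (dropsRemaining > 0) {
                dropsRemaining--;
                if (applyBeforeDrop) {
                    owners.put(path, owner); // applied, but the client never hears back
                }
                throw new ConnectionLost();
            }
            owners.put(path, owner);
        }

        String ownerOf(String path) {
            return owners.get(path); // null when the path does not exist
        }
    }

    /** Writes path tagged with an owner id; on connection loss, reads back and retries if needed. */
    static boolean writeOwned(FakeStore store, String path, String me, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                store.write(path, me);
                return true;
            } catch (ConnectionLost e) {
                // Cannot assume the write failed: check whether our tag is already there.
                if (me.equals(store.ownerOf(path))) {
                    return true;
                }
                // Otherwise loop and try the write again.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        FakeStore applied = new FakeStore();
        applied.applyBeforeDrop = true;
        applied.dropsRemaining = 1;
        System.out.println("ack lost after apply:  " + writeOwned(applied, "/p", "client-1", 3));

        FakeStore lost = new FakeStore();
        lost.dropsRemaining = 1;
        System.out.println("write lost before apply: " + writeOwned(lost, "/p", "client-1", 3));
    }
}
```

The sketch relies on the write being idempotent and tagged with a writer id. Against a real ensemble the read-back could reach a server that has not yet applied w1, which is the case Camille describes, so the check may need to be preceded by a sync or followed by a retry.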
[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840280#comment-13840280 ]

Flavio Junqueira commented on ZOOKEEPER-1459:

Also, are you planning on updating the 3.4 patch? Is it only the dir name rootDir -> tmpDir?

Standalone ZooKeeperServer is not closing the transaction log files on shutdown

Key: ZOOKEEPER-1459
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch

When the standalone ZK server shuts down, it only clears the zkDatabase and does not close the transaction log streams. As a result, attempts to delete the temporary files in unit tests on Windows fail.

ZooKeeperServer.java
{noformat}
if (zkDb != null) {
    zkDb.clear();
}
{noformat}

Suggestion: close the zkDb as follows, which in turn takes care of the transaction logs:
{noformat}
if (zkDb != null) {
    zkDb.clear();
    try {
        zkDb.close();
    } catch (IOException ie) {
        LOG.warn("Error closing logs ", ie);
    }
}
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)
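To illustrate why closing matters for test cleanup, here is a minimal sketch of the clear-then-close pattern from the issue description. `FakeDatabase` is a hypothetical stand-in that holds one open stream on a temporary file; it is not the real ZKDatabase, and the file name prefix is made up for the example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ShutdownCloseSketch {
    /** Hypothetical stand-in for a database that keeps a transaction log stream open. */
    static class FakeDatabase {
        private final FileOutputStream logStream;
        private boolean closed;

        FakeDatabase(Path logFile) throws IOException {
            this.logStream = new FileOutputStream(logFile.toFile());
        }

        void clear() {
            // Would drop in-memory state; deliberately leaves the stream open.
        }

        void close() throws IOException {
            logStream.close();
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Mirrors the suggested shutdown: clear, then close, logging rather than rethrowing. */
    static void shutdown(FakeDatabase db) {
        if (db != null) {
            db.clear();
            try {
                db.close();
            } catch (IOException ie) {
                System.err.println("Error closing logs: " + ie);
            }
        }
    }

    /** Opens a log file, shuts down, deletes the file, and reports whether cleanup worked. */
    static boolean runCycle() {
        try {
            Path tmp = Files.createTempFile("txnlog-sketch", ".tmp");
            FakeDatabase db = new FakeDatabase(tmp);
            shutdown(db);
            Files.delete(tmp); // an open handle is what would block this on Windows
            return db.isClosed() && !Files.exists(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("log file removed: " + runCycle());
    }
}
```

On most Unix-like systems the delete would succeed even with the stream still open, so the sketch cannot reproduce the Windows failure itself; it only shows the ordering that avoids it.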
[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840278#comment-13840278 ]

Flavio Junqueira commented on ZOOKEEPER-1459:

Oops, you're right, runFromConfig already throws IOException, so I don't think we need to do anything else.

Standalone ZooKeeperServer is not closing the transaction log files on shutdown

Key: ZOOKEEPER-1459
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch

When the standalone ZK server shuts down, it only clears the zkDatabase and does not close the transaction log streams. As a result, attempts to delete the temporary files in unit tests on Windows fail.

ZooKeeperServer.java
{noformat}
if (zkDb != null) {
    zkDb.clear();
}
{noformat}

Suggestion: close the zkDb as follows, which in turn takes care of the transaction logs:
{noformat}
if (zkDb != null) {
    zkDb.clear();
    try {
        zkDb.close();
    } catch (IOException ie) {
        LOG.warn("Error closing logs ", ie);
    }
}
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)
ZooKeeper-trunk-ibm6 - Build # 356 - Failure
See https://builds.apache.org/job/ZooKeeper-trunk-ibm6/356/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 305615 lines...] [junit] 2013-12-05 17:09:15,399 [myid:] - INFO [main:ZooKeeperServer@428] - shutting down [junit] 2013-12-05 17:09:15,399 [myid:] - INFO [main:SessionTrackerImpl@183] - Shutting down [junit] 2013-12-05 17:09:15,399 [myid:] - INFO [main:PrepRequestProcessor@972] - Shutting down [junit] 2013-12-05 17:09:15,399 [myid:] - INFO [main:SyncRequestProcessor@190] - Shutting down [junit] 2013-12-05 17:09:15,399 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@156] - PrepRequestProcessor exited loop! [junit] 2013-12-05 17:09:15,400 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@168] - SyncRequestProcessor exited! [junit] 2013-12-05 17:09:15,400 [myid:] - INFO [main:FinalRequestProcessor@442] - shutdown of request processor complete [junit] 2013-12-05 17:09:15,401 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-05 17:09:15,402 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[] [junit] 2013-12-05 17:09:15,403 [myid:] - INFO [main:ClientBase@414] - STARTING server [junit] 2013-12-05 17:09:15,403 [myid:] - INFO [main:ZooKeeperServer@149] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test3145070332633105749.junit.dir/version-2 snapdir /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test3145070332633105749.junit.dir/version-2 [junit] 2013-12-05 17:09:15,404 [myid:] - INFO [main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 kB direct buffers. 
[junit] 2013-12-05 17:09:15,407 [myid:] - INFO [main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221 [junit] 2013-12-05 17:09:15,409 [myid:] - INFO [main:FileSnap@83] - Reading snapshot /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test3145070332633105749.junit.dir/version-2/snapshot.b [junit] 2013-12-05 17:09:15,412 [myid:] - INFO [main:FileTxnSnapLog@297] - Snapshotting: 0xb to /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-ibm6/trunk/build/test/tmp/test3145070332633105749.junit.dir/version-2/snapshot.b [junit] 2013-12-05 17:09:15,414 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-05 17:09:15,414 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /127.0.0.1:60822 [junit] 2013-12-05 17:09:15,415 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@828] - Processing stat command from /127.0.0.1:60822 [junit] 2013-12-05 17:09:15,415 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn$StatCommand@677] - Stat command output [junit] 2013-12-05 17:09:15,416 [myid:] - INFO [NIOWorkerThread-1:NIOServerCnxn@999] - Closed socket connection for client /127.0.0.1:60822 (no session established for client) [junit] 2013-12-05 17:09:15,416 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port] [junit] 2013-12-05 17:09:15,418 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree [junit] 2013-12-05 17:09:15,419 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree [junit] 2013-12-05 17:09:15,419 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port [junit] 2013-12-05 17:09:15,419 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1 [junit] 2013-12-05 17:09:15,420 [myid:] - INFO 
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota [junit] 2013-12-05 17:09:15,420 [myid:] - INFO [main:ClientBase@451] - tearDown starting [junit] 2013-12-05 17:09:15,466 [myid:] - INFO [main:ZooKeeper@777] - Session: 0x142c3bc1f35 closed [junit] 2013-12-05 17:09:15,466 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down [junit] 2013-12-05 17:09:15,466 [myid:] - INFO [main:ClientBase@421] - STOPPING server [junit] 2013-12-05 17:09:15,466 [myid:] - INFO [ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - ConnnectionExpirerThread interrupted [junit] 2013-12-05 17:09:15,467 [myid:] - INFO [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@219] - accept thread exitted run method [junit] 2013-12-05 17:09:15,467 [myid:] - INFO
[jira] [Updated] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated ZOOKEEPER-1459:

Attachment: ZOOKEEPER-1459-branch-3_4.patch

Standalone ZooKeeperServer is not closing the transaction log files on shutdown

Key: ZOOKEEPER-1459
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch

When the standalone ZK server shuts down, it only clears the zkDatabase and does not close the transaction log streams. As a result, attempts to delete the temporary files in unit tests on Windows fail.

ZooKeeperServer.java
{noformat}
if (zkDb != null) {
    zkDb.clear();
}
{noformat}

Suggestion: close the zkDb as follows, which in turn takes care of the transaction logs:
{noformat}
if (zkDb != null) {
    zkDb.clear();
    try {
        zkDb.close();
    } catch (IOException ie) {
        LOG.warn("Error closing logs ", ie);
    }
}
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840322#comment-13840322 ]

Rakesh R commented on ZOOKEEPER-1459:

[~fpj] Attached the branch 3.4 patch. Yes, I've changed rootDir -> tmpDir. Could you please have a look at it?

Standalone ZooKeeperServer is not closing the transaction log files on shutdown

Key: ZOOKEEPER-1459
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch

When the standalone ZK server shuts down, it only clears the zkDatabase and does not close the transaction log streams. As a result, attempts to delete the temporary files in unit tests on Windows fail.

ZooKeeperServer.java
{noformat}
if (zkDb != null) {
    zkDb.clear();
}
{noformat}

Suggestion: close the zkDb as follows, which in turn takes care of the transaction logs:
{noformat}
if (zkDb != null) {
    zkDb.clear();
    try {
        zkDb.close();
    } catch (IOException ie) {
        LOG.warn("Error closing logs ", ie);
    }
}
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)
Failed: ZOOKEEPER-1459 PreCommit Build #1817
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1459 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1817/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 68 lines...] [exec] == [exec] == [exec] [exec] [exec] patching file src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java [exec] Hunk #1 succeeded at 97 (offset 4 lines). [exec] Hunk #2 FAILED at 101. [exec] Hunk #3 succeeded at 117 (offset -1 lines). [exec] 1 out of 3 hunks FAILED -- saving rejects to file src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java.rej [exec] patching file src/java/test/org/apache/zookeeper/server/ZooKeeperServerMainTest.java [exec] PATCH APPLICATION FAILED [exec] [exec] [exec] [exec] [exec] -1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12617195/ZOOKEEPER-1459-branch-3_4.patch [exec] against trunk revision 1547702. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] -1 patch. The patch command could not apply the patch. [exec] [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1817//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 9f2941ae7039dfa4d4bb1a28bafe2aad81b6ce29 logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. [exec] == [exec] == [exec] [exec] BUILD FAILED /home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623: exec returned: 1 Total time: 1 minute 6 seconds Build step 'Execute shell' marked build as failure Archiving artifacts Recording test results Description set: ZOOKEEPER-1459 Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840324#comment-13840324 ]

Hadoop QA commented on ZOOKEEPER-1459:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12617195/ZOOKEEPER-1459-branch-3_4.patch
against trunk revision 1547702.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1817//console

This message is automatically generated.

Standalone ZooKeeperServer is not closing the transaction log files on shutdown

Key: ZOOKEEPER-1459
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch

When the standalone ZK server shuts down, it only clears the zkDatabase and does not close the transaction log streams. As a result, attempts to delete the temporary files in unit tests on Windows fail.

ZooKeeperServer.java
{noformat}
if (zkDb != null) {
    zkDb.clear();
}
{noformat}

Suggestion: close the zkDb as follows, which in turn takes care of the transaction logs:
{noformat}
if (zkDb != null) {
    zkDb.clear();
    try {
        zkDb.close();
    } catch (IOException ie) {
        LOG.warn("Error closing logs ", ie);
    }
}
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840348#comment-13840348 ]

Rakesh R commented on ZOOKEEPER-1459:

Since ZOOKEEPER-1459-branch-3_4.patch was prepared against branch 3.4, I feel this failure is expected. Please excuse it.

Standalone ZooKeeperServer is not closing the transaction log files on shutdown

Key: ZOOKEEPER-1459
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch

When the standalone ZK server shuts down, it only clears the zkDatabase and does not close the transaction log streams. As a result, attempts to delete the temporary files in unit tests on Windows fail.

ZooKeeperServer.java
{noformat}
if (zkDb != null) {
    zkDb.clear();
}
{noformat}

Suggestion: close the zkDb as follows, which in turn takes care of the transaction logs:
{noformat}
if (zkDb != null) {
    zkDb.clear();
    try {
        zkDb.close();
    } catch (IOException ie) {
        LOG.warn("Error closing logs ", ie);
    }
}
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (ZOOKEEPER-1832) Add count of connected clients to submitted ganglia metrics
Ben Hartshorne created ZOOKEEPER-1832:

Summary: Add count of connected clients to submitted ganglia metrics
Key: ZOOKEEPER-1832
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1832
Project: ZooKeeper
Issue Type: Bug
Reporter: Ben Hartshorne
Priority: Minor

The ganglia zookeeper plugin does not report the number of connected clients, though this information is available from the 'stat' command.

--
This message was sent by Atlassian JIRA (v6.1#6144)
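As a rough illustration of pulling a client count out of 'stat' text, the sketch below scans for a line of the form `Connections: N`. That line format and the sample string are assumptions made for this example, and `connectedClients` is a hypothetical helper; this is not the code from the ganglia plugin or the attached pull request.

```java
import java.util.OptionalInt;

public class StatConnectionsSketch {
    private static final String PREFIX = "Connections:";

    /** Returns the value of the first "Connections: N" line, if one is present and numeric. */
    static OptionalInt connectedClients(String statOutput) {
        for (String line : statOutput.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(PREFIX)) {
                String value = trimmed.substring(PREFIX.length()).trim();
                try {
                    return OptionalInt.of(Integer.parseInt(value));
                } catch (NumberFormatException e) {
                    return OptionalInt.empty(); // line present but not a number
                }
            }
        }
        return OptionalInt.empty(); // no such line in this output
    }

    public static void main(String[] args) {
        // Assumed sample text, not captured from a real server.
        String sample = "Received: 2\nSent: 1\nConnections: 1\nOutstanding: 0\n";
        System.out.println(connectedClients(sample));
    }
}
```

Returning an empty `OptionalInt` rather than zero keeps "metric unavailable" distinct from "no clients connected", which matters when the value is submitted to a monitoring system.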
zookeeper pull request: add metric to count connected clients to ganglia zo...
GitHub user maplebed opened a pull request:

https://github.com/apache/zookeeper/pull/9

add metric to count connected clients to ganglia zookeeper module

More detail in https://issues.apache.org/jira/browse/ZOOKEEPER-1832

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maplebed/zookeeper ZOOKEEPER-1832

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/9.patch

commit 9be08f011baf05884a60631d63ca6d90a4a735b2
Author: Ben Hartshorne b...@parse.com
Date: 2013-12-05T18:21:12Z

add metric to count connected clients to ganglia zookeeper module
[jira] [Commented] (ZOOKEEPER-1832) Add count of connected clients to submitted ganglia metrics
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840369#comment-13840369 ]

Ben Hartshorne commented on ZOOKEEPER-1832:

Fix provided in https://github.com/apache/zookeeper/pull/9
[jira] [Updated] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Germán Blanco updated ZOOKEEPER-1382:
Attachment: ZOOKEEPER-1382.patch

Updated with last comments.

Zookeeper server holds onto dead/expired session ids in the watch data structures
---
Key: ZOOKEEPER-1382
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Germán Blanco
Priority: Critical
Fix For: 3.4.6, 3.5.0
Attachments: ZOOKEEPER-1382-branch-3.4.patch, ZOOKEEPER-1382-branch-3.4.patch, ZOOKEEPER-1382-branch-3.4.patch, ZOOKEEPER-1382-branch-3.4.patch, ZOOKEEPER-1382.patch, ZOOKEEPER-1382.patch, ZOOKEEPER-1382.patch, ZOOKEEPER-1382.patch, ZOOKEEPER-1382.patch, ZOOKEEPER-1382_3.3.4.patch

I've observed that the zookeeper server holds onto expired session ids in the watcher data structures. The result is that the wchp command reports session ids that cannot be found through cons/dump, and those expired session ids sit there, maybe until the server is restarted.

Here are snippets from the client and the server logs that lead to this state, for one particular session id, 0x134485fd7bcb26f.

There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 - and I'm using ZkClient to connect to the cluster.

From the application log -

application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] [main-SendThread(223.prod:12913)] [application] Session establishment complete on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, negotiated timeout = 6000
application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] [main-SendThread(223.prod:12913)] [application] Client session timed out, have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing socket connection and attempting reconnect
application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] [main-SendThread(226.prod:12913)] [application] Unable to reconnect to ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket connection

On the leader zk, 225 -

zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, timeout of 6000ms exceeded
zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination for sessionid: 0x134485fd7bcb26f

On the server the client was initially connected to, 223 -

zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO [CommitProcessor:1:NIOServerCnxn@1580] - Established session 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f

Here are the log snippets from 226, which is the server the client reconnected to before getting the session expired event -

2012-01-27 09:52:38,190 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
2012-01-27 09:52:38,191 - INFO [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
2012-01-27 09:52:38,191 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed socket connection for client /172.17.136.82:49367 which had sessionid 0x134485fd7bcb26f

wchp output from 226, taken on 01/30 -

nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *226.*wchp* | wc -l
3

wchp output from 223, taken on 01/30 -

nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *223.*wchp* | wc -l
0

cons output from 223 and 226, taken on 01/30 -

nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *226.*cons* | wc -l
0
nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f *223.*cons* | wc -l
0

So, what seems to have happened is that the client was able to re-register the watches on the new server (226) after it got disconnected from 223, in spite of having an expired session id.

In NIOServerCnxn, I saw that after suspecting that a session is expired, a server removes the cnxn and its watches from its internal data structures. But before that it allows more requests to be processed even if the session is expired -

// Now that the session is ready we can start receiving packets
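The failure mode described in this report, watches being registered for a session that has already expired, can be pictured with a toy model. The sketch below is not ZooKeeper's watch manager and is not the fix in the attached patches; `WatchRegistrySketch` and its methods are hypothetical names. It only shows the two invariants the report implies: expiring a session drops its watches, and a watch is refused for a session that is not live.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry used only to illustrate the invariants.
class WatchRegistrySketch {
    private final Set<Long> liveSessions = new HashSet<>();
    private final Map<Long, Set<String>> watchesBySession = new HashMap<>();

    void openSession(long sessionId) {
        liveSessions.add(sessionId);
    }

    // Expiring a session also drops every watch it registered.
    void expireSession(long sessionId) {
        liveSessions.remove(sessionId);
        watchesBySession.remove(sessionId);
    }

    // Refuses to register a watch for a session that is not live.
    boolean addWatch(long sessionId, String path) {
        if (!liveSessions.contains(sessionId)) {
            return false;
        }
        watchesBySession.computeIfAbsent(sessionId, id -> new HashSet<>()).add(path);
        return true;
    }

    int watchCount(long sessionId) {
        Set<String> paths = watchesBySession.get(sessionId);
        return paths == null ? 0 : paths.size();
    }
}
```

Without the liveness check in `addWatch`, a late request from an expired session would leave an entry behind that nothing later removes, which matches the stale ids seen in the wchp output.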
[jira] [Updated] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Germán Blanco updated ZOOKEEPER-1382:
Attachment: ZOOKEEPER-1382-branch-3.4.patch
[jira] [Commented] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841000#comment-13841000 ]

Hadoop QA commented on ZOOKEEPER-1382:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12617336/ZOOKEEPER-1382.patch
against trunk revision 1547702.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 12 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818//console

This message is automatically generated.
Success: ZOOKEEPER-1382 PreCommit Build #1818
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1382 Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 257540 lines...] [exec] BUILD SUCCESSFUL [exec] Total time: 0 seconds [exec] [exec] [exec] [exec] [exec] +1 overall. Here are the results of testing the latest attachment [exec] http://issues.apache.org/jira/secure/attachment/12617336/ZOOKEEPER-1382.patch [exec] against trunk revision 1547702. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 12 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 core tests. The patch passed core unit tests. [exec] [exec] +1 contrib tests. The patch passed contrib unit tests. [exec] [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818//testReport/ [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1818//console [exec] [exec] This message is automatically generated. [exec] [exec] [exec] == [exec] == [exec] Adding comment to Jira. [exec] == [exec] == [exec] [exec] [exec] Comment added. [exec] 90327af7681e8dcf35ebcad31542d9f9b0ce035d logged out [exec] [exec] [exec] == [exec] == [exec] Finished build. 
[exec] == [exec] == [exec] [exec] BUILD SUCCESSFUL Total time: 33 minutes 51 seconds Archiving artifacts Recording test results Description set: ZOOKEEPER-1382 Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
ZooKeeper_branch34_solaris - Build # 730 - Failure
See https://builds.apache.org/job/ZooKeeper_branch34_solaris/730/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 165617 lines...] [junit] 2013-12-06 07:56:24,150 [myid:] - INFO [Thread-4:NIOServerCnxn@997] - Closed socket connection for client /127.0.0.1:59257 (no session established for client) [junit] 2013-12-06 07:56:24,150 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port] [junit] 2013-12-06 07:56:24,151 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree [junit] 2013-12-06 07:56:24,151 [myid:] - INFO [main:JMXEnv@108] - found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree [junit] 2013-12-06 07:56:24,151 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port [junit] 2013-12-06 07:56:24,151 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1 [junit] 2013-12-06 07:56:24,152 [myid:] - INFO [main:ClientBase@421] - STOPPING server [junit] 2013-12-06 07:56:24,152 [myid:] - INFO [main:ZooKeeperServer@441] - shutting down [junit] 2013-12-06 07:56:24,152 [myid:] - INFO [main:SessionTrackerImpl@225] - Shutting down [junit] 2013-12-06 07:56:24,153 [myid:] - INFO [main:PrepRequestProcessor@761] - Shutting down [junit] 2013-12-06 07:56:24,153 [myid:] - INFO [main:SyncRequestProcessor@209] - Shutting down [junit] 2013-12-06 07:56:24,153 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop! [junit] 2013-12-06 07:56:24,153 [myid:] - INFO [SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited! 
[junit] 2013-12-06 07:56:24,153 [myid:] - INFO [main:FinalRequestProcessor@415] - shutdown of request processor complete [junit] 2013-12-06 07:56:24,154 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-06 07:56:24,154 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[] [junit] 2013-12-06 07:56:24,155 [myid:] - INFO [main:ClientBase@414] - STARTING server [junit] 2013-12-06 07:56:24,155 [myid:] - INFO [main:ZooKeeperServer@162] - Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 6 datadir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test6206103003480532824.junit.dir/version-2 snapdir /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch34_solaris/trunk/build/test/tmp/test6206103003480532824.junit.dir/version-2 [junit] 2013-12-06 07:56:24,156 [myid:] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:11221 [junit] 2013-12-06 07:56:24,159 [myid:] - INFO [main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221 [junit] 2013-12-06 07:56:24,159 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - Accepted socket connection from /127.0.0.1:59259 [junit] 2013-12-06 07:56:24,160 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@817] - Processing stat command from /127.0.0.1:59259 [junit] 2013-12-06 07:56:24,160 [myid:] - INFO [Thread-5:NIOServerCnxn$StatCommand@653] - Stat command output [junit] 2013-12-06 07:56:24,160 [myid:] - INFO [Thread-5:NIOServerCnxn@997] - Closed socket connection for client /127.0.0.1:59259 (no session established for client) [junit] 2013-12-06 07:56:24,161 [myid:] - INFO [main:JMXEnv@133] - ensureOnly:[InMemoryDataTree, StandaloneServer_port] [junit] 2013-12-06 07:56:24,162 [myid:] - INFO [main:JMXEnv@105] - expect:InMemoryDataTree [junit] 2013-12-06 07:56:24,162 [myid:] - INFO [main:JMXEnv@108] - 
found:InMemoryDataTree org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree [junit] 2013-12-06 07:56:24,162 [myid:] - INFO [main:JMXEnv@105] - expect:StandaloneServer_port [junit] 2013-12-06 07:56:24,162 [myid:] - INFO [main:JMXEnv@108] - found:StandaloneServer_port org.apache.ZooKeeperService:name0=StandaloneServer_port-1 [junit] 2013-12-06 07:56:24,163 [myid:] - INFO [main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota [junit] 2013-12-06 07:56:24,163 [myid:] - INFO [main:ClientBase@451] - tearDown starting [junit] 2013-12-06 07:56:24,246 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x142c6e8531f closed [junit] 2013-12-06 07:56:24,246 [myid:] - INFO [main:ClientBase@421] - STOPPING server [junit] 2013-12-06 07:56:24,247 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down [junit] 2013-12-06 07:56:24,247 [myid:] - INFO [main:ZooKeeperServer@441] - shutting down [junit] 2013-12-06 07:56:24,247
[jira] [Updated] (BOOKKEEPER-701) Improve exception handling of Bookkeeper threads
[ https://issues.apache.org/jira/browse/BOOKKEEPER-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-701:
Attachment: 0003-BOOKKEEPER-701.patch

Improve exception handling of Bookkeeper threads
---
Key: BOOKKEEPER-701
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-701
Project: Bookkeeper
Issue Type: Improvement
Components: bookkeeper-auto-recovery, bookkeeper-client, bookkeeper-server
Reporter: Rakesh R
Assignee: Rakesh R
Fix For: 4.3.0
Attachments: 0001-BOOKKEEPER-701.patch, 0002-BOOKKEEPER-701.patch, 0003-BOOKKEEPER-701.patch

This JIRA discusses how to improve the exception handling of bookkeeper threads. As part of this, all the bookkeeper threads need to be reviewed. If a thread hits an unhandled exception, it should:
- log a loud error when the thread dies.
- exit if any of the critical threads dies.

Please have a look at BOOKKEEPER-700 for the initial discussion.
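The two requirements listed in the issue map naturally onto Java's `Thread.UncaughtExceptionHandler`. The sketch below is one possible shape, assuming a per-thread "critical" flag; it is not taken from the attached patches, and `ThreadDeathSketch`, `handler` and `runFailingThread` are hypothetical names. Where a real server might call `System.exit`, the sketch only records that an exit was requested so it stays safe to run.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ThreadDeathSketch {
    // Builds a handler that logs loudly and, for critical threads only,
    // runs the supplied action (a real process might exit there).
    static Thread.UncaughtExceptionHandler handler(boolean critical, Runnable onCriticalDeath) {
        return (thread, error) -> {
            System.err.println("ERROR: thread " + thread.getName() + " died: " + error);
            if (critical) {
                onCriticalDeath.run();
            }
        };
    }

    // Starts a thread that fails immediately and reports whether the
    // critical-death action was triggered.
    static boolean runFailingThread(boolean critical) {
        AtomicBoolean exitRequested = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated failure");
        }, critical ? "critical-worker" : "helper-worker");
        worker.setUncaughtExceptionHandler(handler(critical, () -> exitRequested.set(true)));
        worker.start();
        try {
            worker.join(); // the handler runs on the dying thread before join returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return exitRequested.get();
    }

    public static void main(String[] args) {
        System.out.println("critical: exit requested = " + runFailingThread(true));
        System.out.println("helper: exit requested = " + runFailingThread(false));
    }
}
```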
[jira] [Commented] (BOOKKEEPER-701) Improve exception handling of Bookkeeper threads
[ https://issues.apache.org/jira/browse/BOOKKEEPER-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839953#comment-13839953 ]

Rakesh R commented on BOOKKEEPER-701:

That's good. Attached a re-worked patch addressing the comment. I have also included one test case.
[jira] [Commented] (BOOKKEEPER-701) Improve exception handling of Bookkeeper threads
[ https://issues.apache.org/jira/browse/BOOKKEEPER-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839963#comment-13839963 ]

Hadoop QA commented on BOOKKEEPER-701:

Testing JIRA BOOKKEEPER-701

Patch [0003-BOOKKEEPER-701.patch|https://issues.apache.org/jira/secure/attachment/12617131/0003-BOOKKEEPER-701.patch] downloaded at Thu Dec 5 08:51:36 UTC 2013

+1 PATCH_APPLIES
+1 CLEAN
+1 RAW_PATCH_ANALYSIS
    +1 the patch does not introduce any @author tags
    +1 the patch does not introduce any tabs
    +1 the patch does not introduce any trailing spaces
    +1 the patch does not introduce any line longer than 120
    +1 the patch does adds/modifies 1 testcase(s)
+1 RAT
    +1 the patch does not seem to introduce new RAT warnings
+1 JAVADOC
    +1 the patch does not seem to introduce new Javadoc warnings
+1 COMPILE
    +1 HEAD compiles
    +1 patch compiles
    +1 the patch does not seem to introduce new javac warnings
+1 FINDBUGS
    +1 the patch does not seem to introduce new Findbugs warnings
+1 TESTS
    Tests run: 885
+1 DISTRO
    +1 distro tarball builds with the patch

+1 Overall result, good!, no -1s

The full output of the test-patch run is available at
https://builds.apache.org/job/bookkeeper-trunk-precommit-build/544/
[jira] [Updated] (BOOKKEEPER-709) SlowBookieTest#testSlowBookie fails intermittently
[ https://issues.apache.org/jira/browse/BOOKKEEPER-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-709:
Attachment: 0003-BOOKKEEPER-709.patch

SlowBookieTest#testSlowBookie fails intermittently
---
Key: BOOKKEEPER-709
URL: https://issues.apache.org/jira/browse/BOOKKEEPER-709
Project: Bookkeeper
Issue Type: Bug
Reporter: Rakesh R
Assignee: Rakesh R
Labels: tests
Fix For: 4.3.0
Attachments: 0001-BOOKKEEPER-709.patch, 0002-BOOKKEEPER-709.patch, 0003-BOOKKEEPER-709.patch

SlowBookieTest#testSlowBookie fails intermittently when verifying the result of addEntry.
{code}
junit.framework.AssertionFailedError: expected:<0> but was:<-559038737>
    at junit.framework.Assert.fail(Assert.java:47)
    at junit.framework.Assert.failNotEquals(Assert.java:283)
    at junit.framework.Assert.assertEquals(Assert.java:64)
    at junit.framework.Assert.assertEquals(Assert.java:195)
    at junit.framework.Assert.assertEquals(Assert.java:201)
    at org.apache.bookkeeper.client.SlowBookieTest.testSlowBookie(SlowBookieTest.java:82)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
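A side note on the number in the assertion: -559038737 is the value of 0xDEADBEEF read as a signed 32-bit int. One plausible reading, which the report itself does not confirm, is that the test compared a placeholder value before the asynchronous addEntry callback had delivered its result code. The sketch below illustrates that pattern and a latch-based wait; `SentinelSketch` and its members are hypothetical and are not code from SlowBookieTest or the attached patches.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SentinelSketch {
    // 0xDEADBEEF interpreted as a signed 32-bit int is -559038737.
    static final int SENTINEL = 0xDEADBEEF;

    // Simulates an asynchronous operation that reports result code 0,
    // and waits for its callback before reading the result.
    static int resultAfterWaiting() {
        AtomicInteger rc = new AtomicInteger(SENTINEL);
        CountDownLatch done = new CountDownLatch(1);
        Thread callback = new Thread(() -> {
            rc.set(0);        // the callback delivers the real result code
            done.countDown(); // and only then signals completion
        });
        callback.start();
        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                return SENTINEL; // callback never arrived
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return SENTINEL;
        }
        return rc.get();
    }

    public static void main(String[] args) {
        System.out.println(SENTINEL);
        System.out.println(resultAfterWaiting());
    }
}
```

Reading `rc` without waiting on the latch would sometimes return the placeholder, which is the kind of timing-dependent outcome that shows up as an intermittent failure.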
[jira] [Updated] (BOOKKEEPER-709) SlowBookieTest#testSlowBookie fails intermittently
[ https://issues.apache.org/jira/browse/BOOKKEEPER-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-709:
Attachment: (was: 0003-BOOKKEEPER-709.patch)
[jira] [Updated] (BOOKKEEPER-709) SlowBookieTest#testSlowBookie fails intermittently
[ https://issues.apache.org/jira/browse/BOOKKEEPER-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rakesh R updated BOOKKEEPER-709:
Attachment: 0003-BOOKKEEPER-709.patch
[jira] [Commented] (BOOKKEEPER-709) SlowBookieTest#testSlowBookie fails intermittently
[ https://issues.apache.org/jira/browse/BOOKKEEPER-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840337#comment-13840337 ]

Rakesh R commented on BOOKKEEPER-709:

Attached the latest patch, in which I removed the unused call 'lh.getId();'. Also, could you please look at my previous comments? If you agree, please push this in. Thanks
[jira] [Commented] (BOOKKEEPER-709) SlowBookieTest#testSlowBookie fails intermittently
[ https://issues.apache.org/jira/browse/BOOKKEEPER-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13840382#comment-13840382 ] Hadoop QA commented on BOOKKEEPER-709: -- Testing JIRA BOOKKEEPER-709 Patch [0003-BOOKKEEPER-709.patch|https://issues.apache.org/jira/secure/attachment/12617202/0003-BOOKKEEPER-709.patch] downloaded at Thu Dec 5 18:05:12 UTC 2013 {color:green}+1 PATCH_APPLIES{color} {color:green}+1 CLEAN{color} {color:green}+1 RAW_PATCH_ANALYSIS{color} .{color:green}+1{color} the patch does not introduce any @author tags .{color:green}+1{color} the patch does not introduce any tabs .{color:green}+1{color} the patch does not introduce any trailing spaces .{color:green}+1{color} the patch does not introduce any line longer than 120 .{color:green}+1{color} the patch does adds/modifies 2 testcase(s) {color:green}+1 RAT{color} .{color:green}+1{color} the patch does not seem to introduce new RAT warnings {color:green}+1 JAVADOC{color} .{color:green}+1{color} the patch does not seem to introduce new Javadoc warnings {color:green}+1 COMPILE{color} .{color:green}+1{color} HEAD compiles .{color:green}+1{color} patch compiles .{color:green}+1{color} the patch does not seem to introduce new javac warnings {color:green}+1 FINDBUGS{color} .{color:green}+1{color} the patch does not seem to introduce new Findbugs warnings {color:green}+1 TESTS{color} .Tests run: 884 {color:green}+1 DISTRO{color} .{color:green}+1{color} distro tarball builds with the patch {color:green}*+1 Overall result, good!, no -1s*{color} The full output of the test-patch run is available at . 
https://builds.apache.org/job/bookkeeper-trunk-precommit-build/545/ SlowBookieTest#testSlowBookie fails intermittently -- Key: BOOKKEEPER-709 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-709 Project: Bookkeeper Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Labels: tests Fix For: 4.3.0 Attachments: 0001-BOOKKEEPER-709.patch, 0002-BOOKKEEPER-709.patch, 0003-BOOKKEEPER-709.patch SlowBookieTest#testSlowBookie fails intermittently when verifying the result of addEntry. {code} junit.framework.AssertionFailedError: expected:<0> but was:<-559038737> at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:195) at junit.framework.Assert.assertEquals(Assert.java:201) at org.apache.bookkeeper.client.SlowBookieTest.testSlowBookie(SlowBookieTest.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (BOOKKEEPER-709) SlowBookieTest#testSlowBookie fails intermittently
[ https://issues.apache.org/jira/browse/BOOKKEEPER-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13840424#comment-13840424 ] Hadoop QA commented on BOOKKEEPER-709: -- Testing JIRA BOOKKEEPER-709 Patch [0003-BOOKKEEPER-709.patch|https://issues.apache.org/jira/secure/attachment/12617202/0003-BOOKKEEPER-709.patch] downloaded at Thu Dec 5 18:38:53 UTC 2013 {color:green}+1 PATCH_APPLIES{color} {color:green}+1 CLEAN{color} {color:green}+1 RAW_PATCH_ANALYSIS{color} .{color:green}+1{color} the patch does not introduce any @author tags .{color:green}+1{color} the patch does not introduce any tabs .{color:green}+1{color} the patch does not introduce any trailing spaces .{color:green}+1{color} the patch does not introduce any line longer than 120 .{color:green}+1{color} the patch does adds/modifies 2 testcase(s) {color:green}+1 RAT{color} .{color:green}+1{color} the patch does not seem to introduce new RAT warnings {color:green}+1 JAVADOC{color} .{color:green}+1{color} the patch does not seem to introduce new Javadoc warnings {color:green}+1 COMPILE{color} .{color:green}+1{color} HEAD compiles .{color:green}+1{color} patch compiles .{color:green}+1{color} the patch does not seem to introduce new javac warnings {color:green}+1 FINDBUGS{color} .{color:green}+1{color} the patch does not seem to introduce new Findbugs warnings {color:green}+1 TESTS{color} .Tests run: 884 {color:green}+1 DISTRO{color} .{color:green}+1{color} distro tarball builds with the patch {color:green}*+1 Overall result, good!, no -1s*{color} The full output of the test-patch run is available at . 
https://builds.apache.org/job/bookkeeper-trunk-precommit-build/546/ SlowBookieTest#testSlowBookie fails intermittently -- Key: BOOKKEEPER-709 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-709 Project: Bookkeeper Issue Type: Bug Reporter: Rakesh R Assignee: Rakesh R Labels: tests Fix For: 4.3.0 Attachments: 0001-BOOKKEEPER-709.patch, 0002-BOOKKEEPER-709.patch, 0003-BOOKKEEPER-709.patch SlowBookieTest#testSlowBookie fails intermittently when verifying the result of addEntry. {code} junit.framework.AssertionFailedError: expected:<0> but was:<-559038737> at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:195) at junit.framework.Assert.assertEquals(Assert.java:201) at org.apache.bookkeeper.client.SlowBookieTest.testSlowBookie(SlowBookieTest.java:82) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (BOOKKEEPER-701) Improve exception handling of Bookkeeper threads
[ https://issues.apache.org/jira/browse/BOOKKEEPER-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sijie Guo updated BOOKKEEPER-701: - Component/s: (was: bookkeeper-client) Improve exception handling of Bookkeeper threads Key: BOOKKEEPER-701 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-701 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-auto-recovery, bookkeeper-server Reporter: Rakesh R Assignee: Rakesh R Fix For: 4.3.0 Attachments: 0001-BOOKKEEPER-701.patch, 0002-BOOKKEEPER-701.patch, 0003-BOOKKEEPER-701.patch This JIRA discusses how to improve the exception handling of BookKeeper threads. As part of this, all BookKeeper threads need to be reviewed; on any unhandled exception from a thread, the process should: - log a loud error when a thread dies. - exit if any critical thread dies. Please have a look at BOOKKEEPER-700 for the initial discussion. -- This message was sent by Atlassian JIRA (v6.1#6144)
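The two requirements in the description above (log loudly when any thread dies, exit when a critical thread dies) can be sketched with the standard `Thread.UncaughtExceptionHandler` hook. This is only a minimal illustration under stated assumptions: `LoudThreadSketch`, `FatalAction`, and `newThread` are hypothetical names and are not claimed to match the classes added by the patch; a real server would probably trigger a shutdown or `System.exit` inside the fatal action.

```java
/** Hypothetical sketch; these names are illustrative and are not BookKeeper's actual classes. */
class LoudThreadSketch {
    /** Called when a critical thread dies; a real server would likely shut down or exit here. */
    interface FatalAction {
        void onFatal(Thread thread, Throwable cause);
    }

    /** Builds a thread that logs loudly on any uncaught exception and escalates if critical. */
    static Thread newThread(String name, boolean critical, Runnable body, FatalAction fatal) {
        Thread t = new Thread(body, name);
        t.setUncaughtExceptionHandler((thread, cause) -> {
            // Requirement 1: never let a thread die silently.
            System.err.println("ERROR: thread " + thread.getName() + " died: " + cause);
            // Requirement 2: escalate only when the thread is marked critical.
            if (critical) {
                fatal.onFatal(thread, cause);
            }
        });
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = newThread("journal", true,
                () -> { throw new IllegalStateException("disk error"); },
                (thread, cause) -> System.err.println("critical thread died, shutting down"));
        t.start();
        t.join();
    }
}
```

Passing the fatal action in as a parameter, rather than calling `System.exit` directly, keeps the sketch testable; whether the actual patch makes the same choice is not stated in this thread.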
[jira] [Updated] (BOOKKEEPER-643) Improve concurrency of entry logger
[ https://issues.apache.org/jira/browse/BOOKKEEPER-643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sijie Guo updated BOOKKEEPER-643: - Fix Version/s: 4.3.0 Improve concurrency of entry logger --- Key: BOOKKEEPER-643 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-643 Project: Bookkeeper Issue Type: Sub-task Components: bookkeeper-server Affects Versions: 4.2.0 Reporter: Aniruddha Assignee: Aniruddha Fix For: 4.3.0 Attachments: BOOKKEEPER-643.diff, BOOKKEEPER-643.diff This JIRA was created as part of BOOKKEEPER-429 to improve the concurrency of the current bookie implementation by leveraging concurrent structures. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (BOOKKEEPER-674) Tooling wishlist
[ https://issues.apache.org/jira/browse/BOOKKEEPER-674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sijie Guo updated BOOKKEEPER-674: - Fix Version/s: (was: 4.3.0) 4.4.0 Tooling wishlist Key: BOOKKEEPER-674 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-674 Project: Bookkeeper Issue Type: Wish Reporter: Ivan Kelly Fix For: 4.4.0 One of the issues brought up when I was in California was the lack of tooling for bookkeeper. As such, I'm creating this wishlist as a place to discuss tooling and to create a list of the tools missing. Before 4.3.0 we should go through the suggestions and implement the most useful stuff. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (BOOKKEEPER-713) Bookie should store the cookie in zookeeper first
Vinay created BOOKKEEPER-713: Summary: Bookie should store the cookie in zookeeper first Key: BOOKKEEPER-713 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-713 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Affects Versions: 4.2.2 Reporter: Vinay The following code in {{Bookie#checkEnvironment(..)}} should store the cookie in ZooKeeper first and then on the local disks for {{newEnv}}: {code}if (newEnv) { if (missedCookieDirs.size() > 0) { LOG.debug("Directories missing cookie file are {}", missedCookieDirs); masterCookie.writeToDirectory(journalDirectory); for (File dir : allLedgerDirs) { masterCookie.writeToDirectory(dir); } } masterCookie.writeToZooKeeper(zk, conf); }{code} Otherwise, if {{masterCookie.writeToZooKeeper(zk, conf);}} fails due to some exception, the bookie cannot start again. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (BOOKKEEPER-713) Bookie should store the cookie in zookeeper first
[ https://issues.apache.org/jira/browse/BOOKKEEPER-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay updated BOOKKEEPER-713: - Attachment: BOOKKEEPER-713.patch Attached the patch for the change. Bookie should store the cookie in zookeeper first - Key: BOOKKEEPER-713 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-713 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Affects Versions: 4.2.2 Reporter: Vinay Attachments: BOOKKEEPER-713.patch The following code in {{Bookie#checkEnvironment(..)}} should store the cookie in ZooKeeper first and then on the local disks for {{newEnv}}: {code}if (newEnv) { if (missedCookieDirs.size() > 0) { LOG.debug("Directories missing cookie file are {}", missedCookieDirs); masterCookie.writeToDirectory(journalDirectory); for (File dir : allLedgerDirs) { masterCookie.writeToDirectory(dir); } } masterCookie.writeToZooKeeper(zk, conf); }{code} Otherwise, if {{masterCookie.writeToZooKeeper(zk, conf);}} fails due to some exception, the bookie cannot start again. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (BOOKKEEPER-701) Improve exception handling of Bookkeeper threads
[ https://issues.apache.org/jira/browse/BOOKKEEPER-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841050#comment-13841050 ] Hudson commented on BOOKKEEPER-701: --- SUCCESS: Integrated in bookkeeper-trunk #464 (See [https://builds.apache.org/job/bookkeeper-trunk/464/]) BOOKKEEPER-701: Improve exception handling of Bookkeeper threads (rakesh via sijie) (sijie: rev 1548385) * /zookeeper/bookkeeper/trunk/CHANGES.txt * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieCriticalThread.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieThread.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/GarbageCollectorThread.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Journal.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/LedgerDirsManager.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieServer.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/AutoRecoveryMain.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/ReplicationWorker.java * /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/BookieThreadTest.java Improve exception handling of Bookkeeper threads Key: BOOKKEEPER-701 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-701 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-auto-recovery, bookkeeper-server Reporter: Rakesh R Assignee: Rakesh R Fix For: 4.3.0 Attachments: 0001-BOOKKEEPER-701.patch, 0002-BOOKKEEPER-701.patch, 0003-BOOKKEEPER-701.patch This JIRA discusses how to improve the 
exception handling of BookKeeper threads. As part of this, all BookKeeper threads need to be reviewed; on any unhandled exception from a thread, the process should: - log a loud error when a thread dies. - exit if any critical thread dies. Please have a look at BOOKKEEPER-700 for the initial discussion. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (BOOKKEEPER-713) Bookie should store the cookie in zookeeper first
[ https://issues.apache.org/jira/browse/BOOKKEEPER-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841070#comment-13841070 ] Sijie Guo commented on BOOKKEEPER-713: -- -1 on the patch. I don't think this is an issue. We should only write to ZooKeeper after initializing the cookies in all local directories: if writing the cookie to ZooKeeper fails, we know the bookie failed before it finished initializing, so it is safe to reformat the local bookie. But if you reverse the order, the cookie might exist in ZooKeeper while missing from some local directories. In that case, it is hard to tell whether the bookie failed during its first initialization or failed because it lost a disk. Bookie should store the cookie in zookeeper first - Key: BOOKKEEPER-713 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-713 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Affects Versions: 4.2.2 Reporter: Vinay Assignee: Vinay Attachments: BOOKKEEPER-713.patch The following code in {{Bookie#checkEnvironment(..)}} should store the cookie in ZooKeeper first and then on the local disks for {{newEnv}}: {code}if (newEnv) { if (missedCookieDirs.size() > 0) { LOG.debug("Directories missing cookie file are {}", missedCookieDirs); masterCookie.writeToDirectory(journalDirectory); for (File dir : allLedgerDirs) { masterCookie.writeToDirectory(dir); } } masterCookie.writeToZooKeeper(zk, conf); }{code} Otherwise, if {{masterCookie.writeToZooKeeper(zk, conf);}} fails due to some exception, the bookie cannot start again. -- This message was sent by Atlassian JIRA (v6.1#6144)
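The ordering argument in the comment above can be made concrete with a small decision function. This is a hedged sketch, not BookKeeper's actual `checkEnvironment` logic: `CookieOrderSketch`, `Diagnosis`, and `diagnose` are hypothetical names, and the function assumes cookies are always written to every local directory first and to ZooKeeper last.

```java
/** Hypothetical sketch of the ordering argument; not BookKeeper's actual checkEnvironment logic. */
class CookieOrderSketch {
    enum Diagnosis { NEW_ENV, INIT_INCOMPLETE, HEALTHY, DISK_LOST }

    /**
     * Classifies a bookie at startup, assuming cookies are written to all
     * local directories first and to ZooKeeper last.
     */
    static Diagnosis diagnose(boolean cookieInZk, int dirsWithCookie, int totalDirs) {
        if (!cookieInZk) {
            // The ZooKeeper write never succeeded, so initialization never completed
            // and reformatting the local directories is safe.
            return dirsWithCookie == 0 ? Diagnosis.NEW_ENV : Diagnosis.INIT_INCOMPLETE;
        }
        // The ZooKeeper cookie exists, so every local cookie was written at some point;
        // a missing local cookie can only mean a directory was lost afterwards.
        return dirsWithCookie == totalDirs ? Diagnosis.HEALTHY : Diagnosis.DISK_LOST;
    }

    public static void main(String[] args) {
        // Local writes succeeded but the ZooKeeper write failed.
        System.out.println(diagnose(false, 3, 3));
        // ZooKeeper cookie present, one local directory missing its cookie.
        System.out.println(diagnose(true, 2, 3));
    }
}
```

If the order were reversed (ZooKeeper first), the last branch would no longer be unambiguous: a cookie in ZooKeeper with missing local cookies could mean either a lost disk or a crash during the first initialization, which is the case the comment describes as hard to tell apart.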
[jira] [Commented] (BOOKKEEPER-713) Bookie should store the cookie in zookeeper first
[ https://issues.apache.org/jira/browse/BOOKKEEPER-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841072#comment-13841072 ] Hadoop QA commented on BOOKKEEPER-713: -- Testing JIRA BOOKKEEPER-713 Patch [BOOKKEEPER-713.patch|https://issues.apache.org/jira/secure/attachment/12617347/BOOKKEEPER-713.patch] downloaded at Fri Dec 6 07:22:24 UTC 2013 {color:green}+1 PATCH_APPLIES{color} {color:green}+1 CLEAN{color} {color:red}-1 RAW_PATCH_ANALYSIS{color} .{color:green}+1{color} the patch does not introduce any @author tags .{color:green}+1{color} the patch does not introduce any tabs .{color:green}+1{color} the patch does not introduce any trailing spaces .{color:green}+1{color} the patch does not introduce any line longer than 120 .{color:red}-1{color} the patch does not add/modify any testcase {color:green}+1 RAT{color} .{color:green}+1{color} the patch does not seem to introduce new RAT warnings {color:green}+1 JAVADOC{color} .{color:green}+1{color} the patch does not seem to introduce new Javadoc warnings {color:green}+1 COMPILE{color} .{color:green}+1{color} HEAD compiles .{color:green}+1{color} patch compiles .{color:green}+1{color} the patch does not seem to introduce new javac warnings {color:green}+1 FINDBUGS{color} .{color:green}+1{color} the patch does not seem to introduce new Findbugs warnings {color:green}+1 TESTS{color} .Tests run: 885 {color:green}+1 DISTRO{color} .{color:green}+1{color} distro tarball builds with the patch {color:red}*-1 Overall result, please check the reported -1(s)*{color} The full output of the test-patch run is available at . 
https://builds.apache.org/job/bookkeeper-trunk-precommit-build/547/ Bookie should store the cookie in zookeeper first - Key: BOOKKEEPER-713 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-713 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Affects Versions: 4.2.2 Reporter: Vinay Assignee: Vinay Attachments: BOOKKEEPER-713.patch The following code in {{Bookie#checkEnvironment(..)}} should store the cookie in ZooKeeper first and then on the local disks for {{newEnv}}: {code}if (newEnv) { if (missedCookieDirs.size() > 0) { LOG.debug("Directories missing cookie file are {}", missedCookieDirs); masterCookie.writeToDirectory(journalDirectory); for (File dir : allLedgerDirs) { masterCookie.writeToDirectory(dir); } } masterCookie.writeToZooKeeper(zk, conf); }{code} Otherwise, if {{masterCookie.writeToZooKeeper(zk, conf);}} fails due to some exception, the bookie cannot start again. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (BOOKKEEPER-713) Bookie should store the cookie in zookeeper first
[ https://issues.apache.org/jira/browse/BOOKKEEPER-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841075#comment-13841075 ] Vinay commented on BOOKKEEPER-713: -- bq. since if we failed on writing cookie to zookeeper, we know the bookie failed before initialize, then it is safe to reformat local bookie In this case, we should continue to start and write the cookies to the missing directories rather than failing to start, right? Bookie should store the cookie in zookeeper first - Key: BOOKKEEPER-713 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-713 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Affects Versions: 4.2.2 Reporter: Vinay Assignee: Vinay Attachments: BOOKKEEPER-713.patch The following code in {{Bookie#checkEnvironment(..)}} should store the cookie in ZooKeeper first and then on the local disks for {{newEnv}}: {code}if (newEnv) { if (missedCookieDirs.size() > 0) { LOG.debug("Directories missing cookie file are {}", missedCookieDirs); masterCookie.writeToDirectory(journalDirectory); for (File dir : allLedgerDirs) { masterCookie.writeToDirectory(dir); } } masterCookie.writeToZooKeeper(zk, conf); }{code} Otherwise, if {{masterCookie.writeToZooKeeper(zk, conf);}} fails due to some exception, the bookie cannot start again. -- This message was sent by Atlassian JIRA (v6.1#6144)