[jira] Commented: (ZOOKEEPER-860) Add alternative search-provider to ZK site
[ https://issues.apache.org/jira/browse/ZOOKEEPER-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905821#action_12905821 ]

Alex Baranau commented on ZOOKEEPER-860:

Not sure that I follow why this issue was assigned to me. Is there anything I can do about it? I think I cannot commit the patch and hence resolve the issue...

Add alternative search-provider to ZK site
--
Key: ZOOKEEPER-860
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-860
Project: Zookeeper
Issue Type: Improvement
Components: documentation
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
Attachments: ZOOKEEPER-860.patch

Use the search-hadoop.com service to make search available over ZK sources, MLs, wiki, etc. This was initially proposed on the user mailing list (http://search-hadoop.com/m/sTZ4Y1BVKWg1). The search service was already added to the site's skin (common for all Hadoop related projects) as a part of [AVRO-626|https://issues.apache.org/jira/browse/AVRO-626], so this issue is about enabling it for ZK. The ultimate goal is to use it at all Hadoop sub-projects' sites.
[jira] Commented: (ZOOKEEPER-822) Leader election taking a long time to complete
[ https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905836#action_12905836 ]

Flavio Junqueira commented on ZOOKEEPER-822:

{quote}
1. Blocking connects and accepts: You are right, when the node is down TCP timeouts rule.

a) The first problem is in manager.toSend(). This invokes connectOne(), which does a blocking connect. While testing, I changed the code so that connectOne() starts a new thread called AsyncConnect. AsyncConnect.run() does a socketChannel.connect(). After starting AsyncConnect, connectOne starts a timer. connectOne continues with normal operations if the connection is established before the timer expires; otherwise, when the timer expires it interrupts the AsyncConnect thread and returns. In this way, I can have an upper bound on the amount of time we need to wait for connect to succeed. Of course, this was a quick fix for my testing. Ideally, we should use a Selector to do non-blocking connects/accepts. I am planning to do that later once we at least have a quick fix for the problem and consensus from others for the real fix (this problem is a big blocker for us). Note that it is OK to do blocking IO in the SendWorker and RecvWorker threads since they block on IO to the respective peer.
{quote}

As I commented before, it might be ok to make it asynchronous, especially if we have a way of checking that there is an attempt to establish a connection in progress. I'm also still intrigued about why this is a problem for you. I haven't seen any of this being a problem before, which of course doesn't mean we shouldn't fix it. It would be nice to understand what's special about your setup, or if others have seen similar problems and I missed the reports.

{quote}
b) The blocking IO problem is not just restricted to connectOne(), but also in receiveConnection(). The Listener thread calls receiveConnection() for each incoming connection request. receiveConnection does blocking IO to get the peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the peer that had sent the connection request. All of this is happening from the Listener. In short, if a peer fails after initiating a connection, the Listener thread won't be able to accept connections from other peers, because it would be stuck in read() or connectOne(). Also the code has an inherent cycle. initiateConnection() and receiveConnection() will have to be very carefully synchronized; otherwise, we could run into deadlocks. This code is going to be difficult to maintain/modify.
{quote}

If I remember correctly, we currently synchronize connectOne and make all connection establishments through connectOne so that we make sure that we do one at a time. My understanding is that this should reduce the number of rounds of attempts to establish connections, perhaps at the cost of a longer delay in some runs.

{quote}
2. Buggy senderWorkerMap handling: The code that manages senderWorkerMap is very buggy. It is causing multiple election rounds. While debugging I found that sometimes after FLE a node will have its senderWorkerMap empty even if it has SendWorker and RecvWorker threads for each peer.
{quote}

I don't think that having multiple rounds is bad; in fact, I think it is unavoidable using reasonable timeout values. The second part, however, sounds like a problem we should fix.

{quote}
a) The receiveConnection() method calls the finish() method, which removes an entry from the map. Additionally, the thread itself calls finish(), which could remove the newly added entry from the map. In short, receiveConnection is causing the exact condition that you mentioned above.
{quote}

I thought that we were increasing the intervals between notifications, and if so I believe the case you mention above should not happen more than a few times. Now, to fix it, it sounds like we need to check that the finish call is removing the correct object from senderWorkerMap. That is, obj.finish() should remove obj and do nothing if the SendWorker object in senderWorkerMap is a different one. What do you think?

{quote}
b) Apart from the bug in finish(), receiveConnection is making an entry in senderWorkerMap at the wrong place. Here's the buggy code:

SendWorker vsw = senderWorkerMap.get(sid);
senderWorkerMap.put(sid, sw);
if (vsw != null)
    vsw.finish();

It makes an entry for the new thread and then calls finish, which causes the new thread to be removed from the map. The old thread will also get terminated since finish() will interrupt the thread.
{quote}

See my comment above. Perhaps I should wait to see your proposed modifications, but I wonder if it works to check that we are removing the correct SendWorker object.

{quote}
3. Race condition in receiveConnection and initiateConnection: In theory, two peers can keep disconnecting each other's connection. Example: T0: Peer 0
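A minimal sketch of the guarded removal Flavio proposes — finish() removes only the worker it is called on. It assumes senderWorkerMap is (or becomes) a ConcurrentHashMap keyed by server id; the class below is an illustrative stand-in, not the actual QuorumCnxManager SendWorker:

    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: finish() removes only the SendWorker it is called on, so a
    // stale finish() cannot evict a newer worker registered for the same sid.
    class SendWorkerSketch extends Thread {
        static final ConcurrentHashMap<Long, SendWorkerSketch> senderWorkerMap =
                new ConcurrentHashMap<Long, SendWorkerSketch>();

        final long sid; // id of the peer this worker sends to

        SendWorkerSketch(long sid) {
            this.sid = sid;
        }

        boolean finish() {
            // remove(key, value) succeeds only if the map still holds *this*
            // worker for sid; a replacement worker is left untouched.
            boolean removed = senderWorkerMap.remove(sid, this);
            if (removed) {
                this.interrupt(); // only stop the worker we actually own
            }
            return removed;
        }
    }

With this shape, the buggy sequence in b) becomes harmless: vsw.finish() on the old worker cannot remove the newly registered one.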
Build failed in Hudson: ZooKeeper-trunk #923
See https://hudson.apache.org/hudson/job/ZooKeeper-trunk/923/

--
[...truncated 162872 lines...]
[junit] 2010-09-03 10:51:25,922 [myid:] - INFO [Thread-285:nioservercnxn$statcomm...@645] - Stat command output
[junit] 2010-09-03 10:51:25,923 [myid:] - INFO [Thread-285:nioserverc...@967] - Closed socket connection for client /127.0.0.1:33405 (no session established for client)
[junit] 2010-09-03 10:51:25,923 [myid:] - INFO [main:quorumb...@195] - 127.0.0.1:11236 is accepting client connections
[junit] 2010-09-03 10:51:25,923 [myid:] - INFO [main:clientb...@225] - connecting to 127.0.0.1 11237
[junit] 2010-09-03 10:51:25,923 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11237:nioservercnxnfact...@196] - Accepted socket connection from /127.0.0.1:44608
[junit] 2010-09-03 10:51:25,924 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11237:nioserverc...@791] - Processing stat command from /127.0.0.1:44608
[junit] 2010-09-03 10:51:25,924 [myid:] - INFO [Thread-286:nioservercnxn$statcomm...@645] - Stat command output
[junit] 2010-09-03 10:51:25,925 [myid:] - INFO [Thread-286:nioserverc...@967] - Closed socket connection for client /127.0.0.1:44608 (no session established for client)
[junit] 2010-09-03 10:51:25,925 [myid:] - INFO [main:quorumb...@195] - 127.0.0.1:11237 is accepting client connections
[junit] 2010-09-03 10:51:25,925 [myid:] - INFO [main:clientb...@225] - connecting to 127.0.0.1 11238
[junit] 2010-09-03 10:51:25,925 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11238:nioservercnxnfact...@196] - Accepted socket connection from /127.0.0.1:58661
[junit] 2010-09-03 10:51:25,926 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11238:nioserverc...@791] - Processing stat command from /127.0.0.1:58661
[junit] 2010-09-03 10:51:25,926 [myid:] - INFO [Thread-287:nioservercnxn$statcomm...@645] - Stat command output
[junit] 2010-09-03 10:51:25,927 [myid:] - INFO [Thread-287:nioserverc...@967] - Closed socket connection for client /127.0.0.1:58661 (no session established for client)
[junit] 2010-09-03 10:51:25,927 [myid:] - INFO [main:quorumb...@195] - 127.0.0.1:11238 is accepting client connections
[junit] 2010-09-03 10:51:25,927 [myid:] - INFO [main:clientb...@225] - connecting to 127.0.0.1 11239
[junit] 2010-09-03 10:51:25,928 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioservercnxnfact...@196] - Accepted socket connection from /127.0.0.1:55577
[junit] 2010-09-03 10:51:25,928 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioserverc...@791] - Processing stat command from /127.0.0.1:55577
[junit] 2010-09-03 10:51:25,929 [myid:] - INFO [Thread-288:nioserverc...@967] - Closed socket connection for client /127.0.0.1:55577 (no session established for client)
[junit] 2010-09-03 10:51:26,179 [myid:] - INFO [main:clientb...@225] - connecting to 127.0.0.1 11239
[junit] 2010-09-03 10:51:26,179 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioservercnxnfact...@196] - Accepted socket connection from /127.0.0.1:55578
[junit] 2010-09-03 10:51:26,179 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioserverc...@791] - Processing stat command from /127.0.0.1:55578
[junit] 2010-09-03 10:51:26,180 [myid:] - INFO [Thread-289:nioserverc...@967] - Closed socket connection for client /127.0.0.1:55578 (no session established for client)
[junit] 2010-09-03 10:51:26,430 [myid:] - INFO [main:clientb...@225] - connecting to 127.0.0.1 11239
[junit] 2010-09-03 10:51:26,430 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioservercnxnfact...@196] - Accepted socket connection from /127.0.0.1:55579
[junit] 2010-09-03 10:51:26,431 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioserverc...@791] - Processing stat command from /127.0.0.1:55579
[junit] 2010-09-03 10:51:26,431 [myid:] - INFO [Thread-290:nioserverc...@967] - Closed socket connection for client /127.0.0.1:55579 (no session established for client)
[junit] 2010-09-03 10:51:26,681 [myid:] - INFO [main:clientb...@225] - connecting to 127.0.0.1 11239
[junit] 2010-09-03 10:51:26,682 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioservercnxnfact...@196] - Accepted socket connection from /127.0.0.1:55580
[junit] 2010-09-03 10:51:26,682 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11239:nioserverc...@791] - Processing stat command from /127.0.0.1:55580
[junit] 2010-09-03 10:51:26,682 [myid:] - INFO [Thread-291:nioservercnxn$statcomm...@645] - Stat command output
[junit] 2010-09-03 10:51:26,683 [myid:] - INFO [Thread-291:nioserverc...@967] - Closed socket connection for client /127.0.0.1:55580 (no session established for client)
[junit] JMXEnv.dump() follows
[junit] 2010-09-03 10:51:26,683 [myid:] - INFO
[jira] Created: (ZOOKEEPER-863) Runaway thread - Zookeeper inside Eclipse
Runaway thread - Zookeeper inside Eclipse
--
Key: ZOOKEEPER-863
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-863
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.0
Environment: Linux; x86
Reporter: Stephen McCants
Priority: Critical

I'm running ZooKeeper inside an Eclipse application. When I launch the application from inside Eclipse I use the following arguments:

-Dzoodiscovery.autoStart=true -Dzoodiscovery.flavor=zoodiscovery.flavor.centralized=localhost

This causes the application to start its own ZooKeeper server inside the JVM/application. It immediately goes into a runaway state. The name of the runaway thread is NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181. When I suspend this thread, the CPU usage returns to 0. Here is a stack trace from that thread when it is suspended:

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]
EPollArrayWrapper.poll(long) line: 215
EPollSelectorImpl.doSelect(long) line: 77
EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
EPollSelectorImpl(SelectorImpl).select(long) line: 80
NIOServerCnxn$Factory.run() line: 232

Any ideas what might be going wrong? Thanks.
[jira] Created: (ZOOKEEPER-864) Hedwig C++ client improvements
Hedwig C++ client improvements
--
Key: ZOOKEEPER-864
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-864
Project: Zookeeper
Issue Type: Improvement
Reporter: Ivan Kelly
Assignee: Ivan Kelly
Fix For: 3.4.0
Attachments: ZOOKEEPER-864.diff

I changed the socket code to use Boost asio. Now the client only creates one thread, and all operations are non-blocking. Tests are now automated; just run "make check".
[jira] Updated: (ZOOKEEPER-864) Hedwig C++ client improvements
[ https://issues.apache.org/jira/browse/ZOOKEEPER-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated ZOOKEEPER-864:

Attachment: ZOOKEEPER-864.diff

Hedwig C++ client improvements
--
Key: ZOOKEEPER-864
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-864
Project: Zookeeper
Issue Type: Improvement
Reporter: Ivan Kelly
Assignee: Ivan Kelly
Fix For: 3.4.0
Attachments: ZOOKEEPER-864.diff

I changed the socket code to use Boost asio. Now the client only creates one thread, and all operations are non-blocking. Tests are now automated; just run "make check".
[jira] Updated: (ZOOKEEPER-864) Hedwig C++ client improvements
[ https://issues.apache.org/jira/browse/ZOOKEEPER-864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated ZOOKEEPER-864:

Status: Patch Available (was: Open)

Hedwig C++ client improvements
--
Key: ZOOKEEPER-864
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-864
Project: Zookeeper
Issue Type: Improvement
Reporter: Ivan Kelly
Assignee: Ivan Kelly
Fix For: 3.4.0
Attachments: ZOOKEEPER-864.diff

I changed the socket code to use Boost asio. Now the client only creates one thread, and all operations are non-blocking. Tests are now automated; just run "make check".
[jira] Created: (ZOOKEEPER-865) Runaway thread
Runaway thread
--
Key: ZOOKEEPER-865
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-865
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.1, 3.3.0
Environment: Linux; Java 1.6; x86
Reporter: Stephen McCants
Priority: Critical

I'm starting a standalone ZooKeeper server (v3.3.1). That starts normally and does not have a runaway thread. Next, I start an Eclipse-based application that is using ZK 3.3.0 to register itself with the ZooKeeper server (3.3.1). The Eclipse application passes the following arguments to Eclipse:

-Dzoodiscovery.autoStart=true -Dzoodiscovery.flavor=zoodiscovery.flavor.centralized=smccants.austin.ibm.com

When the Eclipse application starts, the ZK server prints out:

2010-09-03 09:59:46,006 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@250] - Accepted socket connection from /9.53.189.11:42271
2010-09-03 09:59:46,039 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@776] - Client attempting to establish new session at /9.53.189.11:42271
2010-09-03 09:59:46,045 - INFO [SyncThread:0:nioserverc...@1579] - Established session 0x12ad81b9002 with negotiated timeout 4000 for client /9.53.189.11:42271
2010-09-03 09:59:46,046 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioservercnxn$fact...@250] - Accepted socket connection from /9.53.189.11:42272
2010-09-03 09:59:46,078 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:nioserverc...@776] - Client attempting to establish new session at /9.53.189.11:42272
2010-09-03 09:59:46,080 - INFO [SyncThread:0:nioserverc...@1579] - Established session 0x12ad81b9003 with negotiated timeout 4000 for client /9.53.189.11:42272

Then both the Eclipse application and the ZK server go into runaway states and consume 100% of the CPU. Here is a view from top:

PID  USER     PR NI VIRT RES SHR  S %CPU %MEM TIME+   COMMAND
4949 smccants 15 0  597m 78m 5964 S 66.2 1.0  1:03.14 autosubmitter
4876 smccants 17 0  554m 27m 6688 S 30.9 0.3  0:34.74 java

PID 4949 (autosubmitter) is the Eclipse application and is using more than twice the CPU of PID 4876 (java), which is the ZK server. They will continue in this state indefinitely. I can attach a debugger to the Eclipse application, and if I stop the thread named pool-1-thread-2-SendThread(smccants.austin.ibm.com:2181), the runaway condition stops on both the application and the ZK server. However the ZK server reports:

2010-09-03 10:03:38,001 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x12ad81b9003, timeout of 4000ms exceeded
2010-09-03 10:03:38,002 - INFO [ProcessThread:-1:preprequestproces...@208] - Processed session termination for sessionid: 0x12ad81b9003
2010-09-03 10:03:38,005 - INFO [SyncThread:0:nioserverc...@1434] - Closed socket connection for client /9.53.189.11:42272 which had sessionid 0x12ad81b9003

Here is the stack trace from the suspended thread:

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]
EPollArrayWrapper.poll(long) line: 215
EPollSelectorImpl.doSelect(long) line: 77
EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
EPollSelectorImpl(SelectorImpl).select(long) line: 80
ClientCnxn$SendThread.run() line: 1066

Any ideas what might be going wrong? Thanks.
Re: (ZOOKEEPER-844) handle auth failure in java client
I don't see why we couldn't include it. Thanks!

Patrick

On Thu, Sep 2, 2010 at 12:41 PM, Fournier, Camille F. [Tech] camille.fourn...@gs.com wrote:

Hi all, I would like to submit this patch into the 3.3 branch as well, since we are probably going to go into production with 3.3 and I'd rather not do a production release with a patched version of ZK if possible. I added a patch for this fix against the 3.3 branch to this ticket. Any idea of the odds of getting this into the 3.3.2 release?

Thanks,
Camille

-----Original Message-----
From: Giridharan Kesavan (JIRA) [mailto:j...@apache.org]
Sent: Tuesday, August 31, 2010 7:25 PM
To: Fournier, Camille F. [Tech]
Subject: [jira] Updated: (ZOOKEEPER-844) handle auth failure in java client

[ https://issues.apache.org/jira/browse/ZOOKEEPER-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giridharan Kesavan updated ZOOKEEPER-844:

Status: Patch Available (was: Open)

handle auth failure in java client
--
Key: ZOOKEEPER-844
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-844
Project: Zookeeper
Issue Type: Improvement
Components: java client
Affects Versions: 3.3.1
Reporter: Camille Fournier
Assignee: Camille Fournier
Fix For: 3.4.0
Attachments: ZOOKEEPER-844.patch

ClientCnxn.java currently has the following code:

if (replyHdr.getXid() == -4) {
    // -2 is the xid for AuthPacket
    // TODO: process AuthPacket here
    if (LOG.isDebugEnabled()) {
        LOG.debug("Got auth sessionid:0x" + Long.toHexString(sessionId));
    }
    return;
}

Auth failures appear to cause the server to disconnect, but the client never gets a proper state change or notification that auth has failed. This makes handling this scenario very difficult, as it causes the client to go into a loop of sending bad auth, getting disconnected, trying to reconnect, sending bad auth again, over and over.
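The gap the ticket describes is that the -4 branch swallows the reply without telling anyone that auth failed. A minimal sketch of the kind of handling being asked for — illustrative of the approach, not the attached ZOOKEEPER-844 patch; it is a drop-in continuation of the fragment above, where state and eventThread are the existing ClientCnxn fields:

    // Sketch: surface an AuthFailed state change instead of silently returning.
    if (replyHdr.getXid() == -4) {
        // -4 is the xid for AuthPacket
        if (replyHdr.getErr() == KeeperException.Code.AUTHFAILED.intValue()) {
            state = ZooKeeper.States.AUTH_FAILED;
            // Hand watchers a KeeperState.AuthFailed event so callers can
            // stop retrying instead of looping through reconnects.
            eventThread.queueEvent(new WatchedEvent(
                    Watcher.Event.EventType.None,
                    Watcher.Event.KeeperState.AuthFailed, null));
        }
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got auth sessionid:0x" + Long.toHexString(sessionId));
        }
        return;
    }

With that in place, a client's default watcher sees KeeperState.AuthFailed on the first bad-auth reply and can abandon the session deliberately.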
[jira] Updated: (ZOOKEEPER-844) handle auth failure in java client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-844:

Issue Type: Bug (was: Improvement)

This is really a bug, not an improvement.

handle auth failure in java client
--
Key: ZOOKEEPER-844
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-844
Project: Zookeeper
Issue Type: Bug
Components: java client
Affects Versions: 3.3.1
Reporter: Camille Fournier
Assignee: Camille Fournier
Fix For: 3.3.2, 3.4.0
Attachments: ZOOKEEPER-844.patch, ZOOKEEPER332-844

ClientCnxn.java currently has the following code:

if (replyHdr.getXid() == -4) {
    // -2 is the xid for AuthPacket
    // TODO: process AuthPacket here
    if (LOG.isDebugEnabled()) {
        LOG.debug("Got auth sessionid:0x" + Long.toHexString(sessionId));
    }
    return;
}

Auth failures appear to cause the server to disconnect, but the client never gets a proper state change or notification that auth has failed. This makes handling this scenario very difficult, as it causes the client to go into a loop of sending bad auth, getting disconnected, trying to reconnect, sending bad auth again, over and over.
Re: About symbol table of Zookeeper c client
This is a long-standing issue slated for 4.0: https://issues.apache.org/jira/browse/ZOOKEEPER-295

Mahadev had done some work to reduce the exported symbols as part of 3.3; perhaps this slipped through the net? Mahadev - can we address this using the current mechanism?

Patrick

On Thu, Sep 2, 2010 at 7:37 AM, Qian Ye yeqian@gmail.com wrote:

Hi all: I'm writing an application in C which needs to link both memcached's lib and ZooKeeper's C client lib. I found a symbol table conflict, because both libs provide an implementation (recordio.h/c) of the function htonll. It seems that some functions of the ZooKeeper C client, which can be accessed externally but are used internally, have simple names. I think this will cause symbol table conflicts from time to time, and we should do something about it, e.g. add a specific prefix to these functions.

thx

--
With Regards!
Ye, Qian
Re: Problems in FLE implementation
Hi Vishal,

Thanks for picking this up. My comments are inline:

On 9/2/10 3:31 PM, Vishal K vishalm...@gmail.com wrote:

Hi All, I had posted this message as a comment for ZOOKEEPER-822. I thought it might be a good idea to give it wider attention so that it will be easier to collect feedback. I found a few problems in the FLE implementation while debugging for https://issues.apache.org/jira/browse/ZOOKEEPER-822. Following the email below might require some background. If necessary, please browse the JIRA. I have a patch for 1. a) and 2). I will send them out soon.

1. Blocking connects and accepts:

a) The first problem is in manager.toSend(). This invokes connectOne(), which does a blocking connect. While testing, I changed the code so that connectOne() starts a new thread called AsyncConnect. AsyncConnect.run() does a socketChannel.connect(). After starting AsyncConnect, connectOne starts a timer. connectOne continues with normal operations if the connection is established before the timer expires; otherwise, when the timer expires it interrupts the AsyncConnect thread and returns. In this way, I can have an upper bound on the amount of time we need to wait for connect to succeed. Of course, this was a quick fix for my testing. Ideally, we should use a Selector to do non-blocking connects/accepts. I am planning to do that later once we at least have a quick fix for the problem and consensus from others for the real fix (this problem is a big blocker for us). Note that it is OK to do blocking IO in the SendWorker and RecvWorker threads since they block on IO to the respective peer.

Vishal, I am really concerned about starting up new threads in the server. We really need a total revamp of this code (using NIO and a selector). Is the quick fix really required? ZooKeeper servers have been running in production for a while, and this problem hasn't been noticed by anyone. Shouldn't we fix it with NIO then?

b) The blocking IO problem is not just restricted to connectOne(), but also in receiveConnection(). The Listener thread calls receiveConnection() for each incoming connection request. receiveConnection does blocking IO to get the peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the peer that had sent the connection request. All of this is happening from the Listener. In short, if a peer fails after initiating a connection, the Listener thread won't be able to accept connections from other peers, because it would be stuck in read() or connectOne(). Also the code has an inherent cycle. initiateConnection() and receiveConnection() will have to be very carefully synchronized; otherwise, we could run into deadlocks. This code is going to be difficult to maintain/modify.

2. Buggy senderWorkerMap handling: The code that manages senderWorkerMap is very buggy. It is causing multiple election rounds. While debugging I found that sometimes after FLE a node will have its senderWorkerMap empty even if it has SendWorker and RecvWorker threads for each peer.

It would be great to clean it up!! I'd be happy to see this class cleaned up! :)

a) The receiveConnection() method calls the finish() method, which removes an entry from the map. Additionally, the thread itself calls finish(), which could remove the newly added entry from the map. In short, receiveConnection is causing the exact condition that you mentioned above.

b) Apart from the bug in finish(), receiveConnection is making an entry in senderWorkerMap at the wrong place. Here's the buggy code:

SendWorker vsw = senderWorkerMap.get(sid);
senderWorkerMap.put(sid, sw);
if (vsw != null)
    vsw.finish();

It makes an entry for the new thread and then calls finish, which causes the new thread to be removed from the map. The old thread will also get terminated since finish() will interrupt the thread.

3. Race condition in receiveConnection and initiateConnection: *In theory*, two peers can keep disconnecting each other's connection. Example:

T0: Peer 0 initiates a connection (request 1)
T1: Peer 1 receives connection from peer 0
T2: Peer 1 calls receiveConnection()
T2: Peer 0 closes connection to Peer 1 because its ID is lower.
T3: Peer 0 re-initiates connection to Peer 1 from manager.toSend() (request 2)
T3: Peer 1 terminates older connection to peer 0
T4: Peer 1 calls connectOne() which starts new SendWorker threads for peer 0
T5: Peer 1 kills connection created in T3 because it receives another (request 2) connect request from 0

The problem here is that while Peer 0 is accepting a connection from Peer 1 it can also be initiating a connection to Peer 1. So if they hit the right frequencies they could sit in a connect/disconnect loop and cause multiple rounds of leader election. I think
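A minimal sketch of the id-based tie-break that addresses the race in item 3: between any pair of peers, only the connection initiated by the higher-id server survives, so the connect/disconnect loop cannot oscillate. The class shape and method names below are illustrative, not QuorumCnxManager's actual code:

    import java.io.IOException;
    import java.nio.channels.SocketChannel;

    // Sketch: strict higher-id-wins rule for duelling connections.
    class TieBreakSketch {
        final long selfId;

        TieBreakSketch(long selfId) {
            this.selfId = selfId;
        }

        void receiveConnection(long remoteSid, SocketChannel channel) throws IOException {
            if (remoteSid < selfId) {
                // We outrank the initiator: drop its connection and open our
                // own, which the peer must accept (our id is higher).
                channel.close();
                connectOne(remoteSid);
            } else {
                // The initiator outranks us: keep this connection.
                startWorkers(remoteSid, channel);
            }
        }

        void connectOne(long sid) { /* initiate an outgoing connection (elided) */ }

        void startWorkers(long sid, SocketChannel channel) { /* start send/recv workers (elided) */ }
    }

With a deterministic winner per pair, the T0-T5 sequence above converges: exactly one of the two competing connections is ever kept.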
[jira] Commented: (ZOOKEEPER-863) Runaway thread - Zookeeper inside Eclipse
[ https://issues.apache.org/jira/browse/ZOOKEEPER-863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906014#action_12906014 ]

Stephen McCants commented on ZOOKEEPER-863:

Removing the registered service after ZK had stopped running away causes ZK to return to using 100% of the CPU.

Runaway thread - Zookeeper inside Eclipse
--
Key: ZOOKEEPER-863
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-863
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.0
Environment: Linux; x86
Reporter: Stephen McCants
Priority: Critical

I'm running ZooKeeper inside an Eclipse application. When I launch the application from inside Eclipse I use the following arguments:

-Dzoodiscovery.autoStart=true -Dzoodiscovery.flavor=zoodiscovery.flavor.centralized=localhost

This causes the application to start its own ZooKeeper server inside the JVM/application. It immediately goes into a runaway state. The name of the runaway thread is NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181. When I suspend this thread, the CPU usage returns to 0. Here is a stack trace from that thread when it is suspended:

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]
EPollArrayWrapper.poll(long) line: 215
EPollSelectorImpl.doSelect(long) line: 77
EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
EPollSelectorImpl(SelectorImpl).select(long) line: 80
NIOServerCnxn$Factory.run() line: 232

Any ideas what might be going wrong? Thanks.
Re: Problems in FLE implementation
Hi Mahadev,

To be honest, yes, we need the quick fix. I am really surprised that no one else is seeing this problem. There is nothing special about our setup. If you look at the JIRA, I have posted logs from various setups (different OS, using physical machines, using virtual machines, etc). Also, the bug is evident from the code. Pretty much every developer on our team has hit this bug.

Now, we have an application that is highly time-sensitive. Maybe most of the applications that ZK is running on today can tolerate 60-80 seconds of FLE convergence. For us, such long delays (under normal conditions) are not acceptable.

It would be nice if people could provide some feedback on how time-sensitive their applications are. Is a 60-80 second delay in FLE acceptable? What has been your experience with running ZK in production? How often do you have leader reboots? Feedback will be greatly appreciated.

Thanks.
-Vishal

On Fri, Sep 3, 2010 at 1:44 PM, Mahadev Konar maha...@yahoo-inc.com wrote:

Hi Vishal, Thanks for picking this up. My comments are inline:

On 9/2/10 3:31 PM, Vishal K vishalm...@gmail.com wrote:

Hi All, I had posted this message as a comment for ZOOKEEPER-822. I thought it might be a good idea to give it wider attention so that it will be easier to collect feedback. I found a few problems in the FLE implementation while debugging for: https://issues.apache.org/jira/browse/ZOOKEEPER-822. Following the email below might require some background. If necessary, please browse the JIRA. I have a patch for 1. a) and 2). I will send them out soon.

1. Blocking connects and accepts:

a) The first problem is in manager.toSend(). This invokes connectOne(), which does a blocking connect. While testing, I changed the code so that connectOne() starts a new thread called AsyncConnect. AsyncConnect.run() does a socketChannel.connect(). After starting AsyncConnect, connectOne starts a timer. connectOne continues with normal operations if the connection is established before the timer expires; otherwise, when the timer expires it interrupts the AsyncConnect thread and returns. In this way, I can have an upper bound on the amount of time we need to wait for connect to succeed. Of course, this was a quick fix for my testing. Ideally, we should use a Selector to do non-blocking connects/accepts. I am planning to do that later once we at least have a quick fix for the problem and consensus from others for the real fix (this problem is a big blocker for us). Note that it is OK to do blocking IO in the SendWorker and RecvWorker threads since they block on IO to the respective peer.

Vishal, I am really concerned about starting up new threads in the server. We really need a total revamp of this code (using NIO and a selector). Is the quick fix really required? ZooKeeper servers have been running in production for a while, and this problem hasn't been noticed by anyone. Shouldn't we fix it with NIO then?

b) The blocking IO problem is not just restricted to connectOne(), but also in receiveConnection(). The Listener thread calls receiveConnection() for each incoming connection request. receiveConnection does blocking IO to get the peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the peer that had sent the connection request. All of this is happening from the Listener. In short, if a peer fails after initiating a connection, the Listener thread won't be able to accept connections from other peers, because it would be stuck in read() or connectOne(). Also the code has an inherent cycle. initiateConnection() and receiveConnection() will have to be very carefully synchronized; otherwise, we could run into deadlocks. This code is going to be difficult to maintain/modify.

2. Buggy senderWorkerMap handling: The code that manages senderWorkerMap is very buggy. It is causing multiple election rounds. While debugging I found that sometimes after FLE a node will have its senderWorkerMap empty even if it has SendWorker and RecvWorker threads for each peer.

It would be great to clean it up!! I'd be happy to see this class be cleaned up! :)

a) The receiveConnection() method calls the finish() method, which removes an entry from the map. Additionally, the thread itself calls finish(), which could remove the newly added entry from the map. In short, receiveConnection is causing the exact condition that you mentioned above.

b) Apart from the bug in finish(), receiveConnection is making an entry in senderWorkerMap at the wrong place. Here's the buggy code:

SendWorker vsw = senderWorkerMap.get(sid);
senderWorkerMap.put(sid, sw);
if (vsw != null)
    vsw.finish();

It makes an entry for the new thread and then calls finish, which causes the new thread to be removed from the map. The old thread will also get terminated since finish() will interrupt the thread.

3. Race condition in
[jira] Updated: (ZOOKEEPER-863) Runaway thread - Zookeeper inside Eclipse
[ https://issues.apache.org/jira/browse/ZOOKEEPER-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen McCants updated ZOOKEEPER-863:

Attachment: zookeeper.log

Runaway thread - Zookeeper inside Eclipse
--
Key: ZOOKEEPER-863
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-863
Project: Zookeeper
Issue Type: Bug
Affects Versions: 3.3.0
Environment: Linux; x86
Reporter: Stephen McCants
Priority: Critical
Attachments: zookeeper.log

I'm running ZooKeeper inside an Eclipse application. When I launch the application from inside Eclipse I use the following arguments:

-Dzoodiscovery.autoStart=true -Dzoodiscovery.flavor=zoodiscovery.flavor.centralized=localhost

This causes the application to start its own ZooKeeper server inside the JVM/application. It immediately goes into a runaway state. The name of the runaway thread is NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181. When I suspend this thread, the CPU usage returns to 0. Here is a stack trace from that thread when it is suspended:

EPollArrayWrapper.epollWait(long, int, long, int) line: not available [native method]
EPollArrayWrapper.poll(long) line: 215
EPollSelectorImpl.doSelect(long) line: 77
EPollSelectorImpl(SelectorImpl).lockAndDoSelect(long) line: 69
EPollSelectorImpl(SelectorImpl).select(long) line: 80
NIOServerCnxn$Factory.run() line: 232

Any ideas what might be going wrong? Thanks.
Re: Problems in FLE implementation
Hi Flavio,

On Fri, Sep 3, 2010 at 3:02 PM, Flavio Junqueira f...@yahoo-inc.com wrote:

Vishal, 60-80 seconds is definitely high, and I would expect people to complain if they were observing such an amount of time to recover. I personally haven't seen any such case. Can you describe how you were trying to reproduce the bug?

On physical machines, it took me 15 retries (reboot -n leader) to reproduce the problem. On VMs it is a lot more frequent.

On my end, you have good points, but I'm not entirely convinced that we need changes as you're proposing them. Seeing a patch would definitely help to determine. If you can't provide a patch due to a legal issue, we should work on one or more to fix at least some of the issues you observed.

You are right, my fixes may not be the best approach. My intention was to have a quick fix for our internal use and then start off a discussion for the real fix. I will send out the diff soon. I also agree that it would be nice to have the numbers you are requesting.

I would love to see

Thanks, -Flavio

Thanks.
-Vishal

On Sep 3, 2010, at 8:51 PM, Vishal K wrote:

Hi Mahadev, To be honest, yes, we need the quick fix. I am really surprised that no one else is seeing this problem. There is nothing special about our setup. If you look at the JIRA, I have posted logs from various setups (different OS, using physical machines, using virtual machines, etc). Also, the bug is evident from the code. Pretty much every developer on our team has hit this bug. Now, we have an application that is highly time-sensitive. Maybe most of the applications that ZK is running on today can tolerate 60-80 seconds of FLE convergence. For us, such long delays (under normal conditions) are not acceptable. It would be nice if people could provide some feedback on how time-sensitive their applications are. Is a 60-80 second delay in FLE acceptable? What has been your experience with running ZK in production? How often do you have leader reboots? Feedback will be greatly appreciated. Thanks. -Vishal

On Fri, Sep 3, 2010 at 1:44 PM, Mahadev Konar maha...@yahoo-inc.com wrote:

Hi Vishal, Thanks for picking this up. My comments are inline:

On 9/2/10 3:31 PM, Vishal K vishalm...@gmail.com wrote:

Hi All, I had posted this message as a comment for ZOOKEEPER-822. I thought it might be a good idea to give it wider attention so that it will be easier to collect feedback. I found a few problems in the FLE implementation while debugging for: https://issues.apache.org/jira/browse/ZOOKEEPER-822. Following the email below might require some background. If necessary, please browse the JIRA. I have a patch for 1. a) and 2). I will send them out soon.

1. Blocking connects and accepts:

a) The first problem is in manager.toSend(). This invokes connectOne(), which does a blocking connect. While testing, I changed the code so that connectOne() starts a new thread called AsyncConnect. AsyncConnect.run() does a socketChannel.connect(). After starting AsyncConnect, connectOne starts a timer. connectOne continues with normal operations if the connection is established before the timer expires; otherwise, when the timer expires it interrupts the AsyncConnect thread and returns. In this way, I can have an upper bound on the amount of time we need to wait for connect to succeed. Of course, this was a quick fix for my testing. Ideally, we should use a Selector to do non-blocking connects/accepts. I am planning to do that later once we at least have a quick fix for the problem and consensus from others for the real fix (this problem is a big blocker for us). Note that it is OK to do blocking IO in the SendWorker and RecvWorker threads since they block on IO to the respective peer.

Vishal, I am really concerned about starting up new threads in the server. We really need a total revamp of this code (using NIO and a selector). Is the quick fix really required? ZooKeeper servers have been running in production for a while, and this problem hasn't been noticed by anyone. Shouldn't we fix it with NIO then?

b) The blocking IO problem is not just restricted to connectOne(), but also in receiveConnection(). The Listener thread calls receiveConnection() for each incoming connection request. receiveConnection does blocking IO to get the peer's info (s.read(msgBuffer)). Worse, it invokes connectOne() back to the peer that had sent the connection request. All of this is happening from the Listener. In short, if a peer fails after initiating a connection, the Listener thread won't be able to accept connections from other peers, because it would be stuck in read() or connectOne(). Also the code has an inherent cycle. initiateConnection() and receiveConnection() will have to be very carefully synchronized; otherwise, we could run into deadlocks. This code is going to be difficult to maintain/modify.

2.
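For reference, the upper bound Vishal wants on connectOne() can also be achieved without an extra thread, since java.net.Socket accepts a connect timeout directly. A minimal sketch under that assumption — the method shape and the timeout constant are illustrative, not the actual QuorumCnxManager code:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    // Sketch: bound the blocking connect directly instead of spawning an
    // AsyncConnect thread per attempt.
    class BoundedConnectSketch {
        static final int CONNECT_TIMEOUT_MS = 5000; // illustrative value

        static Socket connectOne(InetSocketAddress electionAddr) throws IOException {
            Socket sock = new Socket();
            sock.setTcpNoDelay(true);
            // Fails with SocketTimeoutException after CONNECT_TIMEOUT_MS
            // rather than hanging for the full OS-level TCP timeout.
            sock.connect(electionAddr, CONNECT_TIMEOUT_MS);
            return sock;
        }
    }

This keeps the Listener's worst-case stall bounded while leaving the larger Selector-based rewrite as a separate effort.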
[jira] Commented: (ZOOKEEPER-864) Hedwig C++ client improvements
[ https://issues.apache.org/jira/browse/ZOOKEEPER-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906077#action_12906077 ]

Michi Mutsuzaki commented on ZOOKEEPER-864:

Thanks for the patch, Ivan! What do we need to do before we can check in this patch?

--Michi

Hedwig C++ client improvements
--
Key: ZOOKEEPER-864
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-864
Project: Zookeeper
Issue Type: Improvement
Reporter: Ivan Kelly
Assignee: Ivan Kelly
Fix For: 3.4.0
Attachments: ZOOKEEPER-864.diff

I changed the socket code to use Boost asio. Now the client only creates one thread, and all operations are non-blocking. Tests are now automated; just run "make check".
[jira] Commented: (ZOOKEEPER-864) Hedwig C++ client improvements
[ https://issues.apache.org/jira/browse/ZOOKEEPER-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906102#action_12906102 ]

Mahadev konar commented on ZOOKEEPER-864:

Michi, to answer your question: all we need is a careful review.

Hedwig C++ client improvements
--
Key: ZOOKEEPER-864
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-864
Project: Zookeeper
Issue Type: Improvement
Reporter: Ivan Kelly
Assignee: Ivan Kelly
Fix For: 3.4.0
Attachments: ZOOKEEPER-864.diff

I changed the socket code to use Boost asio. Now the client only creates one thread, and all operations are non-blocking. Tests are now automated; just run "make check".
[jira] Updated: (ZOOKEEPER-822) Leader election taking a long time to complete
[ https://issues.apache.org/jira/browse/ZOOKEEPER-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-822:

Assignee: Vishal K
Fix Version/s: 3.3.2, 3.4.0

Marking this for 3.3.2, to see if we want this included in 3.3.2.

Leader election taking a long time to complete
--
Key: ZOOKEEPER-822
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-822
Project: Zookeeper
Issue Type: Bug
Components: quorum
Affects Versions: 3.3.0
Reporter: Vishal K
Assignee: Vishal K
Priority: Blocker
Fix For: 3.3.2, 3.4.0
Attachments: 822.tar.gz, rhel.tar.gz, test_zookeeper_1.log, test_zookeeper_2.log, zk_leader_election.tar.gz, zookeeper-3.4.0.tar.gz

Created a 3 node cluster.
1. Fail the ZK leader.
2. Let leader election finish. Restart the leader and let it join the cluster.
3. Repeat.

After a few rounds, leader election takes anywhere from 25-60 seconds to finish. Note: we didn't have any ZK clients and no new znodes were created.

zoo.cfg is shown below:

#Mon Jul 19 12:15:10 UTC 2010
server.1=192.168.4.12\:2888\:3888
server.0=192.168.4.11\:2888\:3888
clientPort=2181
dataDir=/var/zookeeper
syncLimit=2
server.2=192.168.4.13\:2888\:3888
initLimit=5
tickTime=2000

I have attached logs from two nodes that took a long time to form the cluster after failing the leader. The leader was down anyway, so logs from that node shouldn't matter. Look for "START HERE". Logs after that point should be of interest.
[jira] Commented: (ZOOKEEPER-860) Add alternative search-provider to ZK site
[ https://issues.apache.org/jira/browse/ZOOKEEPER-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906223#action_12906223 ]

Mahadev konar commented on ZOOKEEPER-860:

Alex, the assignment just means that you are working on the patch currently. A committer will review and provide you feedback, or commit it if deemed fit for the project. Hope that helps.

Add alternative search-provider to ZK site
--
Key: ZOOKEEPER-860
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-860
Project: Zookeeper
Issue Type: Improvement
Components: documentation
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
Fix For: 3.4.0
Attachments: ZOOKEEPER-860.patch

Use the search-hadoop.com service to make search available over ZK sources, MLs, wiki, etc. This was initially proposed on the user mailing list (http://search-hadoop.com/m/sTZ4Y1BVKWg1). The search service was already added to the site's skin (common for all Hadoop related projects) as a part of [AVRO-626|https://issues.apache.org/jira/browse/AVRO-626], so this issue is about enabling it for ZK. The ultimate goal is to use it at all Hadoop sub-projects' sites.
[jira] Updated: (ZOOKEEPER-860) Add alternative search-provider to ZK site
[ https://issues.apache.org/jira/browse/ZOOKEEPER-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-860:

Fix Version/s: 3.4.0

Marking it for 3.4 to keep track of it.

Add alternative search-provider to ZK site
--
Key: ZOOKEEPER-860
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-860
Project: Zookeeper
Issue Type: Improvement
Components: documentation
Reporter: Alex Baranau
Assignee: Alex Baranau
Priority: Minor
Fix For: 3.4.0
Attachments: ZOOKEEPER-860.patch

Use the search-hadoop.com service to make search available over ZK sources, MLs, wiki, etc. This was initially proposed on the user mailing list (http://search-hadoop.com/m/sTZ4Y1BVKWg1). The search service was already added to the site's skin (common for all Hadoop related projects) as a part of [AVRO-626|https://issues.apache.org/jira/browse/AVRO-626], so this issue is about enabling it for ZK. The ultimate goal is to use it at all Hadoop sub-projects' sites.
Re: race condition in InvalidSnapShotTest on client close
Hi Thomas,

Sorry for my late response. Please open a jira regarding this. Is this fixed in your netty patch for the client?

Thanks
mahadev

On 9/1/10 9:09 AM, Thomas Koch tho...@koch.ro wrote:

Hi, I believe that I've found a race condition in org.apache.zookeeper.server.InvalidSnapshotTest. In this test the server is closed before the client. The client, on close(), submits as its last packet one of type ZooDefs.OpCode.closeSession and waits for this packet to be finished. However, nobody is there to wake the thread from packet.wait(). The sendThread will, on cleanup, call packet.notifyAll() in finishPacket. The race condition is: if an exception occurs in the sendThread, closing is already true, so the sendThread breaks out of its loop, calls cleanup and finishes. If this happens before the main thread calls packet.wait(), then there's nobody left to wake the main thread.

Regards,
Thomas Koch, http://www.koch.ro
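The lost-wakeup race Thomas describes has a standard remedy: re-check the condition under the same lock before waiting, so a notifyAll() that has already happened cannot be missed. A minimal self-contained sketch — the Packet class below is a stand-in for illustration, not ZooKeeper's:

    // Stand-in Packet with a guarded wait: if finish() runs before close()
    // reaches waitForFinish(), the 'finished' flag prevents a lost wakeup.
    class PacketSketch {
        private boolean finished = false;

        synchronized void finish() {
            finished = true;
            notifyAll();
        }

        synchronized void waitForFinish(long timeoutMs) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (!finished) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // give up rather than wait forever
                }
                wait(remaining);
            }
        }
    }

Because both methods synchronize on the packet and the waiter loops on the flag, the ordering of sendThread cleanup versus the main thread's wait no longer matters.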