[jira] [Commented] (BOOKKEEPER-173) Uncontrolled number of threads in bookkeeper
[ https://issues.apache.org/jira/browse/BOOKKEEPER-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13258385#comment-13258385 ] Flavio Junqueira commented on BOOKKEEPER-173: - +1 as well. Thanks, Sijie. Uncontrolled number of threads in bookkeeper Key: BOOKKEEPER-173 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-173 Project: Bookkeeper Issue Type: Bug Reporter: Philipp Sushkin Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BK-173.patch, BK-173.patch_v2, BOOKKEEPER-173.v3.patch I am not sure if it is a bug or not. Say I have a PC with 256 cores, and there is the following code in bookkeeper: {code:title=BookKeeper.java|borderStyle=solid} OrderedSafeExecutor callbackWorker = new OrderedSafeExecutor(Runtime.getRuntime().availableProcessors()); OrderedSafeExecutor mainWorkerPool = new OrderedSafeExecutor(Runtime.getRuntime().availableProcessors()); {code} As I understand it, callbackWorker is not used at all, so it could be removed. It could also be useful to get more control over mainWorkerPool (say, extract an interface and pass an instance through the constructor). Maybe there are other places in the library where thread pools are created without the ability to reuse existing thread pools in the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
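The reporter's suggestion (extract an interface and pass the pool instance through the constructor) could be sketched roughly as below. This is a hypothetical illustration, not BookKeeper API: the class name and methods are made up, and a plain ExecutorService stands in for OrderedSafeExecutor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch (not BookKeeper code): a client that accepts a
// caller-supplied worker pool instead of creating its own, so an
// application can share one pool across libraries and control its size.
public class InjectableClient {
    private final ExecutorService workerPool;

    // The application controls the pool size and can reuse an existing pool.
    public InjectableClient(ExecutorService workerPool) {
        this.workerPool = workerPool;
    }

    // Convenience constructor preserving the old default sizing behaviour.
    public InjectableClient() {
        this(Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors()));
    }

    public ExecutorService getWorkerPool() {
        return workerPool;
    }
}
```

With this shape, a host with 256 cores no longer forces a 256-thread pool per client: the application passes whatever pool it already has.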
[jira] [Commented] (ZOOKEEPER-1411) Consolidate membership management and add client port information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257367#comment-13257367 ] Flavio Junqueira commented on ZOOKEEPER-1411: - I have added some comments to the review board. The review is not associated with this jira, so it doesn't show up here. It looks mostly good to me, though. Consolidate membership management and add client port information - Key: ZOOKEEPER-1411 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1411 Project: ZooKeeper Issue Type: Sub-task Components: server Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1411-ver1.patch, ZOOKEEPER-1411-ver2.patch, ZOOKEEPER-1411-ver3.patch, ZOOKEEPER-1411-ver4.patch, ZOOKEEPER-1411-ver5.patch, ZOOKEEPER-1411-ver6.patch, ZOOKEEPER-1411-ver7.patch, ZOOKEEPER-1411-ver8.patch, ZOOKEEPER-1411-ver9.patch Currently every server has a different configuration file. With this patch, we will have all cluster membership definitions in a single file, and every server can have a copy of this file.
[jira] [Commented] (ZOOKEEPER-1411) Consolidate membership management, distinguish between static and dynamic configuration parameters
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257642#comment-13257642 ] Flavio Junqueira commented on ZOOKEEPER-1411: - I just have one small comment about public methods. Would you mind fixing it, Alex? Other than that, looks good to me. Consolidate membership management, distinguish between static and dynamic configuration parameters -- Key: ZOOKEEPER-1411 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1411 Project: ZooKeeper Issue Type: Sub-task Components: server Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1411-ver1.patch, ZOOKEEPER-1411-ver10.patch, ZOOKEEPER-1411-ver2.patch, ZOOKEEPER-1411-ver3.patch, ZOOKEEPER-1411-ver4.patch, ZOOKEEPER-1411-ver5.patch, ZOOKEEPER-1411-ver6.patch, ZOOKEEPER-1411-ver7.patch, ZOOKEEPER-1411-ver8.patch, ZOOKEEPER-1411-ver9.patch Currently every server has a different static configuration file. This patch distinguishes between dynamic parameters, which are now in a separate dynamic configuration file, and static parameters which are in the usual file. The config file points to the dynamic config file by specifying dynamicConfigFile= In the first stage (this patch), all cluster membership definitions are in the dynamic config file, but in the future additional parameters may be moved to the dynamic file. Backward compatibility makes sure that you can still use a single config file if you'd like. Only when the config is changed (once ZK-107 is in) a dynamic file is automatically created and the necessary parameters are moved to it. This patch also moves all membership parsing and management into the QuorumVerifier classes, and removes QuorumPeer.quorumPeers. The cluster membership is contained in QuorumPeer.quorumVerifier. QuorumVerifier was expanded and now has methods such as getAllMembers(), getVotingMembers(), getObservingMembers(). 
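As a purely illustrative sketch of the split described above: the static file points at the dynamic file via dynamicConfigFile, and the dynamic file holds the membership. The file names, paths, and server lines below are made up for illustration, not taken from the patch.

```properties
# zoo.cfg (static parameters, edited by the operator)
dataDir=/var/zookeeper
clientPort=2181
dynamicConfigFile=/etc/zookeeper/zoo.cfg.dynamic

# zoo.cfg.dynamic (cluster membership, identical copy on every server)
server.1=host1:2888:3888:participant
server.2=host2:2888:3888:participant
server.3=host3:2888:3888:observer
```

The same pair of files can be distributed to every server, which is what removes the need for per-server configuration files.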
[jira] [Commented] (BOOKKEEPER-215) Deadlock occurs under high load
[ https://issues.apache.org/jira/browse/BOOKKEEPER-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256601#comment-13256601 ] Flavio Junqueira commented on BOOKKEEPER-215: - Hi Sijie, Here are some comments on the patch: * One clarification. I understand the call to lh.bk.callbackWorker.submitOrdered in readComplete, but not the one in doRecoveryRead. Why do we need to give it to a worker thread in this case? * It is not performance critical in this case, but it sounds like a good idea in general to have LOG.debug statements wrapped with isDebugEnabled() (LedgerRecoveryOp:86). You may have simply missed this one. * Is this change gratuitous or really necessary: {noformat} protected Bookie newBookie(ServerConfiguration conf) throws IOException, KeeperException, InterruptedException, BookieException { return new Bookie(conf); } {noformat} * testRecoveryDeadlockWithLimitedPermits() has no assertion or fail clause. What is it testing? * I'm not entirely sure why we need this method: {noformat} /** * Add configuration object. * * @param conf configuration object */ public void addConf(Configuration otherConf) throws ConfigurationException { conf.addConfiguration(otherConf); } {noformat} Why can't we set the bk client configuration in the constructor? * Typo: ... so a scan request need to scan over two ledger - ... so a scan request need to scan over two ledgers * In TestDeadlock, if I understand the test correctly, consumeQueue.take() is supposed to hang due to the bug of this jira. Consequently, we have to wait until junit times out the test? I was wondering if there is a way of avoiding the time out. * Suggestion for rephrasing comment: {noformat} // it obtains the permit and wait for a response, // but the response is delayed and readEntries is called // in the readComplete callback to read entries of the // same ledger. since there is no permit, it blocks. 
{noformat} Deadlock occurs under high load --- Key: BOOKKEEPER-215 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-215 Project: Bookkeeper Issue Type: Bug Components: hedwig-server Affects Versions: 4.1.0 Reporter: Aniruddha Priority: Critical Fix For: 4.1.0 Attachments: BK-215.patch, hedwig_ts.log LedgerHandle uses a Semaphore(opCounterSem) with a default value of 5000 permits to implement throttling for outstanding requests. This is causing a deadlock under high load. What I've observed is the following - There are a fixed number of threads created by OrderedSafeExecutor(mainWorkerPool in BookKeeper) and this is used to execute operations by PerChannelBookieClient. Under high load, the bookies are not able to satisfy requests at the rate at which they are being generated. This exhausts all permits in the Semaphore and any further operations block on lh.opCounterSem.acquire(). In this scenario, if the connection to the bookies is shut down, channelDisconnected in PerChannelBookieClient tries to error out all outstanding entries. The errorOutReadKey and errorOutAddKey functions enqueue these operations in the same mainWorkerPool, all threads in which are blocked on acquire. So, handleBookieFailure is never executed and the server stops responding. Blocking operations in a fixed size thread pool doesn't sound quite right. Temporarily, I fixed this by having another ExecutorService for every PerChannelBookieClient and queuing the operations from the errorOut* functions in it, but this is just a quick fix. I feel that the server shouldn't rely on LedgerHandle to throttle connections, but do this itself. Any other ideas on how to fix this? I'd be happy to contribute a patch.
[jira] [Commented] (BOOKKEEPER-173) Uncontrolled number of threads in bookkeeper
[ https://issues.apache.org/jira/browse/BOOKKEEPER-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13256626#comment-13256626 ] Flavio Junqueira commented on BOOKKEEPER-173: - I have one quick clarification and one request. If setNumWorkerThreads is called after the BookKeeper object is constructed, then it has no effect on the number of threads in the pool. Should we enforce somehow that the value doesn't change after the BookKeeper object is constructed? Otherwise the semantics could be confusing. Also, it would be good to add a description of these options to the documentation. Uncontrolled number of threads in bookkeeper Key: BOOKKEEPER-173 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-173 Project: Bookkeeper Issue Type: Bug Reporter: Philipp Sushkin Fix For: 4.1.0 Attachments: BK-173.patch I am not sure if it is a bug or not. Say I have a PC with 256 cores, and there is the following code in bookkeeper: {code:title=BookKeeper.java|borderStyle=solid} OrderedSafeExecutor callbackWorker = new OrderedSafeExecutor(Runtime.getRuntime().availableProcessors()); OrderedSafeExecutor mainWorkerPool = new OrderedSafeExecutor(Runtime.getRuntime().availableProcessors()); {code} As I understand it, callbackWorker is not used at all, so it could be removed. It could also be useful to get more control over mainWorkerPool (say, extract an interface and pass an instance through the constructor). Maybe there are other places in the library where thread pools are created without the ability to reuse existing thread pools in the application.
[jira] [Commented] (BOOKKEEPER-213) PerChannelBookieClient calls the wrong errorOut function when encountering an exception
[ https://issues.apache.org/jira/browse/BOOKKEEPER-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255979#comment-13255979 ] Flavio Junqueira commented on BOOKKEEPER-213: - Aniruddha, I was wondering if you're still planning on submitting a patch for this issue. PerChannelBookieClient calls the wrong errorOut function when encountering an exception --- Key: BOOKKEEPER-213 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-213 Project: Bookkeeper Issue Type: Bug Reporter: Aniruddha Priority: Minor In PerChannelBookieClient.java, addEntry calls errorOutReadKey on encountering an exception instead of errorOutAddKey.
[jira] [Commented] (BOOKKEEPER-215) Deadlock occurs under high load
[ https://issues.apache.org/jira/browse/BOOKKEEPER-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252238#comment-13252238 ] Flavio Junqueira commented on BOOKKEEPER-215: - I don't think it is good practice to have calls to bookkeeper itself from a callback. The callback will be executed by a bookkeeper thread, so you really want to give control back to the application shortly. My preference is to leave the bookkeeper client as is. Deadlock occurs under high load --- Key: BOOKKEEPER-215 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-215 Project: Bookkeeper Issue Type: Bug Components: hedwig-server Affects Versions: 4.1.0 Reporter: Aniruddha Priority: Critical Attachments: hedwig_ts.log LedgerHandle uses a Semaphore(opCounterSem) with a default value of 5000 permits to implement throttling for outstanding requests. This is causing a deadlock under high load. What I've observed is the following - There are a fixed number of threads created by OrderedSafeExecutor(mainWorkerPool in BookKeeper) and this is used to execute operations by PerChannelBookieClient. Under high load, the bookies are not able to satisfy requests at the rate at which they are being generated. This exhausts all permits in the Semaphore and any further operations block on lh.opCounterSem.acquire(). In this scenario, if the connection to the bookies is shut down, channelDisconnected in PerChannelBookieClient tries to error out all outstanding entries. The errorOutReadKey and errorOutAddKey functions enqueue these operations in the same mainWorkerPool, all threads in which are blocked on acquire. So, handleBookieFailure is never executed and the server stops responding. Blocking operations in a fixed size thread pool doesn't sound quite right. Temporarily, I fixed this by having another ExecutorService for every PerChannelBookieClient and queuing the operations from the errorOut* functions in it, but this is just a quick fix. 
I feel that the server shouldn't rely on LedgerHandle to throttle connections, but do this itself. Any other ideas on how to fix this? I'd be happy to contribute a patch.
[jira] [Commented] (BOOKKEEPER-215) Deadlock occurs under high load
[ https://issues.apache.org/jira/browse/BOOKKEEPER-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252958#comment-13252958 ] Flavio Junqueira commented on BOOKKEEPER-215: - This is the code for asyncReadEntries:
{noformat}
public void asyncReadEntries(long firstEntry, long lastEntry, ReadCallback cb, Object ctx) {
    // Little sanity check
    if (firstEntry < 0 || lastEntry > lastAddConfirmed || firstEntry > lastEntry) {
        cb.readComplete(BKException.Code.ReadException, this, null, ctx);
        return;
    }
    try {
        new PendingReadOp(this, firstEntry, lastEntry, cb, ctx).initiate();
    } catch (InterruptedException e) {
        cb.readComplete(BKException.Code.InterruptedException, this, null, ctx);
    }
}
{noformat}
initiate() is called from the application thread, right? Deadlock occurs under high load --- Key: BOOKKEEPER-215 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-215 Project: Bookkeeper Issue Type: Bug Components: hedwig-server Affects Versions: 4.1.0 Reporter: Aniruddha Priority: Critical Fix For: 4.1.0 Attachments: BK-215.patch, hedwig_ts.log LedgerHandle uses a Semaphore(opCounterSem) with a default value of 5000 permits to implement throttling for outstanding requests. This is causing a deadlock under high load. What I've observed is the following - There are a fixed number of threads created by OrderedSafeExecutor(mainWorkerPool in BookKeeper) and this is used to execute operations by PerChannelBookieClient. Under high load, the bookies are not able to satisfy requests at the rate at which they are being generated. This exhausts all permits in the Semaphore and any further operations block on lh.opCounterSem.acquire(). In this scenario, if the connection to the bookies is shut down, channelDisconnected in PerChannelBookieClient tries to error out all outstanding entries. The errorOutReadKey and errorOutAddKey functions enqueue these operations in the same mainWorkerPool, all threads in which are blocked on acquire. 
So, handleBookieFailure is never executed and the server stops responding. Blocking operations in a fixed size thread pool doesn't sound quite right. Temporarily, I fixed this by having another ExecutorService for every PerChannelBookieClient and queuing the operations from the errorOut* functions in it, but this is just a quick fix. I feel that the server shouldn't rely on LedgerHandle to throttle connections, but do this itself. Any other ideas on how to fix this? I'd be happy to contribute a patch.
[jira] [Commented] (BOOKKEEPER-215) Deadlock occurs under high load
[ https://issues.apache.org/jira/browse/BOOKKEEPER-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251376#comment-13251376 ] Flavio Junqueira commented on BOOKKEEPER-215: - Hi Aniruddha, The thread blocking due to the exhaustion of permits is the application thread, not a thread from the bk client pool. Consequently, I don't see how the deadlock situation you describe can happen. Is there anything I'm missing? Deadlock occurs under high load --- Key: BOOKKEEPER-215 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-215 Project: Bookkeeper Issue Type: Bug Components: hedwig-server Affects Versions: 4.1.0 Reporter: Aniruddha Priority: Critical LedgerHandle uses a Semaphore(opCounterSem) with a default value of 5000 permits to implement throttling for outstanding requests. This is causing a deadlock under high load. What I've observed is the following - There are a fixed number of threads created by OrderedSafeExecutor(mainWorkerPool in BookKeeper) and this is used to execute operations by PerChannelBookieClient. Under high load, the bookies are not able to satisfy requests at the rate at which they are being generated. This exhausts all permits in the Semaphore and any further operations block on lh.opCounterSem.acquire(). In this scenario, if the connection to the bookies is shut down, channelDisconnected in PerChannelBookieClient tries to error out all outstanding entries. The errorOutReadKey and errorOutAddKey functions enqueue these operations in the same mainWorkerPool, all threads in which are blocked on acquire. So, handleBookieFailure is never executed and the server stops responding. Blocking operations in a fixed size thread pool doesn't sound quite right. Temporarily, I fixed this by having another ExecutorService for every PerChannelBookieClient and queuing the operations from the errorOut* functions in it, but this is just a quick fix. 
I feel that the server shouldn't rely on LedgerHandle to throttle connections, but do this itself. Any other ideas on how to fix this? I'd be happy to contribute a patch.
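The failure mode described in this thread — tasks in a fixed-size pool blocking on permits that can only be released by work queued behind them in the same pool — can be reduced to a small, self-contained illustration. This is not BookKeeper code; a plain single-thread pool and a zero-permit semaphore stand in for mainWorkerPool and opCounterSem, and a bounded tryAcquire replaces the indefinite acquire so the demo terminates instead of hanging:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Minimal sketch of the reported starvation: task A waits for a permit,
// but task B, which would release the permit, is queued behind A in the
// same single-thread pool, so B can never run while A is blocked.
public class PoolStarvationDemo {
    public static boolean runsToCompletion() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Semaphore permits = new Semaphore(0); // all permits exhausted
        final boolean[] acquired = { false };

        // Task A: blocks on a permit (bounded here so the demo terminates).
        pool.submit(() -> {
            try {
                acquired[0] = permits.tryAcquire(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ignored) {
            }
        });
        // Task B: would release the permit, but the only worker is busy in A.
        pool.submit(permits::release);

        pool.shutdown();
        pool.awaitTermination(2, TimeUnit.SECONDS);
        return acquired[0]; // false: A timed out before B could run
    }
}
```

With an unbounded acquire(), as in the real report, task A would never return and the pool would be wedged permanently, which is why handleBookieFailure never executes.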
[jira] [Commented] (BOOKKEEPER-197) HedwigConsole uses the same file to load bookkeeper client config and hub server config
[ https://issues.apache.org/jira/browse/BOOKKEEPER-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251514#comment-13251514 ] Flavio Junqueira commented on BOOKKEEPER-197: - +1, looks good to me. HedwigConsole uses the same file to load bookkeeper client config and hub server config --- Key: BOOKKEEPER-197 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-197 Project: Bookkeeper Issue Type: Bug Components: hedwig-server Affects Versions: 4.1.0 Reporter: Aniruddha Assignee: Sijie Guo Priority: Minor Fix For: 4.1.0 Attachments: BK-197.diff, BK-197.diff_v2, BK-197.diff_v3 In the current implementation of HedwigConsole.java, the same server-cfg file (default = hedwig-server/conf/hw_server.conf) is used to load both hubServerConf and bkClientConf. This seems incorrect because both have different option names.
[jira] [Commented] (BOOKKEEPER-212) Bookie stops responding when creating and deleting many ledgers
[ https://issues.apache.org/jira/browse/BOOKKEEPER-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249759#comment-13249759 ] Flavio Junqueira commented on BOOKKEEPER-212: - +1, it looks good to me. I have also run a version with this patch against my simple test case and I don't see the issue any more. Good job, Sijie! Committed revision 1311177. Bookie stops responding when creating and deleting many ledgers --- Key: BOOKKEEPER-212 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-212 Project: Bookkeeper Issue Type: Bug Reporter: Flavio Junqueira Priority: Blocker Fix For: 4.1.0 Attachments: BK-212.diff, bookkeeper-server.log I have written down a short app to try to reproduce one problematic case reported on the user list. The app does the following: # It creates initially a number of ledgers, say 2000; # Once it reaches 2000, for each new ledger it creates, it deletes the one at the head of the list; # Before closing the ledger, it adds 5 entries of 1k, just to generate some traffic for any given ledger. What I tried to achieve is to have thousands of active ledgers and delete new ledgers as I create new ones. I'll post a link to my test code later. At some point, one bookie stops responding. The bookie seems to be up, but it is not responsive. Looking at the logs, this is what I see: {noformat} 2012-04-06 12:22:05,765 - INFO [SyncThread:LedgerCacheImpl@682] - Ledger 1726 is evicted from file info cache. 2012-04-06 12:22:05,769 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1727 is evicted from file info cache. 2012-04-06 12:22:05,772 - INFO [SyncThread:LedgerCacheImpl@682] - Ledger 1728 is evicted from file info cache. 2012-04-06 12:22:05,780 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1729 is evicted from file info cache. 2012-04-06 12:22:05,787 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1730 is evicted from file info cache. 
2012-04-06 12:22:05,794 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1731 is evicted from file info cache. 2012-04-06 12:22:05,801 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1732 is evicted from file info cache. 2012-04-06 12:22:05,807 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1733 is evicted from file info cache. 2012-04-06 12:22:05,822 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1734 is evicted from file info cache. 2012-04-06 12:22:05,828 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1735 is evicted from file info cache. 2012-04-06 12:22:05,842 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1736 is evicted from file info cache. 2012-04-06 12:22:05,851 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1737 is evicted from file info cache. 2012-04-06 12:22:05,856 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1738 is evicted from file info cache. 2012-04-06 12:22:05,864 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1739 is evicted from file info cache. 2012-04-06 12:22:05,874 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1740 is evicted from file info cache. 2012-04-06 12:22:05,885 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1741 is evicted from file info cache. 2012-04-06 12:22:05,894 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1742 is evicted from file info cache. 2012-04-06 12:22:05,902 - INFO [NIOServerFactory-3181:LedgerCacheImpl@682] - Ledger 1743 is evicted from file info cache. 2012-04-06 12:22:05,987 - INFO [GarbageCollectorThread:LedgerCacheImpl@682] - Ledger 1744 is evicted from file info cache. 
2012-04-06 12:22:05,987 - ERROR [GarbageCollectorThread:GarbageCollectorThread$1@244] - Exception when deleting the ledger index file on the Bookie: java.io.IOException: /home/fpj/bk/current/1/b/10b.idx not found at org.apache.bookkeeper.bookie.FileInfo.checkOpen(FileInfo.java:118) at org.apache.bookkeeper.bookie.FileInfo.close(FileInfo.java:194) at org.apache.bookkeeper.bookie.LedgerCacheImpl.deleteLedger(LedgerCacheImpl.java:619) at org.apache.bookkeeper.bookie.GarbageCollectorThread$1.gc(GarbageCollectorThread.java:242) at org.apache.bookkeeper.meta.AbstractZkLedgerManager.doGc(AbstractZkLedgerManager.java:274) at org.apache.bookkeeper.meta.FlatLedgerManager.garbageCollectLedgers(FlatLedgerManager.java:168) at org.apache.bookkeeper.bookie.GarbageCollectorThread.doGcLedgers(GarbageCollectorThread.java:237) at
[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13246175#comment-13246175 ] Flavio Junqueira commented on ZOOKEEPER-1355: - There are currently two separate reviews on the review board for this jira, one by Alex and one by Marshall. It would be great if we could keep one review request and update it with new revisions. And, just so that the comments from review board are recorded in the jira, please make sure to update the bugs field with the jira number. I have also already reviewed the java part of this patch, but I don't see any of my previous comments reflected. Could you guys please fix all these? Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1355-ver10-1.patch, ZOOKEEPER-1355-ver10-2.patch, ZOOKEEPER-1355-ver10-3.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10.patch, ZOOKEEPER-1355-ver11-1.patch, ZOOKEEPER-1355-ver11.patch, ZOOKEEPER-1355-ver2.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch, ZOOKEEPER-1355-ver7.patch, ZOOKEEPER-1355-ver8.patch, ZOOKEEPER-1355-ver9-1.patch, ZOOKEEPER-1355-ver9.patch, ZOOKEEPER=1355-ver3.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch, ZOOOKEEPER-1355.patch, loadbalancing-more-details.pdf, loadbalancing.pdf When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. 
Moreover, assuming that the number of clients per server is the same (in expectation) in the old configuration (as guaranteed by the current list shuffling for example), we would like to re-balance client connections across the new set of servers in a way that a) the number of clients per server is the same for all servers (in expectation) and b) there is no excessive/unnecessary client migration. It is simple to achieve (a) without (b) - just re-shuffle the new list of servers at every client. But this would create unnecessary migration, which we'd like to avoid. We propose a simple probabilistic migration scheme that achieves (a) and (b) - each client locally decides whether and where to migrate when the list of servers changes. The attached document describes the scheme and shows an evaluation of it in Zookeeper. We also implemented re-balancing through a consistent-hashing scheme and show a comparison. We derived the probabilistic migration rules from a simple formula that we can also provide, if someone's interested in the proof.
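The probabilistic migration idea can be sketched for the simplest case. This is an illustration only — the actual rules are derived in the attached documents (loadbalancing.pdf), and the names below are made up. For the special case where servers are only added, a client whose server survives keeps its connection with probability |old| / |new| and otherwise migrates to one of the newly added servers; in expectation this equalizes load per server while avoiding unnecessary migration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch of probabilistic rebalancing (hypothetical names,
// not the ZooKeEPER-1355 implementation). Handles two cases: the client's
// server was removed (forced migration), or the list grew (stay with
// probability |old|/|new|, else move to a newly added server).
public class RebalanceSketch {
    public static String pickServer(String current, List<String> oldServers,
                                    List<String> newServers, Random rnd) {
        if (!newServers.contains(current)) {
            // The client's server was removed: it must migrate somewhere.
            return newServers.get(rnd.nextInt(newServers.size()));
        }
        double pStay = (double) oldServers.size() / newServers.size();
        if (rnd.nextDouble() < pStay) {
            return current; // stay put: no unnecessary migration
        }
        // Otherwise migrate, but only to servers that were newly added.
        List<String> added = new ArrayList<>(newServers);
        added.removeAll(oldServers);
        return added.isEmpty() ? current : added.get(rnd.nextInt(added.size()));
    }
}
```

Because each client decides locally with no coordination, no client migrates when the list is unchanged, and migration volume is proportional to the change in the list, which is property (b) above.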
[jira] [Commented] (BOOKKEEPER-203) improve ledger manager interface to remove zookeeper dependency on metadata operations.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245764#comment-13245764 ] Flavio Junqueira commented on BOOKKEEPER-203: - Even if we +1 this jira, I was wondering if we should move it to 4.2.0. This jira is a sub-task of BOOKKEEPER-181, and we have marked it for 4.2.0. improve ledger manager interface to remove zookeeper dependency on metadata operations. --- Key: BOOKKEEPER-203 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-203 Project: Bookkeeper Issue Type: Sub-task Components: bookkeeper-client, bookkeeper-server Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BOOKKEEPER-203.diff we need to improve the ledger manager interface to remove the zookeeper dependency on metadata operations, so it is easy for us to implement a MetaStore based ledger manager.
[jira] [Commented] (BOOKKEEPER-197) HedwigConsole uses the same file to load bookkeeper client config and hub server config
[ https://issues.apache.org/jira/browse/BOOKKEEPER-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245136#comment-13245136 ] Flavio Junqueira commented on BOOKKEEPER-197: - This patch looks mostly ok to me, except that the log statement following the new ZooKeeper(...) call seems to be using the wrong values. It is also not clear to me whether this assignment still makes sense: this.bkClientConf = bkConf;. I haven't checked trunk, so the comment about the assignment could be wrong. HedwigConsole uses the same file to load bookkeeper client config and hub server config --- Key: BOOKKEEPER-197 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-197 Project: Bookkeeper Issue Type: Bug Components: hedwig-server Affects Versions: 4.1.0 Reporter: Aniruddha Assignee: Sijie Guo Priority: Minor Fix For: 4.1.0 Attachments: BK-197.diff, BK-197.diff_v2 In the current implementation of HedwigConsole.java, the same server-cfg file (default = hedwig-server/conf/hw_server.conf) is used to load both hubServerConf and bkClientConf. This seems incorrect because both have different option names.
[jira] [Commented] (BOOKKEEPER-199) Provide bookie readonly mode, when journal/ledgers flushing has failed with IOE
[ https://issues.apache.org/jira/browse/BOOKKEEPER-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245331#comment-13245331 ] Flavio Junqueira commented on BOOKKEEPER-199: - Hi Rakesh, The idea we had for BOOKKEEPER-201 was very simple. If we observe that the latency of writing to a given disk is high, then an operator can manually tell the bookie to stop creating ledgers on that disk. In the case that the bookie has multiple ledger devices available, it can slowly shift the traffic to the other devices. The case of a broken disk is a bit more complicated. In this case, the best course of action I can see right now is to have an operator remove the bookie from the pool and execute a bookie recovery procedure for the ledger fragments on the dead disk. The cases you're mentioning are a bit different, since you can observe problems by catching exceptions, so you can automate the decision of removing the bookie from the pool. It sounds ok to me to do it. Provide bookie readonly mode, when journal/ledgers flushing has failed with IOE --- Key: BOOKKEEPER-199 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-199 Project: Bookkeeper Issue Type: Sub-task Components: bookkeeper-server Affects Versions: 4.0.0 Reporter: Rakesh R Assignee: Rakesh R Priority: Critical BookKeeper should change to read-only (r-o) mode when journal/ledger flushing has failed with an IOException. From then on, the server should reject write requests and accept only read requests from clients, because even if flushing fails, the data that has already been flushed to the bookie is still valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
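The read-only behavior proposed in this jira can be sketched as follows. This is a minimal illustration, not the real Bookie code: the class, the `failFlush` hook, and the method names are all hypothetical, and a real bookie would also need to announce its read-only state to clients.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch: on a flush IOException the bookie flips to
// read-only and rejects further writes, while reads of already-flushed
// data remain valid and allowed.
class ReadOnlyBookie {
    private final AtomicBoolean readOnly = new AtomicBoolean(false);
    boolean failFlush = false; // hypothetical hook simulating a failing journal/ledger device

    void addEntry(byte[] entry) throws IOException {
        if (readOnly.get()) {
            throw new IOException("bookie is read-only; rejecting write");
        }
        try {
            flush(entry);
        } catch (IOException ioe) {
            readOnly.set(true); // journal/ledger flush failed: switch to read-only mode
            throw ioe;
        }
    }

    void flush(byte[] entry) throws IOException {
        if (failFlush) {
            throw new IOException("journal/ledger flush failed");
        }
    }

    boolean isReadOnly() { return readOnly.get(); }
}
```

Note the mode switch is one-way here: once a flush has failed, even a later healthy disk does not re-enable writes without operator intervention.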
[jira] [Commented] (BOOKKEEPER-207) BenchBookie doesn't run correctly
[ https://issues.apache.org/jira/browse/BOOKKEEPER-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244186#comment-13244186 ] Flavio Junqueira commented on BOOKKEEPER-207: - It looks mostly good, I just have one small question. In trunk, we refer to ledger+1 and ledger+2, which seems to imply that we are writing to different ledgers. Your patch makes it uniform. Is it the right behavior? BenchBookie doesn't run correctly - Key: BOOKKEEPER-207 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-207 Project: Bookkeeper Issue Type: Bug Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 4.1.0 Attachments: BOOKKEEPER-207.diff Bench bookie tests latency of addEntry to a single bookie. Currently it simply writes to a specified ledger id. If this ledger id doesn't exist in zookeeper, the ledger is GC'd from the bookie and errors occur in the bench. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1113) QuorumMaj counts the number of ACKs but does not check who sent the ACK
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243194#comment-13243194 ] Flavio Junqueira commented on ZOOKEEPER-1113: - The patch attached looks mostly good to me. It needs a couple of small fixes, though. It does not apply cleanly to trunk and the spacing is not right in QuorumMajorityTest: {noformat} + //setup servers 1-3 to be followers and 4 and 5 to be observers + setUp(true); +ackSet.clear(); + +// 1 follower out of 3 is not a majority {noformat} QuorumMaj counts the number of ACKs but does not check who sent the ACK --- Key: ZOOKEEPER-1113 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1113 Project: ZooKeeper Issue Type: Sub-task Components: quorum Reporter: Alexander Shraer Priority: Minor Fix For: 3.5.0 Attachments: ZOOKEEPER-1113.patch If a server connects to the leader as a follower, it will be allowed to vote (with QuorumMaj) even if it is not a follower in the current configuration, as the leader does not care who sends the ACK - it only counts the number of ACKs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
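The fix this jira calls for, checking who sent each ACK rather than merely counting ACKs, can be sketched roughly as below. Class and method names are illustrative, not the actual QuorumMaj API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a quorum check that intersects the set of ackers
// with the current voting membership, so ACKs from servers that are not
// voters in the current configuration (e.g. observers) are ignored.
class VotingQuorumMaj {
    private final Set<Long> votingMembers; // server ids of the current voters

    VotingQuorumMaj(Set<Long> votingMembers) {
        this.votingMembers = votingMembers;
    }

    boolean containsQuorum(Set<Long> ackSet) {
        Set<Long> votes = new HashSet<>(ackSet);
        votes.retainAll(votingMembers); // drop ACKs from non-voters
        return votes.size() > votingMembers.size() / 2;
    }
}
```

With voters {1, 2, 3}, ACKs from {1, 4, 5} are not a quorum (only one voter acked), while ACKs from {1, 2, 4} are.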
[jira] [Commented] (ZOOKEEPER-1355) Add zk.updateServerList(newServerList)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243211#comment-13243211 ] Flavio Junqueira commented on ZOOKEEPER-1355: - I'm reviewing this one. Add zk.updateServerList(newServerList) --- Key: ZOOKEEPER-1355 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355 Project: ZooKeeper Issue Type: New Feature Components: java client Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1355-ver10-1.patch, ZOOKEEPER-1355-ver10-2.patch, ZOOKEEPER-1355-ver10-3.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10-4.patch, ZOOKEEPER-1355-ver10.patch, ZOOKEEPER-1355-ver11-1.patch, ZOOKEEPER-1355-ver11.patch, ZOOKEEPER-1355-ver2.patch, ZOOKEEPER-1355-ver4.patch, ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch, ZOOKEEPER-1355-ver7.patch, ZOOKEEPER-1355-ver8.patch, ZOOKEEPER-1355-ver9-1.patch, ZOOKEEPER-1355-ver9.patch, ZOOKEEPER=1355-ver3.patch, ZOOOKEEPER-1355-test.patch, ZOOOKEEPER-1355-ver1.patch, ZOOOKEEPER-1355.patch, loadbalancing-more-details.pdf, loadbalancing.pdf When the set of servers changes, we would like to update the server list stored by clients without restarting the clients. Moreover, assuming that the number of clients per server is the same (in expectation) in the old configuration (as guaranteed by the current list shuffling for example), we would like to re-balance client connections across the new set of servers in a way that a) the number of clients per server is the same for all servers (in expectation) and b) there is no excessive/unnecessary client migration. It is simple to achieve (a) without (b) - just re-shuffle the new list of servers at every client. But this would create unnecessary migration, which we'd like to avoid. We propose a simple probabilistic migration scheme that achieves (a) and (b) - each client locally decides whether and where to migrate when the list of servers changes. 
The attached document describes the scheme and shows an evaluation of it in Zookeeper. We also implemented re-balancing through a consistent-hashing scheme and show a comparison. We derived the probabilistic migration rules from a simple formula that we can also provide, if someone's interested in the proof. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
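As a rough illustration of this kind of probabilistic migration rule (the exact rules are in the attached document; the grow-case probability used here, 1 - |old|/|new|, is an assumption made for the sketch, and all names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch of local, per-client probabilistic rebalancing:
// a client whose server was removed must migrate; otherwise, when servers
// were added, it migrates to one of the added servers with a probability
// chosen so that expected load stays equal across servers, and stays put
// the rest of the time (avoiding unnecessary migration).
class Rebalancer {
    static String pickServer(String current, List<String> oldList,
                             List<String> newList, Random rnd) {
        List<String> added = new ArrayList<>(newList);
        added.removeAll(oldList);
        if (!newList.contains(current)) {
            // our server was removed: must pick some server in the new list
            return newList.get(rnd.nextInt(newList.size()));
        }
        if (!added.isEmpty()
                && rnd.nextDouble() < 1.0 - (double) oldList.size() / newList.size()) {
            // migrate only to a newly added server, with the balancing probability
            return added.get(rnd.nextInt(added.size()));
        }
        return current; // otherwise stay connected: no unnecessary migration
    }
}
```

When the list is unchanged, no client ever moves; that is the property (b) above that naive re-shuffling violates.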
[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
[ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243122#comment-13243122 ] Flavio Junqueira commented on BOOKKEEPER-112: - Hi Sijie, I'm not done reviewing it and I'd like to before having it in. Is it ok? Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail -- Key: BOOKKEEPER-112 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112 Project: Bookkeeper Issue Type: Bug Reporter: Flavio Junqueira Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-198) replaying entries of deleted ledgers would exhaust ledger cache.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242192#comment-13242192 ] Flavio Junqueira commented on BOOKKEEPER-198: - Thanks, Sijie. The patch looks good, but I'm now confused by a different thing. I couldn't find code to decrement pageCount in trunk. Shouldn't we decrement pageCount as we flush pages? Consequently, there should be a line somewhere decrementing it, no? I'm mentioning this because it might be worth having a test, not necessarily for this jira, that confirms that our logic increments and decrements correctly. In particular, we should test that it grows and eventually becomes zero again, never going negative. replaying entries of deleted ledgers would exhaust ledger cache. Key: BOOKKEEPER-198 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-198 Project: Bookkeeper Issue Type: Bug Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BK-198.patch, BK-198.patch_v2 we found that replaying entries of deleted ledgers would exhaust the ledger cache: the ledger cache would have no clean page to grab and would throw the following exception. {code} java.util.NoSuchElementException at java.util.LinkedList.getFirst(LinkedList.java:109) at org.apache.bookkeeper.bookie.LedgerCacheImpl.grabCleanPage(LedgerCacheImpl.java:454) at org.apache.bookkeeper.bookie.LedgerCacheImpl.putEntryOffset(LedgerCacheImpl.java:165) {code} this issue occurs because the bookie grabs a clean page but fails to update it due to a NoLedgerException, and then doesn't return the clean page to the ledger cache. So the ledger cache is exhausted, and when a new ledger wants to grab a clean page, no available page can be found. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
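The leak described here follows a grab/fail/no-return pattern. A minimal sketch of the fix, returning the clean page on failure, using a toy pool rather than the real LedgerCacheImpl (all names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy clean-page pool: the bug is leaking a grabbed page when the update
// fails for a deleted ledger; the fix is to hand the page back to the pool
// on the failure path, so the pool size never shrinks.
class PagePool {
    private final Deque<int[]> cleanPages = new ArrayDeque<>();

    PagePool(int pages) {
        for (int i = 0; i < pages; i++) {
            cleanPages.add(new int[8]);
        }
    }

    int available() { return cleanPages.size(); }

    void putEntryOffset(boolean ledgerExists) {
        int[] page = cleanPages.removeFirst(); // grabCleanPage()
        try {
            if (!ledgerExists) {
                // stands in for updatePage() throwing NoLedgerException
                throw new IllegalStateException("NoLedgerException");
            }
            cleanPages.addLast(page); // simplified: page returned after use/flush
        } catch (IllegalStateException e) {
            cleanPages.addLast(page); // the fix: return the clean page on failure
        }
    }
}
```

Without the catch-and-return, repeated writes to deleted ledgers drain the pool to zero, which is exactly the NoSuchElementException in the report.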
[jira] [Commented] (BOOKKEEPER-193) Ledger is garbage collected by mistake.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242201#comment-13242201 ] Flavio Junqueira commented on BOOKKEEPER-193: - Ok, I'm almost convinced. :-) If it is only ever used by the gcThread, then the reason why we are creating a ConcurrentHashMap is just for additional safety? If we used a map that is not thread safe we would still be fine? Ledger is garbage collected by mistake. --- Key: BOOKKEEPER-193 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-193 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Reporter: Sijie Guo Assignee: Sijie Guo Priority: Blocker Fix For: 4.1.0 Attachments: BK-193.patch, BK-193.patch_v2, BOOKKEEPER-193.diff currently, we encountered such case: ledger is garbage collected by mistake, and following requests would fail due to NoLedgerException. {code} 2012-03-23 19:10:47,403 - INFO [GarbageCollectorThread:GarbageCollectorThread@234] - Garbage collecting deleted ledger index files. 
2012-03-23 19:10:48,702 - INFO [GarbageCollectorThread:LedgerCache@544] - Deleting ledgerId: 89408 2012-03-23 19:10:48,703 - INFO [GarbageCollectorThread:LedgerCache@577] - Deleted ledger : 89408 2012-03-23 19:11:10,013 - ERROR [NIOServerFactory-3181:BookieServer@361] - Error writing 1@89408 org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 89408 not found at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228) at org.apache.bookkeeper.bookie.LedgerCache.updatePage(LedgerCache.java:260) at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:158) at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:135) at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059) at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099) at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213) at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-198) replaying entries of deleted ledgers would exhaust ledger cache.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242260#comment-13242260 ] Flavio Junqueira commented on BOOKKEEPER-198: - +1, thanks for the clarifications, Sijie. replaying entries of deleted ledgers would exhaust ledger cache. Key: BOOKKEEPER-198 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-198 Project: Bookkeeper Issue Type: Bug Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BK-198.patch, BK-198.patch_v2 we found that replaying entries of deleted ledgers would exhaust the ledger cache: the ledger cache would have no clean page to grab and would throw the following exception. {code} java.util.NoSuchElementException at java.util.LinkedList.getFirst(LinkedList.java:109) at org.apache.bookkeeper.bookie.LedgerCacheImpl.grabCleanPage(LedgerCacheImpl.java:454) at org.apache.bookkeeper.bookie.LedgerCacheImpl.putEntryOffset(LedgerCacheImpl.java:165) {code} this issue occurs because the bookie grabs a clean page but fails to update it due to a NoLedgerException, and then doesn't return the clean page to the ledger cache. So the ledger cache is exhausted, and when a new ledger wants to grab a clean page, no available page can be found. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-193) Ledger is garbage collected by mistake.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13241053#comment-13241053 ] Flavio Junqueira commented on BOOKKEEPER-193: - +1, looks good to me! Since this is a blocker, I'd like to give an opportunity to others to have a look at it. If no one else says anything by the end of today, I'll commit it. Ledger is garbage collected by mistake. --- Key: BOOKKEEPER-193 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-193 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Reporter: Sijie Guo Assignee: Sijie Guo Priority: Blocker Fix For: 4.1.0 Attachments: BK-193.patch, BK-193.patch_v2 currently, we encountered such case: ledger is garbage collected by mistake, and following requests would fail due to NoLedgerException. {code} 2012-03-23 19:10:47,403 - INFO [GarbageCollectorThread:GarbageCollectorThread@234] - Garbage collecting deleted ledger index files. 2012-03-23 19:10:48,702 - INFO [GarbageCollectorThread:LedgerCache@544] - Deleting ledgerId: 89408 2012-03-23 19:10:48,703 - INFO [GarbageCollectorThread:LedgerCache@577] - Deleted ledger : 89408 2012-03-23 19:11:10,013 - ERROR [NIOServerFactory-3181:BookieServer@361] - Error writing 1@89408 org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 89408 not found at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228) at org.apache.bookkeeper.bookie.LedgerCache.updatePage(LedgerCache.java:260) at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:158) at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:135) at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059) at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099) at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315) at 
org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213) at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-181) Scale hedwig
[ https://issues.apache.org/jira/browse/BOOKKEEPER-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240287#comment-13240287 ] Flavio Junqueira commented on BOOKKEEPER-181: - bq. hmm, hbase did provide versions for rows. but I not sure that HBase provide such functionality to check current version and swap. I think it just provide set a value associate with a version, not a CAS by version. You're right, checkAndPut does not allow us to use the internal version to perform the check. I actually found a jira discussing it: HBASE-4527. The recommended way described there is to have our own version column. Scale hedwig Key: BOOKKEEPER-181 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-181 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-server, hedwig-server Reporter: Sijie Guo Assignee: Sijie Guo Attachments: hedwigscale.pdf Current implementation of Hedwig and BookKeeper is designed to scale to hundreds of thousands of topics, but now we are looking at scaling them to tens to hundreds of millions of topics, using a scalable key/value store such as HBase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-181) Scale hedwig
[ https://issues.apache.org/jira/browse/BOOKKEEPER-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239508#comment-13239508 ] Flavio Junqueira commented on BOOKKEEPER-181: - Hi Sijie, I agree with creating sub-tasks for this umbrella jira. For bookkeeper, we need to access ledger metadata both from clients and bookies, right? We may want to have a separate task for each, even though they will most likely rely upon the same interface to the metadata repository. I'd like to clarify one point. In my understanding, we still plan to use zookeeper for node availability, while we move all metadata to another scalable data store, such as hbase. Is this correct? If so, the pluggable interface will allow the use of different repositories for the metadata part, but we will still rely upon zookeeper to monitor node availability. I have a few other comments about the design doc: # In the definition of the compare-and-swap operation, the comparison is performed using the key and value itself. This might be expensive, so I was wondering if it is a better approach to use versions instead. The drawback is relying upon a backend that provides versioned data. It seems fine to me, though. # Related to the previous comment, it might be a better idea to state somewhere what properties we require from the backend store. # I'm not entirely sure I understand the implementation of leader election in 5.1. What happens if a hub is incorrectly suspected of crashing and it loses ownership over a topic? Does it find out via session expiration? Also, I suppose that if the hub has crashed but the list of hubs hasn't changed, then multiple iterations of 1 may have to happen. 
Scale hedwig Key: BOOKKEEPER-181 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-181 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-server, hedwig-server Reporter: Sijie Guo Assignee: Sijie Guo Attachments: hedwigscale.pdf Current implementation of Hedwig and BookKeeper is designed to scale to hundreds of thousands of topics, but now we are looking at scaling them to tens to hundreds of millions of topics, using a scalable key/value store such as HBase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-181) Scale hedwig
[ https://issues.apache.org/jira/browse/BOOKKEEPER-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239842#comment-13239842 ] Flavio Junqueira commented on BOOKKEEPER-181: - bq. in the proposal, the comparison operation is just applied in a cell (located by (key,family,qualifier), while the set operation can be applied on multiple cells. for example, suppose we have two columns, one column is data column, which is used to store actual data; while the other one is version column, which is used to store a incremented number. the initial value is (oldData, 0). when we want to update data column, we executed by CAS (key, 0, key, (newData, 1)). the comparison is applied only on version column, is not on data column, which is not expensive. I was thinking that in the case of hbase, we have versions for the rows already, so we could use those to perform the comparison: if the version passed as input is not current, then we don't swap. I agree that it restricts the kv-stores we could use, though. The approach you describe is more general. bq. As my knowledge, zk#setData provides a conditional set over version, the set operation succeeds only when the given matches the version of the znode, which is a kind of CAS. CAS would be better to support more K/V stores. Exactly, the comparison is performed using versions (int). bq. I think I have put them in section 3, the operations required by a MetaStore. That's right, the beginning of Section 3 talks about it. I read it as the set of operations that we execute against the metadata store as opposed to the semantics we expect from the store. bq. doesn't this case exit using zookeeper? it seems that there is still a gap between hub crashed and znode deletion (session expired). in metastore-based topic manager, this gap becomes hub crashed and other hub server got notified about hub crashed. Yes, it exists with zookeeper, but the current design document does not reflect it. 
You may consider updating it. bq. if a hub server is not crashed, other hub server would not receive the notification from zookeeper about that hub crashed (can zookeeper guarantee it?). so ownership would not change, since other hub server still see a same zxid about that hub server. It was not entirely clear to me that you're assuming zookeeper in the document for this part. If that's the assumption, then it seems fine to me. Scale hedwig Key: BOOKKEEPER-181 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-181 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-server, hedwig-server Reporter: Sijie Guo Assignee: Sijie Guo Attachments: hedwigscale.pdf Current implementation of Hedwig and BookKeeper is designed to scale to hundreds of thousands of topics, but now we are looking at scaling them to tens to hundreds of millions of topics, using a scalable key/value store such as HBase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
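The CAS-by-version semantics discussed in this thread can be illustrated against an in-memory map. With HBase one would keep an explicit version column and use checkAndPut on it (per HBASE-4527), since the internal cell version cannot be used for the check; all names below are illustrative, not part of any real MetaStore interface.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of compare-and-swap by version: the swap succeeds only if the
// caller's expected version matches the stored one, and each successful
// swap bumps the version, mirroring zookeeper's conditional setData.
class VersionedStore {
    static final class Versioned {
        final String data;
        final long version;
        Versioned(String data, long version) { this.data = data; this.version = version; }
    }

    private final ConcurrentHashMap<String, Versioned> map = new ConcurrentHashMap<>();

    void put(String key, String data) { map.put(key, new Versioned(data, 0)); }

    /** Swap only if the caller's version matches; comparison is on the version, not the value. */
    boolean compareAndSwap(String key, long expectedVersion, String newData) {
        Versioned cur = map.get(key);
        if (cur == null || cur.version != expectedVersion) {
            return false; // stale version: caller must re-read and retry
        }
        return map.replace(key, cur, new Versioned(newData, cur.version + 1));
    }

    Versioned get(String key) { return map.get(key); }
}
```

Comparing a small integer version is cheap regardless of metadata size, which is the advantage over comparing the value itself that the comment above raises.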
[jira] [Commented] (BOOKKEEPER-190) Add entries would fail when number of open ledgers reaches more than openFileLimit.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236940#comment-13236940 ] Flavio Junqueira commented on BOOKKEEPER-190: - Hi Sijie, We introduced LedgerCacheTest (and LedgerCacheTest#testAddEntryException) in BOOKKEEPER-22. Add entries would fail when number of open ledgers reaches more than openFileLimit. --- Key: BOOKKEEPER-190 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-190 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BOOKKEEPER-190.diff, BOOKKEEPER-190.diff_v2 when the number of open ledgers reaches more than openFileLimit, a file info will be closed and removed from the open-ledger list. And after BOOKKEEPER-137, ledger index file creation is delayed until necessary. Suppose ledger l is removed from the open-ledger list and its index file hasn't been created yet. When new add-entry operations for other ledgers come into the bookie server, a new page needs to be grabbed for them, so the bookie server may need to flush the dirty pages of ledger l (when the page cache is full), and that flush would fail with a NoLedgerException (no index file found). Ledger l isn't actually lost (it could be recovered by restarting the bookie server), but the bookie server would not work well on adding entries. A proposed solution is to force index creation when the ledger is evicted from the open-ledger list. {code} 2012-03-21 14:00:42,989 - DEBUG - [NIOServerFactory-5000:LedgerCache@235] - New ledger index file created for ledgerId: 4 2012-03-21 14:00:42,990 - INFO - [NIOServerFactory-5000:LedgerCache@241] - Ledger 2 is evicted from file info cache. 
2012-03-21 14:00:42,990 - DEBUG - [New I/O client worker #1-1:PerChannelBookieClient$2@255] - Successfully wrote request for adding entry: 0 ledger-id: 4 bookie: /10.82.129.173:5000 entry length: 70 2012-03-21 14:00:42,990 - ERROR - [NIOServerFactory-5000:BookieServer@361] - Error writing 0@4 org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 2 not found at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228) at org.apache.bookkeeper.bookie.LedgerCache.flushLedger(LedgerCache.java:359) at org.apache.bookkeeper.bookie.LedgerCache.flushLedger(LedgerCache.java:292) at org.apache.bookkeeper.bookie.LedgerCache.grabCleanPage(LedgerCache.java:447) at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:157) at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:130) at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059) at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099) at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213) at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124) 2012-03-21 14:00:42,991 - DEBUG - [pool-3-thread-1:PerChannelBookieClient@576] - Got response for add request from bookie: /10.82.129.173:5000 for ledger: 4 entry: 0 rc: 101 2012-03-21 14:00:42,991 - ERROR - [pool-3-thread-1:PerChannelBookieClient@594] - Add for ledger: 4, entry: 0 failed on bookie: /10.82.129.173:5000 with code: 101 2012-03-21 14:00:42,991 - WARN - [pool-3-thread-1:PendingAddOp@142] - Write did not succeed: 4, 0 {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
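The proposed fix, forcing index creation when a ledger is evicted from the open-ledger list, can be sketched with an LRU map whose eviction hook persists the index first. Everything here is illustrative, not the real LedgerCache code; `createIndexFileIfAbsent` stands in for actually writing the index file to disk.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch: an access-ordered LinkedHashMap as the open-ledger (file info)
// cache; before the eldest entry is evicted to respect openFileLimit, its
// index file is forced into existence, so later dirty-page flushes for
// that ledger cannot hit NoLedgerException.
class OpenLedgerCache {
    private final Set<Long> indexFiles = new HashSet<>();
    private final LinkedHashMap<Long, String> openLedgers;

    OpenLedgerCache(final int openFileLimit) {
        openLedgers = new LinkedHashMap<Long, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                if (size() > openFileLimit) {
                    createIndexFileIfAbsent(eldest.getKey()); // force index creation on eviction
                    return true;
                }
                return false;
            }
        };
    }

    // Index creation stays lazy on open (as after BOOKKEEPER-137).
    void open(long ledgerId) { openLedgers.put(ledgerId, "file-info"); }

    void createIndexFileIfAbsent(long ledgerId) { indexFiles.add(ledgerId); }

    boolean hasIndexFile(long ledgerId) { return indexFiles.contains(ledgerId); }
}
```

Opening one ledger past the limit evicts the least recently used ledger, and the eviction hook guarantees that ledger's index exists before the file info is dropped.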
[jira] [Commented] (ZOOKEEPER-1411) Consolidate membership management and add client port information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235461#comment-13235461 ] Flavio Junqueira commented on ZOOKEEPER-1411: - @Alex You're right, I hadn't looked into setType. @Rakesh If you check the first comment of this jira, in the sample config, it sounds like Alex is proposing this new pattern: hostname:port:port:type. We currently don't use such a host:port:type pattern, at least according to the documentation and to my knowledge. I'm also not sure I understand the comment about the NFE propagating back in the call chain. Why is that an issue exactly? Consolidate membership management and add client port information - Key: ZOOKEEPER-1411 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1411 Project: ZooKeeper Issue Type: Sub-task Components: server Reporter: Alexander Shraer Assignee: Alexander Shraer Fix For: 3.5.0 Attachments: ZOOKEEPER-1411-ver1.patch, ZOOKEEPER-1411-ver2.patch, ZOOKEEPER-1411-ver3.patch, ZOOKEEPER-1411-ver4.patch, ZOOKEEPER-1411-ver5.patch Currently every server has a different configuration file. With this patch, we will have all cluster membership definitions in a single file, and every server can have a copy of this file. This also solves ZOOKEEPER-1113. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-190) Add entries would fail when number of open ledgers reaches more than openFileLimit.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235463#comment-13235463 ] Flavio Junqueira commented on BOOKKEEPER-190: - +1, looks great, Sijie. Thanks for changing the test. Add entries would fail when number of open ledgers reaches more than openFileLimit. --- Key: BOOKKEEPER-190 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-190 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-server Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BOOKKEEPER-190.diff, BOOKKEEPER-190.diff_v2 when the number of open ledgers reaches more than openFileLimit, a file info will be closed and removed from the open-ledger list. And after BOOKKEEPER-137, ledger index file creation is delayed until necessary. Suppose ledger l is removed from the open-ledger list and its index file hasn't been created yet. When new add-entry operations for other ledgers come into the bookie server, a new page needs to be grabbed for them, so the bookie server may need to flush the dirty pages of ledger l (when the page cache is full), and that flush would fail with a NoLedgerException (no index file found). Ledger l isn't actually lost (it could be recovered by restarting the bookie server), but the bookie server would not work well on adding entries. A proposed solution is to force index creation when the ledger is evicted from the open-ledger list. {code} 2012-03-21 14:00:42,989 - DEBUG - [NIOServerFactory-5000:LedgerCache@235] - New ledger index file created for ledgerId: 4 2012-03-21 14:00:42,990 - INFO - [NIOServerFactory-5000:LedgerCache@241] - Ledger 2 is evicted from file info cache. 
2012-03-21 14:00:42,990 - DEBUG - [New I/O client worker #1-1:PerChannelBookieClient$2@255] - Successfully wrote request for adding entry: 0 ledger-id: 4 bookie: /10.82.129.173:5000 entry length: 70 2012-03-21 14:00:42,990 - ERROR - [NIOServerFactory-5000:BookieServer@361] - Error writing 0@4 org.apache.bookkeeper.bookie.Bookie$NoLedgerException: Ledger 2 not found at org.apache.bookkeeper.bookie.LedgerCache.getFileInfo(LedgerCache.java:228) at org.apache.bookkeeper.bookie.LedgerCache.flushLedger(LedgerCache.java:359) at org.apache.bookkeeper.bookie.LedgerCache.flushLedger(LedgerCache.java:292) at org.apache.bookkeeper.bookie.LedgerCache.grabCleanPage(LedgerCache.java:447) at org.apache.bookkeeper.bookie.LedgerCache.putEntryOffset(LedgerCache.java:157) at org.apache.bookkeeper.bookie.LedgerDescriptor.addEntry(LedgerDescriptor.java:130) at org.apache.bookkeeper.bookie.Bookie.addEntryInternal(Bookie.java:1059) at org.apache.bookkeeper.bookie.Bookie.addEntry(Bookie.java:1099) at org.apache.bookkeeper.proto.BookieServer.processPacket(BookieServer.java:357) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.readRequest(NIOServerFactory.java:315) at org.apache.bookkeeper.proto.NIOServerFactory$Cnxn.doIO(NIOServerFactory.java:213) at org.apache.bookkeeper.proto.NIOServerFactory.run(NIOServerFactory.java:124) 2012-03-21 14:00:42,991 - DEBUG - [pool-3-thread-1:PerChannelBookieClient@576] - Got response for add request from bookie: /10.82.129.173:5000 for ledger: 4 entry: 0 rc: 101 2012-03-21 14:00:42,991 - ERROR - [pool-3-thread-1:PerChannelBookieClient@594] - Add for ledger: 4, entry: 0 failed on bookie: /10.82.129.173:5000 with code: 101 2012-03-21 14:00:42,991 - WARN - [pool-3-thread-1:PendingAddOp@142] - Write did not succeed: 4, 0 {code} -- This message is automatically generated by JIRA. 
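The proposed fix above can be sketched in isolation. This is not the actual LedgerCache code; the class and method names below are hypothetical, and a HashSet stands in for the on-disk index files. The point it demonstrates: if the index file is materialized at eviction time, a later flush of the evicted ledger's dirty pages can always find it, instead of hitting NoLedgerException.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the proposed fix: when a ledger's file info is
// evicted from the open-ledgers cache, force creation of its index file
// first, so a later flush of its dirty pages cannot fail with
// NoLedgerException.
class FileInfoCache {
    private final Set<Long> indexFilesOnDisk = new HashSet<>();
    private final LinkedHashMap<Long, String> openLedgers;

    FileInfoCache(final int openFileLimit) {
        // Access-ordered map: the eldest entry is the least recently used.
        openLedgers = new LinkedHashMap<Long, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
                if (size() > openFileLimit) {
                    // The proposed fix: materialize the index file on eviction.
                    createIndexFile(eldest.getKey());
                    return true;
                }
                return false;
            }
        };
    }

    void open(long ledgerId) {
        openLedgers.put(ledgerId, "fileinfo-" + ledgerId);
    }

    void createIndexFile(long ledgerId) {
        indexFilesOnDisk.add(ledgerId); // stand-in for writing the index file
    }

    boolean canFlush(long ledgerId) {
        // A flush needs either an open file info or an index file on disk.
        return openLedgers.containsKey(ledgerId) || indexFilesOnDisk.contains(ledgerId);
    }
}
```

With a limit of 2, opening a third ledger evicts the first, but its dirty pages remain flushable because the index file was created on the way out.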
[jira] [Commented] (ZOOKEEPER-1411) Consolidate membership management and add client port information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235599#comment-13235599 ] Flavio Junqueira commented on ZOOKEEPER-1411: - Got it, Rakesh. Thanks for the clarification. You're right, we need to remember to remove it when we get rid of UDP leader election.
[jira] [Commented] (ZOOKEEPER-1411) Consolidate membership management and add client port information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235626#comment-13235626 ] Flavio Junqueira commented on ZOOKEEPER-1411: - Given that the old LE code is already deprecated (ZOOKEEPER-1153), I would say that it is fine to not take it into account as Rakesh suggests. I don't feel too strongly either way. In case we keep it, it would be a good idea to create a jira or sub-task for removing it. I can't find a jira for removing the UDP-based implementations. If there isn't one, then we need to create one for it too.
[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
[ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234591#comment-13234591 ] Flavio Junqueira commented on BOOKKEEPER-112: - There are two occasions during the lifetime of a ledger that we use consensus through zookeeper: when we change the ensemble and when we close the ledger. By design, the former is only proposed by the writer, whereas the latter can be proposed by either the writer or another client trying to recover it. Trying to change the design so that we can have multiple clients proposing changes to the ensemble of a ledger would be difficult and prone to errors, so I suggest we keep this part of the design the way it is. One way to perform the fencing for recovery and still keep the original design as is with respect to ensemble changes is to wait for the writer to mark in the ledger metadata such a change. Say that we externally detect that a bookie C has crashed. If the writer of a given ledger L removes C from its configuration and writes to ZooKeeper, then we can safely recover the ledger fragment of C for L. If the writer of L never makes such a change, then we assume that the writer can still talk to C, and consequently we don't care. We can monitor ensemble changes by watching the node in ZooKeeper. Using watches will make it difficult to implement an HBase backend for metadata as we proposed in another jira. Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail -- Key: BOOKKEEPER-112 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112 Project: Bookkeeper Issue Type: Bug Reporter: Flavio Junqueira Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5 Bookie recovery updates the ledger metadata in zookeeper. 
LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
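The recovery rule proposed in the comment above can be stated as a tiny predicate. This is only an illustration of the design argument, not BookKeeper code, and the names are hypothetical: a crashed bookie's fragment of a ledger is safe to recover only once the writer has published a metadata version whose ensemble no longer contains that bookie; until then, we assume the writer can still reach it.

```java
import java.util.List;

// Sketch of the recovery rule from the comment above (names are
// hypothetical): recovery of a suspect bookie's fragment is gated on the
// writer having removed that bookie from the ensemble in the ledger
// metadata stored in ZooKeeper.
class RecoveryGate {
    static boolean safeToRecover(List<String> currentEnsemble, String suspectBookie) {
        // If the writer removed the bookie and wrote the change to
        // ZooKeeper, recovering its fragment cannot race with the writer.
        return !currentEnsemble.contains(suspectBookie);
    }
}
```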
[jira] [Commented] (ZOOKEEPER-1225) Successive invocation of LeaderElectionSupport.start() will bring the ELECTED node to READY and cause no one in ELECTED state.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235116#comment-13235116 ] Flavio Junqueira commented on ZOOKEEPER-1225: - Hi Rakesh, One question here. In makeOffer, we create an ephemeral, sequential node. Consequently, multiple invocations of start() with the trunk code would create multiple znodes for the same client. Isn't it sufficient that we just consider the first one when we try to determine if a given client is the leader in determineElectionStatus()? Successive invocation of LeaderElectionSupport.start() will bring the ELECTED node to READY and cause no one in ELECTED state. -- Key: ZOOKEEPER-1225 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1225 Project: ZooKeeper Issue Type: Bug Components: recipes Affects Versions: 3.3.3 Reporter: Rakesh R Assignee: Rakesh R Fix For: 3.5.0 Attachments: ZOOKEEPER-1225.patch Presently there is no state validation for the start() API, so it can be invoked multiple times consecutively. A second or later invocation transitions the client node to the 'READY' state: an offer was already created during the first invocation of start(), so the second invocation calls makeOffer() again and, after leader determination, the node ends up in the READY state. This leaves no node in the 'ELECTED' state, and the client (the user of the election recipe) waits indefinitely for an 'ELECTED' node. Similarly, stop() has no state validation either, so it can dispatch unnecessary FAILED transition events. IMO, the LES recipe should have validation logic that rejects successive start() and stop() invocations. -- This message is automatically generated by JIRA. 
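The state validation being discussed can be sketched as a small guard. The class and method names below are illustrative, not the recipe's actual API: reject start() unless the component is stopped, and stop() unless it is started, so a second start() can never create a duplicate offer znode.

```java
// Minimal sketch of the proposed state validation for
// LeaderElectionSupport (illustrative names, not the real recipe):
// start() is legal only from STOPPED, stop() only from STARTED.
class ElectionGuard {
    enum State { STOPPED, STARTED }

    private State state = State.STOPPED;

    synchronized void start() {
        if (state != State.STOPPED) {
            throw new IllegalStateException("already started");
        }
        state = State.STARTED;
        // ... makeOffer() and determineElectionStatus() would run here
    }

    synchronized void stop() {
        if (state != State.STARTED) {
            throw new IllegalStateException("not started");
        }
        state = State.STOPPED;
        // ... offer znode cleanup would run here
    }
}
```

A second start() now fails fast instead of silently demoting the ELECTED node to READY.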
[jira] [Commented] (ZOOKEEPER-1411) Consolidate membership management and add client port information
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235124#comment-13235124 ] Flavio Junqueira commented on ZOOKEEPER-1411: - The way I understand the problem Rakesh is pointing out: say we have a NumberFormatException (the numeric string is broken) and serverParts.length == 3. In this situation, this patch will allow the server to continue, and it shouldn't. We should throw an exception in such a case instead of allowing the server to continue.
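The failure mode under discussion can be made concrete with a hedged sketch. This is not the actual QuorumPeerConfig parsing code; the class and field names are hypothetical. It parses a `host:quorumPort:electionPort[:type]` server line strictly, so a broken numeric field aborts configuration loading instead of letting the server continue with a partially parsed entry.

```java
// Hedged sketch (not the real ZooKeeper config parser): strict parsing of
// a "host:quorumPort:electionPort[:type]" server line. A malformed port
// rethrows with context rather than being swallowed.
class ServerLineParser {
    static final class ServerSpec {
        final String host;
        final int quorumPort;
        final int electionPort;
        final String type;

        ServerSpec(String host, int quorumPort, int electionPort, String type) {
            this.host = host;
            this.quorumPort = quorumPort;
            this.electionPort = electionPort;
            this.type = type;
        }
    }

    static ServerSpec parse(String line) {
        String[] parts = line.split(":");
        if (parts.length < 3 || parts.length > 4) {
            throw new IllegalArgumentException("bad server line: " + line);
        }
        try {
            int quorumPort = Integer.parseInt(parts[1]);
            int electionPort = Integer.parseInt(parts[2]);
            String type = (parts.length == 4) ? parts[3] : "participant";
            return new ServerSpec(parts[0], quorumPort, electionPort, type);
        } catch (NumberFormatException e) {
            // The point of the comment above: a broken numeric string must
            // abort startup, not be silently skipped.
            throw new IllegalArgumentException("bad port in server line: " + line, e);
        }
    }
}
```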
[jira] [Commented] (ZOOKEEPER-1419) Leader election never settles for a 5-node cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232739#comment-13232739 ] Flavio Junqueira commented on ZOOKEEPER-1419: - yes, only 3.4+ Leader election never settles for a 5-node cluster -- Key: ZOOKEEPER-1419 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1419 Project: ZooKeeper Issue Type: Bug Components: leaderElection Affects Versions: 3.4.3, 3.5.0 Environment: 64-bit Linux, all nodes running on the same machine (different ports) Reporter: Jeremy Stribling Assignee: Flavio Junqueira Priority: Blocker Fix For: 3.4.4, 3.5.0 Attachments: ZOOKEEPER-1419-fixed2.tgz, ZOOKEEPER-1419.patch, ZOOKEEPER-1419.patch, ZOOKEEPER-1419.patch We have a situation where it seems to my untrained eye that leader election never finishes for a 5-node cluster. In this test, all nodes are ZK 3.4.3 and running on the same server (listening on different ports, of course). The nodes have server IDs of 0, 1, 2, 3, 4. The test brings up the cluster in different configurations, adding in a new node each time. We embed ZK in our application, so when we shut a node down and restart it with a new configuration, it all happens in a single JVM process. 
Here's our server startup code (for the case where there's more than one node in the cluster): {code}
if (servers.size() > 1) {
    _log.debug("Starting Zookeeper server in quorum server mode");
    _quorum_peer = new QuorumPeer();
    synchronized(_quorum_peer) {
        _quorum_peer.setClientPortAddress(clientAddr);
        _quorum_peer.setTxnFactory(log);
        _quorum_peer.setQuorumPeers(servers);
        _quorum_peer.setElectionType(_election_alg);
        _quorum_peer.setMyid(_server_id);
        _quorum_peer.setTickTime(_tick_time);
        _quorum_peer.setInitLimit(_init_limit);
        _quorum_peer.setSyncLimit(_sync_limit);
        QuorumVerifier quorumVerifier = new QuorumMaj(servers.size());
        _quorum_peer.setQuorumVerifier(quorumVerifier);
        _quorum_peer.setCnxnFactory(_cnxn_factory);
        _quorum_peer.setZKDatabase(new ZKDatabase(log));
        _quorum_peer.start();
    }
} else {
    _log.debug("Starting Zookeeper server in single server mode");
    _zk_server = new ZooKeeperServer();
    _zk_server.setTxnLogFactory(log);
    _zk_server.setTickTime(_tick_time);
    _cnxn_factory.startup(_zk_server);
}
{code} And here's our shutdown code: {code}
if (_quorum_peer != null) {
    synchronized(_quorum_peer) {
        _quorum_peer.shutdown();
        FastLeaderElection fle = (FastLeaderElection) _quorum_peer.getElectionAlg();
        fle.shutdown();
        try {
            _quorum_peer.getTxnFactory().commit();
        } catch (java.nio.channels.ClosedChannelException e) {
            // ignore
        }
    }
} else {
    _cnxn_factory.shutdown();
    _zk_server.getTxnLogFactory().commit();
}
{code} The test steps through the following scenarios in quick succession: Run 1: Start a 1-node cluster, servers=[0] Run 2: Start a 2-node cluster, servers=[0,3] Run 3: Start a 3-node cluster, servers=[0,1,3] Run 4: Start a 4-node cluster, servers=[0,1,2,3] Run 5: Start a 5-node cluster, servers=[0,1,2,3,4] It appears that run 5 never elects a leader -- the nodes just keep spewing messages like this (example from node 0): {noformat} 2012-03-14 16:23:12,775 13308 [WorkerSender[myid=0]] DEBUG org.apache.zookeeper.server.quorum.QuorumCnxManager - There is a connection already for 
server 2 2012-03-14 16:23:12,776 13309 [QuorumPeer[myid=0]/127.0.0.1:2900] DEBUG org.apache.zookeeper.server.quorum.FastLeaderElection - Sending Notification: 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), 3 (recipient), 0 (myid), 0x2 (n.peerEpoch) 2012-03-14 16:23:12,776 13309 [WorkerSender[myid=0]] DEBUG org.apache.zookeeper.server.quorum.QuorumCnxManager - There is a connection already for server 3 2012-03-14 16:23:12,776 13309 [QuorumPeer[myid=0]/127.0.0.1:2900] DEBUG org.apache.zookeeper.server.quorum.FastLeaderElection - Sending Notification: 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), 4 (recipient), 0 (myid), 0x2 (n.peerEpoch) 2012-03-14 16:23:12,776 13309 [WorkerSender[myid=0]] DEBUG org.apache.zookeeper.server.quorum.QuorumCnxManager - There is a connection already for server 4 2012-03-14 16:23:12,776 13309 [WorkerReceiver[myid=0]] DEBUG org.apache.zookeeper.server.quorum.FastLeaderElection - Receive new notification message. My id = 0 2012-03-14 16:23:12,776 13309 [WorkerReceiver[myid=0]] INFO org.apache.zookeeper.server.quorum.FastLeaderElection - Notification: 4 (n.leader), 0x0 (n.zxid), 0x1
[jira] [Commented] (ZOOKEEPER-1419) Leader election never settles for a 5-node cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232251#comment-13232251 ] Flavio Junqueira commented on ZOOKEEPER-1419: - Hi Camille, The logic is incorrect in trunk. This case in the test I propose, inspired by Jeremy's run: {noformat} Assert.assertFalse (mock.predicate(4L, 0L, 0L, 3L, 0L, 2L)); {noformat} fails with the trunk code. It shouldn't return true when the new vote has an earlier epoch, and it does because we don't have the parentheses placed correctly. This was introduced with the ZAB 1.0 changes, btw. About the tests, I felt it was simpler and cleaner to have a more focused test instead of a test that tests the whole machinery. Do you think it is not sufficient? Leader election never settles for a 5-node cluster -- Key: ZOOKEEPER-1419 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1419 Project: ZooKeeper Issue Type: Bug Components: leaderElection Affects Versions: 3.4.3, 3.3.5, 3.5.0 Environment: 64-bit Linux, all nodes running on the same machine (different ports) Reporter: Jeremy Stribling Assignee: Flavio Junqueira Priority: Blocker Fix For: 3.3.6, 3.4.4, 3.5.0 Attachments: ZOOKEEPER-1419-fixed2.tgz, ZOOKEEPER-1419.patch, ZOOKEEPER-1419.patch, ZOOKEEPER-1419.patch We have a situation where it seems to my untrained eye that leader election never finishes for a 5-node cluster. In this test, all nodes are ZK 3.4.3 and running on the same server (listening on different ports, of course). The nodes have server IDs of 0, 1, 2, 3, 4. The test brings up the cluster in different configurations, adding in a new node each time. We embed ZK in our application, so when we shut a node down and restart it with a new configuration, it all happens in a single JVM process. 
Here's our server startup code (for the case where there's more than one node in the cluster): {code}
if (servers.size() > 1) {
    _log.debug("Starting Zookeeper server in quorum server mode");
    _quorum_peer = new QuorumPeer();
    synchronized(_quorum_peer) {
        _quorum_peer.setClientPortAddress(clientAddr);
        _quorum_peer.setTxnFactory(log);
        _quorum_peer.setQuorumPeers(servers);
        _quorum_peer.setElectionType(_election_alg);
        _quorum_peer.setMyid(_server_id);
        _quorum_peer.setTickTime(_tick_time);
        _quorum_peer.setInitLimit(_init_limit);
        _quorum_peer.setSyncLimit(_sync_limit);
        QuorumVerifier quorumVerifier = new QuorumMaj(servers.size());
        _quorum_peer.setQuorumVerifier(quorumVerifier);
        _quorum_peer.setCnxnFactory(_cnxn_factory);
        _quorum_peer.setZKDatabase(new ZKDatabase(log));
        _quorum_peer.start();
    }
} else {
    _log.debug("Starting Zookeeper server in single server mode");
    _zk_server = new ZooKeeperServer();
    _zk_server.setTxnLogFactory(log);
    _zk_server.setTickTime(_tick_time);
    _cnxn_factory.startup(_zk_server);
}
{code} And here's our shutdown code: {code}
if (_quorum_peer != null) {
    synchronized(_quorum_peer) {
        _quorum_peer.shutdown();
        FastLeaderElection fle = (FastLeaderElection) _quorum_peer.getElectionAlg();
        fle.shutdown();
        try {
            _quorum_peer.getTxnFactory().commit();
        } catch (java.nio.channels.ClosedChannelException e) {
            // ignore
        }
    }
} else {
    _cnxn_factory.shutdown();
    _zk_server.getTxnLogFactory().commit();
}
{code} The test steps through the following scenarios in quick succession: Run 1: Start a 1-node cluster, servers=[0] Run 2: Start a 2-node cluster, servers=[0,3] Run 3: Start a 3-node cluster, servers=[0,1,3] Run 4: Start a 4-node cluster, servers=[0,1,2,3] Run 5: Start a 5-node cluster, servers=[0,1,2,3,4] It appears that run 5 never elects a leader -- the nodes just keep spewing messages like this (example from node 0): {noformat} 2012-03-14 16:23:12,775 13308 [WorkerSender[myid=0]] DEBUG org.apache.zookeeper.server.quorum.QuorumCnxManager - There is a connection already for 
server 2 2012-03-14 16:23:12,776 13309 [QuorumPeer[myid=0]/127.0.0.1:2900] DEBUG org.apache.zookeeper.server.quorum.FastLeaderElection - Sending Notification: 3 (n.leader), 0x0 (n.zxid), 0x1 (n.round), 3 (recipient), 0 (myid), 0x2 (n.peerEpoch) 2012-03-14 16:23:12,776 13309 [WorkerSender[myid=0]] DEBUG org.apache.zookeeper.server.quorum.QuorumCnxManager - There is a connection already for server 3 2012-03-14 16:23:12,776 13309 [QuorumPeer[myid=0]/127.0.0.1:2900] DEBUG org.apache.zookeeper.server.quorum.FastLeaderElection - Sending Notification: 3 (n.leader),
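The parenthesization issue Flavio describes is easy to see in a standalone sketch of the vote-comparison predicate. This follows the shape of FastLeaderElection's totalOrderPredicate and the argument order of the test case quoted in the comment (newId, newZxid, newEpoch, curId, curZxid, curEpoch), but it is an illustration, not the actual trunk code: a new vote wins only on a strictly higher epoch, or equal epoch and higher zxid, or equal epoch and zxid and higher id.

```java
// Sketch of a correctly parenthesized vote-comparison predicate in the
// shape of FastLeaderElection#totalOrderPredicate (illustrative, not the
// trunk source). With the parentheses misplaced, a vote with an earlier
// epoch could incorrectly win.
class VotePredicate {
    static boolean totalOrderPredicate(long newId, long newZxid, long newEpoch,
                                       long curId, long curZxid, long curEpoch) {
        return (newEpoch > curEpoch)
            || ((newEpoch == curEpoch)
                && ((newZxid > curZxid)
                    || ((newZxid == curZxid) && (newId > curId))));
    }
}
```

The case from the comment, a new vote with id 4 but epoch 0 against a current vote with id 3 and epoch 2, must come out false.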
[jira] [Commented] (ZOOKEEPER-1277) servers stop serving when lower 32bits of zxid roll over
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229656#comment-13229656 ] Flavio Junqueira commented on ZOOKEEPER-1277: - Ok, I only looked at propose() as you suggested, Pat. That method sounds right: it forces a leader election when we reach the limit. However, I'm not sure how we guarantee that Zab will work correctly under this exception. It is an invariant of the protocol that a follower won't go back to a previous epoch; if we roll over, then followers will have to go back to a previous epoch, no? How do we make sure that it doesn't break the protocol implementation? servers stop serving when lower 32bits of zxid roll over Key: ZOOKEEPER-1277 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1277 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.3.3 Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Critical Fix For: 3.3.6 Attachments: ZOOKEEPER-1277_br33.patch, ZOOKEEPER-1277_br33.patch When the lower 32bits of a zxid roll over (zxid is a 64 bit number, however the upper 32 are considered the epoch number) the epoch number (upper 32 bits) are incremented and the lower 32 start at 0 again. This should work fine, however in the current 3.3 branch the followers see this as a NEWLEADER message, which it's not, and effectively stop serving clients. Attached clients seem to eventually time out given that heartbeats (or any operation) are no longer processed. The follower doesn't recover from this. I've tested this out on 3.3 branch and confirmed this problem, however I haven't tried it on 3.4/3.5. It may not happen on the newer branches due to ZOOKEEPER-335, however there is certainly an issue with updating the acceptedEpoch files contained in the datadir. (I'll enter a separate jira for that) -- This message is automatically generated by JIRA. 
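The arithmetic behind the rollover is worth spelling out. A zxid is a 64-bit number whose upper 32 bits are the epoch and whose lower 32 bits are a per-epoch counter; the sketch below (illustrative helper names, not ZooKeeper source) shows the split and the limit check that propose() uses to force a new leader election before the counter wraps into the epoch bits.

```java
// Epoch/counter split of a 64-bit zxid, as described above.
// Illustrative helpers, not the ZooKeeper implementation.
class Zxid {
    static long epoch(long zxid)   { return zxid >>> 32; }
    static long counter(long zxid) { return zxid & 0xffffffffL; }

    static long make(long epoch, long counter) {
        return (epoch << 32) | counter;
    }

    // True when the next increment would overflow the low 32 bits and
    // corrupt the epoch; at this point a new election must be forced.
    static boolean atRolloverLimit(long zxid) {
        return counter(zxid) == 0xffffffffL;
    }
}
```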
[jira] [Commented] (BOOKKEEPER-168) Message bounding on subscriptions
[ https://issues.apache.org/jira/browse/BOOKKEEPER-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229077#comment-13229077 ] Flavio Junqueira commented on BOOKKEEPER-168: - Although Ivan's argument sounds right to me, I also agree with Stu that a time-based collection can be useful in some scenarios. For example, in the case I tell my users that their messages will be there for at least a month. In fact, I believe we have considered a time-based mechanism at some point. If you guys think this is a useful feature, we can try to work it out perhaps in a separate jira. Message bounding on subscriptions - Key: BOOKKEEPER-168 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-168 Project: Bookkeeper Issue Type: New Feature Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 4.1.0 Attachments: BOOKKEEPER-168.diff, BOOKKEEPER-168.diff, BOOKKEEPER-168.diff In hedwig, messages for a subscription will queue up forever if the subscriber is offline. In some usecases, this is undesirable, as it will eventually mean resource exhaustion. In this JIRA we propose an optional change to the subscription contract, which allows the user to set a bound on the number of messages which will be queued for its subscription while it is offline. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-56) Race condition of message handler in connection recovery in Hedwig client
[ https://issues.apache.org/jira/browse/BOOKKEEPER-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228308#comment-13228308 ] Flavio Junqueira commented on BOOKKEEPER-56: Should we move this one to 4.2.0? Race condition of message handler in connection recovery in Hedwig client - Key: BOOKKEEPER-56 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-56 Project: Bookkeeper Issue Type: Bug Components: hedwig-client Affects Versions: 4.0.0 Reporter: Gavin Li Assignee: Gavin Li Fix For: 4.1.0 Attachments: patch_56 There's a race condition in the connection recovery logic in Hedwig client. The message handler user set might be overwritten incorrectly. When handling channelDisconnected event, we try to reconnect to Hedwig server. After the connection is created and subscribed, we'll call StartDelivery() to recover the message handler to the original one of the disconnected connection. But if during this process, user calls StartDelivery() to set a new message handler, it will get overwritten to the original one. The process can be demonstrated as below: || main thread || netty worker thread || | StartDelivery(messageHandlerA) | | | (connection Broken here, and recovered later...) | | | ResponseHandler::channelDisconnected() (connection disconnected event received) | | | new SubscribeReconnectCallback(subHandler.getMessageHandler()) (store messageHandlerA in SubscribeReconnectCallback to recover later) | | | client.doConnect() (try reconnect) | | | doSubUnsub() (resubscribe) | | | SubscriberResponseHandler::handleSubscribeResponse() (subscription succeeds) | | StartDelivery(messageHandlderB) | | | | SubscribeReconnectCallback::operationFinished() | | | StartDelvery(messageHandlerA) (messageHandler get overwritten) | I can stably reproduce this by simulating this race condition by put some sleep in ResponseHandler. 
I think essentially speaking we should not store messageHandler in ResponseHandler, since the message handler is supposed to be bound to connection. Instead, no matter which connection is in use, we should use the same messageHandler, the one user set last time. So I think we should change to store messageHandler in the HedwigSubscriber, in this way we don't need to recover the handler in connection recovery and thus won't face this race condition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
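Gavin's proposed fix can be sketched with plain JDK types. This is a simplified model, not the Hedwig client (the class and method names are illustrative): the handler lives in the subscriber as a single AtomicReference rather than being copied into each connection's response handler, so a reconnect reads whatever handler the user set last and can never overwrite a newer handler with the pre-disconnect one.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the fix proposed above (illustrative names): the message
// handler belongs to the subscriber, not to any particular connection,
// so connection recovery has nothing to restore and cannot clobber a
// handler set concurrently by the user.
class Subscriber {
    interface MessageHandler {
        void deliver(String msg);
    }

    private final AtomicReference<MessageHandler> handler = new AtomicReference<>();

    void startDelivery(MessageHandler h) {
        handler.set(h);
    }

    // Called by the (simulated) connection for each incoming message;
    // always goes through the subscriber-held reference.
    void onMessage(String msg) {
        MessageHandler h = handler.get();
        if (h != null) {
            h.deliver(msg);
        }
    }

    // On reconnect there is nothing to restore: the handler was never
    // tied to the old connection in the first place.
    void onReconnect() { }
}
```

In the race from the table above, a StartDelivery(messageHandlerB) that arrives between resubscription and the reconnect callback now stays in effect.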
[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
[ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220018#comment-13220018 ] Flavio Junqueira commented on BOOKKEEPER-112: - There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger. Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail -- Key: BOOKKEEPER-112 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112 Project: Bookkeeper Issue Type: Bug Reporter: Flavio Junqueira Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4 Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-173) Uncontrolled number of threads in bookkeeper
[ https://issues.apache.org/jira/browse/BOOKKEEPER-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214520#comment-13214520 ] Flavio Junqueira commented on BOOKKEEPER-173: - +1 on the idea of having a configuration option. Uncontrolled number of threads in bookkeeper Key: BOOKKEEPER-173 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-173 Project: Bookkeeper Issue Type: Bug Reporter: Philipp Sushkin I am not sure if it is a bug or not. Say I have a PC with 256 cores, and there is the following code in bookkeeper: {code:title=BookKeeper.java|borderStyle=solid} OrderedSafeExecutor callbackWorker = new OrderedSafeExecutor(Runtime.getRuntime().availableProcessors()); OrderedSafeExecutor mainWorkerPool = new OrderedSafeExecutor(Runtime .getRuntime().availableProcessors()); {code} As I understand it, callbackWorker is not used at all, so it could be removed. It could also be useful to have more control over mainWorkerPool (say, extract an interface and pass an instance through the constructor). Maybe there are other places in the library where thread pools are created without the ability to reuse existing thread pools in the application. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
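The configuration option being +1'd could look like the following sketch. The setting name and helper class are hypothetical, not BookKeeper's actual API: the application caps the client's worker pool explicitly, and only when nothing is configured does the client fall back to the old availableProcessors() behaviour, which is what hurts on a 256-core box.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the configuration option discussed above: let
// the application bound the worker pool size instead of always sizing it
// to the number of cores.
class WorkerPoolConfig {
    static int resolvePoolSize(Integer configuredThreads) {
        // Fall back to the old behaviour only when nothing sensible is
        // configured.
        return (configuredThreads != null && configuredThreads > 0)
                ? configuredThreads
                : Runtime.getRuntime().availableProcessors();
    }

    static ExecutorService newWorkerPool(Integer configuredThreads) {
        return Executors.newFixedThreadPool(resolvePoolSize(configuredThreads));
    }
}
```

Going further, as the reporter suggests, the constructor could accept an ExecutorService directly so the application can share one pool across clients.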
[jira] [Commented] (BOOKKEEPER-113) NPE In BookKeeper test
[ https://issues.apache.org/jira/browse/BOOKKEEPER-113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211752#comment-13211752 ] Flavio Junqueira commented on BOOKKEEPER-113: - In that long call chain, it looks like there are multiple places where we can get an NPE from. Also, checking for null and executing the operation are not atomic, so it can still break. That's my rationale at least... NPE In BookKeeper test -- Key: BOOKKEEPER-113 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-113 Project: Bookkeeper Issue Type: Bug Reporter: Flavio Junqueira Priority: Minor Attachments: BOOKKEEPER-113.patch This is not correctness issue, but it is ugly to throw an NPE there. {noformat} Running org.apache.bookkeeper.test.BookieFailureTest Nov 17, 2011 2:48:28 PM org.jboss.netty.channel.DefaultChannelFuture WARNING: An exception was thrown by ChannelFutureListener. java.lang.NullPointerException at org.apache.bookkeeper.proto.PerChannelBookieClient.addEntry(PerChannelBookieClient.java:231) at org.apache.bookkeeper.proto.BookieClient$1.operationComplete(BookieClient.java:85) at org.apache.bookkeeper.proto.BookieClient$1.operationComplete(BookieClient.java:78) at org.apache.bookkeeper.proto.PerChannelBookieClient$1.operationComplete(PerChannelBookieClient.java:158) at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:381) at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:372) at org.jboss.netty.channel.DefaultChannelFuture.setSuccess(DefaultChannelFuture.java:316) at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:767) at org.jboss.netty.channel.socket.nio.NioWorker.processRegisterTaskQueue(NioWorker.java:256) at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:198) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:680) {noformat} The fix should be trivial, though. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
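The "checking for null and executing the operation are not atomic" point can be shown in isolation. `Connection` and `Channel` below are invented stand-ins, not the Netty or BookKeeper types; the sketch shows why a bare check-then-act can still throw, and how copying the field into a local removes the NPE (the write may of course still fail for other reasons, such as a closed connection).

```java
// Illustrative only: why "check for null, then use" is not atomic on a
// field another thread may clear (e.g. on disconnect).
class Connection {
    volatile Channel channel;   // may be reset to null concurrently

    void addEntryUnsafe(byte[] entry) {
        if (channel != null) {      // check ...
            channel.write(entry);   // ... then act: channel may be null by now -> NPE
        }
    }

    void addEntrySafe(byte[] entry) {
        Channel c = channel;        // read the volatile field once into a local
        if (c != null) {
            c.write(entry);         // the local reference cannot change underneath us
        }
    }

    interface Channel { void write(byte[] data); }
}
```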
[jira] [Commented] (BOOKKEEPER-170) Bookie constructor starts a number of threads
[ https://issues.apache.org/jira/browse/BOOKKEEPER-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208410#comment-13208410 ] Flavio Junqueira commented on BOOKKEEPER-170: - It looks good overall. A few comments: # I'm not sure why you removed a number of calls to shutdown. Could you explain? # In Bookie, I thought that we were setting the value of this.zk further down in the constructor because we needed to wait until the bookie starts up. With this patch, we set it earlier. I'm trying to understand why that is not a problem... # This change here @@ -346,12 +350,11 does not seem to be necessary. # This is minor, but I was wondering if we should call startGC() something else, like initiate(). It leaks out information about EntryLogger. Bookie constructor starts a number of threads - Key: BOOKKEEPER-170 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-170 Project: Bookkeeper Issue Type: Bug Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 4.1.0 Attachments: BOOKKEEPER-170.diff Starting a thread in a constructor is bad[1]. Also, it makes unit testing on Bookie a bit of a pain. For this reason, I've refactored the thread-starting code out, so that to start the bookie you call start(), as you usually have to for a thread anyhow. As a bonus, it fixes some findbugs issues. [1] http://stackoverflow.com/questions/84285/calling-thread-start-within-its-own-constructor -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
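The refactor described in this issue can be sketched with a toy class (not the real Bookie): the constructor only builds state, and the caller starts the thread explicitly, mirroring the Thread.start() convention the patch adopts.

```java
// Toy sketch of "don't start threads in the constructor".
class Bookie {
    private final Thread worker;

    Bookie() {
        // The thread is created here but NOT started, so no partially
        // constructed "this" can be observed by the running thread.
        worker = new Thread(this::serve, "bookie-worker");
    }

    void start() {
        worker.start();   // caller decides when the bookie begins serving
    }

    void shutdown() {
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isRunning() {
        return worker.isAlive();
    }

    private void serve() {
        // stand-in for the bookie's real work loop
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```

This split is also what makes unit testing easier: a test can construct a Bookie, poke at its state, and never start any threads at all.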
[jira] [Commented] (BOOKKEEPER-162) LedgerHandle.readLastConfirmed does not work
[ https://issues.apache.org/jira/browse/BOOKKEEPER-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206807#comment-13206807 ] Flavio Junqueira commented on BOOKKEEPER-162: - Thanks for reporting and helping, Philipp. LedgerHandle.readLastConfirmed does not work Key: BOOKKEEPER-162 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-162 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Philipp Sushkin Assignee: Flavio Junqueira Priority: Critical Fix For: 4.1.0 Attachments: BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, bookkeeper.log Two bookkeeper clients. 1st continuously writing to ledger X. 2nd (bk.openLedgerNoRecovery) polling ledger X for new entries and reading them. In response we always receive 0 as the last confirmed entry id (in fact we are receiving -1 from each bookie's RecoveryData, but then in ReadLastConfirmedOp the uninitialized long maxAddConfirmed takes priority in Math.max(...)). Main question - is the given scenario expected to work at all? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
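The root cause the reporter points at can be reproduced in isolation. In Java an uninitialized `long` field defaults to 0, so folding the bookies' -1 hints through Math.max starting from that field can never yield -1. The class and method names below are invented for illustration, not the real ReadLastConfirmedOp code.

```java
// Sketch of the reported bug: a field default of 0 masks the bookies' -1 hints.
class ReadLastConfirmedSketch {
    long maxAddConfirmed;                    // field default is 0L, not -1L

    long foldBuggy(long[] hintsFromBookies) {
        long m = maxAddConfirmed;            // starts at 0
        for (long h : hintsFromBookies) m = Math.max(m, h);
        return m;                            // -1 hints can never win against 0
    }

    long foldFixed(long[] hintsFromBookies) {
        long m = -1L;                        // explicit "no entry confirmed" sentinel
        for (long h : hintsFromBookies) m = Math.max(m, h);
        return m;
    }
}
```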
[jira] [Commented] (BOOKKEEPER-162) LedgerHandle.readLastConfirmed does not work
[ https://issues.apache.org/jira/browse/BOOKKEEPER-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204472#comment-13204472 ] Flavio Junqueira commented on BOOKKEEPER-162: - Thanks, Sijie. If no one else has an issue with this patch, I'll commit it later today. LedgerHandle.readLastConfirmed does not work Key: BOOKKEEPER-162 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-162 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Philipp Sushkin Assignee: Flavio Junqueira Priority: Critical Fix For: 4.1.0 Attachments: BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, bookkeeper.log Two bookkeeper clients. 1st continuously writing to ledger X. 2nd (bk.openLedgerNoRecovery) polling ledger X for new entries and reading them. In response we always receive 0 as the last confirmed entry id (in fact we are receiving -1 from each bookie's RecoveryData, but then in ReadLastConfirmedOp the uninitialized long maxAddConfirmed takes priority in Math.max(...)). Main question - is the given scenario expected to work at all? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-162) LedgerHandle.readLastConfirmed does not work
[ https://issues.apache.org/jira/browse/BOOKKEEPER-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13203358#comment-13203358 ] Flavio Junqueira commented on BOOKKEEPER-162: - Agreed, it shouldn't return 0 when the ledger is empty. In the case where the ledger has one element, it should still return -1 (empty) according to the semantics of the call. readLastConfirmed returns the maximum hint across all bookies, and the hint for a ledger is the value of the last confirmed field in the last entry it wrote. Consequently, if there is only one entry written, the hint will say that there is no add confirmed before that one, which is correct. Perhaps if you need to know precisely which entries have been confirmed, you may want to have the writer communicate to the readers through ZooKeeper or directly (e.g., TCP). The readLastConfirmed mechanism gives an approximation of the state of the ledger, and is particularly useful when writing streams continuously. If you can say more about your use case, we may be able to help you decide, Philipp. LedgerHandle.readLastConfirmed does not work Key: BOOKKEEPER-162 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-162 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Philipp Sushkin Assignee: Flavio Junqueira Priority: Critical Fix For: 4.0.0 Attachments: BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, bookkeeper.log Two bookkeeper clients. 1st continuously writing to ledger X. 2nd (bk.openLedgerNoRecovery) polling ledger X for new entries and reading them. In response we always receive 0 as the last confirmed entry id (in fact we are receiving -1 from each bookie's RecoveryData, but then in ReadLastConfirmedOp the uninitialized long maxAddConfirmed takes priority in Math.max(...)). 
Main question - is the given scenario expected to work at all? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-162) LedgerHandle.readLastConfirmed does not work
[ https://issues.apache.org/jira/browse/BOOKKEEPER-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13203387#comment-13203387 ] Flavio Junqueira commented on BOOKKEEPER-162: - Check the API docs: http://zookeeper.apache.org/bookkeeper/docs/r4.0.0/apidocs/ I don't mind having more documentation added if the distinction is not clear. LedgerHandle.readLastConfirmed does not work Key: BOOKKEEPER-162 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-162 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Philipp Sushkin Assignee: Flavio Junqueira Priority: Critical Fix For: 4.1.0 Attachments: BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, bookkeeper.log Two bookkeeper clients. 1st continuously writing to ledger X. 2nd (bk.openLedgerNoRecovery) polling ledger X for new entries and reading them. In response we always receive 0 as the last confirmed entry id (in fact we are receiving -1 from each bookie's RecoveryData, but then in ReadLastConfirmedOp the uninitialized long maxAddConfirmed takes priority in Math.max(...)). Main question - is the given scenario expected to work at all? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1256) ClientPortBindTest is failing on Mac OS X
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13203421#comment-13203421 ] Flavio Junqueira commented on ZOOKEEPER-1256: - We should have committed this one to the 3.4.3 branch. The same patch applies. ClientPortBindTest is failing on Mac OS X - Key: ZOOKEEPER-1256 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1256 Project: ZooKeeper Issue Type: Bug Components: tests Environment: Mac OS X Reporter: Daniel Gómez Ferro Assignee: Daniel Gómez Ferro Fix For: 3.5.0 Attachments: ClientPortBindTest.log, ZOOKEEPER-1256.patch, ZOOKEEPER-1256.patch ClientPortBindTest is failing consistently on Mac OS X. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-162) LedgerHandle.readLastConfirmed does not work
[ https://issues.apache.org/jira/browse/BOOKKEEPER-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202521#comment-13202521 ] Flavio Junqueira commented on BOOKKEEPER-162: - Yes, we write lastAddConfirmed upon every add. No bookie alone knows whether an add has been confirmed to the client application or not. LedgerHandle.readLastConfirmed does not work Key: BOOKKEEPER-162 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-162 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Philipp Sushkin Assignee: Flavio Junqueira Priority: Critical Fix For: 4.0.0 Attachments: BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BOOKKEEPER-162.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, BookieReadWriteTest.java.patch, bookkeeper.log Two bookkeeper clients. 1st continuously writing to ledger X. 2nd (bk.openLedgerNoRecovery) polling ledger X for new entries and reading them. In response we always receive 0 as the last confirmed entry id (in fact we are receiving -1 from each bookie's RecoveryData, but then in ReadLastConfirmedOp the uninitialized long maxAddConfirmed takes priority in Math.max(...)). Main question - is the given scenario expected to work at all? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-145) Put notice and license file for distributed binaries in SVN
[ https://issues.apache.org/jira/browse/BOOKKEEPER-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13198918#comment-13198918 ] Flavio Junqueira commented on BOOKKEEPER-145: - +1 for me, but I think it would be a good idea to have someone else like Pat have a look at this. Put notice and license file for distributed binaries in SVN --- Key: BOOKKEEPER-145 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-145 Project: Bookkeeper Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 4.1.0 Attachments: BOOKKEEPER-145.diff During the 4.0.0 release I manually put these in the binary tarballs. This is awkward though. It's easier just to have them in the source tree. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-154) Garbage collect messages for those subscribers inactive/offline for a long time.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13197812#comment-13197812 ] Flavio Junqueira commented on BOOKKEEPER-154: - Ivan and I had an offline discussion about this issue, and here is a summary. We have not reached agreement on a solution, btw; this is just a bit of brainstorming. We find it better to have an application thread responsible for garbage-collecting messages from subscribers by consuming such messages. It is better in the sense that it avoids introducing functionality that is application specific. Assuming that this feature is implemented at the application level, we need a way for the application thread to determine: # the subscribers it needs to watch for; # the last time each of those subscribers has consumed a message. This information is in principle available through ZooKeeper, so one way of implementing this feature is to make the information in ZooKeeper available. Having the application access ZooKeeper directly sounds messy: it is prone to consistency problems to have the application manipulate the ZooKeeper metadata directly, and it is operationally more difficult (e.g., for open ports). One option is to expose it through Hubs. Exposing the ZooKeeper metadata via hubs doesn't solve the whole problem. Assuming millions of subscribers, such an application thread would have to loop through the subscribers frequently, inducing a high load. If we could use the watch functionality of ZooKeeper, then perhaps we could have the application thread build a local table of subscribers and update the table when anything changes. This way it has to loop through the same subscribers, but locally. Garbage collect messages for those subscribers inactive/offline for a long time. 
- Key: BOOKKEEPER-154 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-154 Project: Bookkeeper Issue Type: New Feature Components: hedwig-client, hedwig-server Affects Versions: 4.0.0 Reporter: Sijie Guo Currently hedwig tracks subscribers' progress for garbage collecting published messages. If a subscriber subscribes and stays offline without unsubscribing for a long time, the messages published to its topic have no chance to be garbage collected. A time-based garbage collection policy would be suitable for this case. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
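The "local table" idea from the discussion above can be sketched as follows. All names here are invented; in a real implementation the update() calls would be driven by ZooKeeper watch events on the subscriber metadata rather than invoked directly, so the GC thread loops over a local copy instead of hammering ZooKeeper.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a local mirror of subscriber consume positions,
// kept fresh by watch callbacks, queried cheaply by a GC thread.
class SubscriberTable {
    private final Map<String, Long> lastConsumed = new ConcurrentHashMap<>();

    // In a real system, a ZooKeeper watch callback would land here.
    void update(String subscriber, long lastConsumedSeqId) {
        lastConsumed.put(subscriber, lastConsumedSeqId);
    }

    // Messages below this sequence id have been consumed by every known
    // subscriber and are safe to garbage collect.
    long minConsumedSeqId() {
        return lastConsumed.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(-1L);
    }
}
```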
[jira] [Commented] (BOOKKEEPER-156) BookieJournalRollingTest failing
[ https://issues.apache.org/jira/browse/BOOKKEEPER-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13197910#comment-13197910 ] Flavio Junqueira commented on BOOKKEEPER-156: - I'm not sure if this is a problem anymore. I ran it on a computer with little memory and apparently it caused the same problem in other tests as well. After killing some apps and getting memory back, I can't reproduce the problem. But, if we should be using #startNewBookie instead, then I suggest we make the change. What do you think? BookieJournalRollingTest failing - Key: BOOKKEEPER-156 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-156 Project: Bookkeeper Issue Type: Bug Affects Versions: 4.0.0 Reporter: Flavio Junqueira Fix For: 4.1.0 Attachments: org.apache.bookkeeper.test.BookieJournalRollingTest-output.txt The test fails for me intermittently. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-154) Garbage collect messages for those subscribers inactive/offline for a long time.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13196947#comment-13196947 ] Flavio Junqueira commented on BOOKKEEPER-154: - The new api call would be to consume up to a given timestamp? My understanding of the requirement is that the application needs to be able to garbage collect old messages and needs a way of determining how old messages are, otherwise it doesn't know how far it should consume. A different way would be to consume based on size, in the case an application needs the ability to reduce the amount of state stored on a given topic to some value. Garbage collect messages for those subscribers inactive/offline for a long time. - Key: BOOKKEEPER-154 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-154 Project: Bookkeeper Issue Type: New Feature Components: hedwig-client, hedwig-server Affects Versions: 4.0.0 Reporter: Sijie Guo Currently hedwig tracks subscribers' progress for garbage collecting published messages. If a subscriber subscribes and stays offline without unsubscribing for a long time, the messages published to its topic have no chance to be garbage collected. A time-based garbage collection policy would be suitable for this case. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-156) BookieJournalRollingTest failing
[ https://issues.apache.org/jira/browse/BOOKKEEPER-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195729#comment-13195729 ] Flavio Junqueira commented on BOOKKEEPER-156: - I haven't turned on logging, so I only have the exception thrown for now: {noformat} --- Test set: org.apache.bookkeeper.test.BookieJournalRollingTest --- Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 46.702 sec FAILURE! testJournalRollingWithoutSyncup[0](org.apache.bookkeeper.test.BookieJournalRollingTest) Time elapsed: 1.887 sec ERROR! org.apache.bookkeeper.client.BKException$ZKException at org.apache.bookkeeper.client.BKException.create(BKException.java:64) at org.apache.bookkeeper.client.BookKeeper.createLedger(BookKeeper.java:293) at org.apache.bookkeeper.client.BookKeeper.createLedger(BookKeeper.java:260) at org.apache.bookkeeper.test.BookieJournalRollingTest.writeLedgerEntries(BookieJournalRollingTest.java:79) at org.apache.bookkeeper.test.BookieJournalRollingTest.testJournalRollingWithoutSyncup(BookieJournalRollingTest.java:206) {noformat} The creation of the ledger is failing, but this is not giving the exact error code. I'll run it again with logging on to see if we can get the error code. I suspect that it could be because the bookie servers have not started (restartBookies call) by the time we try to create the ledger in that test. BookieJournalRollingTest failing - Key: BOOKKEEPER-156 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-156 Project: Bookkeeper Issue Type: Bug Affects Versions: 4.0.0 Reporter: Flavio Junqueira Fix For: 4.1.0 The test fails for me intermittently. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189038#comment-13189038 ] Flavio Junqueira commented on ZOOKEEPER-1366: - Hi Ted, I was wondering if this change fits into the framework of ZOOKEEPER-702. Zookeeper should be tolerant of clock adjustments - Key: ZOOKEEPER-1366 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1366 Project: ZooKeeper Issue Type: Bug Reporter: Ted Dunning Fix For: 3.4.3 Attachments: ZOOKEEPER-1366.patch If you want to wreak havoc on a ZK based system just do [date -s +1hour] and watch the mayhem as all sessions expire at once. This shouldn't happen. Zookeeper could easily handle elapsed times as elapsed times rather than as differences between absolute times. The absolute times are subject to adjustment when the clock is set while a timer is not subject to this problem. In Java, System.currentTimeMillis() gives you absolute time while System.nanoTime() gives you time based on a timer from an arbitrary epoch. I have done this and have been running tests now for some tens of minutes with no failures. I will set up a test machine to redo the build again on Ubuntu and post a patch here for discussion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
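The wall-clock vs monotonic-timer distinction the report draws can be shown in a small sketch (the helper names are invented): System.currentTimeMillis() is absolute time and jumps when the clock is set, while System.nanoTime() is a monotonic timer from an arbitrary epoch, so only real elapsed time counts toward a timeout.

```java
// Sketch: two ways to check a session timeout.
class ElapsedTime {
    // Wall-clock deadline: a "date -s +1hour" makes the remaining time
    // negative instantly, expiring every session at once.
    static long remainingMillisWallClock(long startMillis, long timeoutMillis) {
        return (startMillis + timeoutMillis) - System.currentTimeMillis();
    }

    // Monotonic check: unaffected by clock adjustments.
    static boolean expired(long startNanos, long timeoutMillis) {
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        return elapsedMillis >= timeoutMillis;
    }
}
```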
[jira] [Commented] (ZOOKEEPER-1366) Zookeeper should be tolerant of clock adjustments
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189284#comment-13189284 ] Flavio Junqueira commented on ZOOKEEPER-1366: - You're talking about timers in your description, and we use timers for failure detection. The main problem is getting that patch in; there hasn't been enough agreement to get it committed. Zookeeper should be tolerant of clock adjustments - Key: ZOOKEEPER-1366 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1366 Project: ZooKeeper Issue Type: Bug Reporter: Ted Dunning Fix For: 3.4.3 Attachments: ZOOKEEPER-1366.patch If you want to wreak havoc on a ZK based system just do [date -s +1hour] and watch the mayhem as all sessions expire at once. This shouldn't happen. Zookeeper could easily handle elapsed times as elapsed times rather than as differences between absolute times. The absolute times are subject to adjustment when the clock is set while a timer is not subject to this problem. In Java, System.currentTimeMillis() gives you absolute time while System.nanoTime() gives you time based on a timer from an arbitrary epoch. I have done this and have been running tests now for some tens of minutes with no failures. I will set up a test machine to redo the build again on Ubuntu and post a patch here for discussion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-153) Ledger can't be opened or closed due to zero-length metadata
[ https://issues.apache.org/jira/browse/BOOKKEEPER-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186882#comment-13186882 ] Flavio Junqueira commented on BOOKKEEPER-153: - I think we have done it for simplicity, since we rely upon writeLedgerConfig to write ledger metadata, and writeLedgerConfig only invokes setData. I agree that it has the issue you're raising, though. If we keep relying upon writeLedgerConfig for writing the ledger metadata, then we will have a separation between creating it initially and writing the metadata (non-atomic). To fix the problem you're raising, it sounds like we have to either modify writeLedgerConfig or have a different code path for the initial write. Ledger can't be opened or closed due to zero-length metadata Key: BOOKKEEPER-153 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-153 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Currently, creating the ledger path and writing the ledger metadata are not done in a transaction, so if the bookkeeper client (the hub server uses the bookkeeper client) crashes, we can end up with a ledger in zookeeper with zero-length metadata that we can't open/close. We should create the ledger path with the initial metadata to avoid such a case. Besides that, we need to add code in openLedgerOp to handle zero-length metadata for backward compatibility. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-151) Delete method for Hedwig client API that uses eager cleanup
[ https://issues.apache.org/jira/browse/BOOKKEEPER-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185718#comment-13185718 ] Flavio Junqueira commented on BOOKKEEPER-151: - Hi Daniel, You're saying that the lazy cleanup is not the right way for you? I'm a bit concerned about having an eager delete call, since it might affect our guarantees to subscribers. In general, if there is a subscriber still subscribed, then we should be able to guarantee that the subscriber will receive published messages. If we have an eager delete, then we won't be able to fulfill that guarantee in the case we delete the topic and all related data. Delete method for Hedwig client API that uses eager cleanup --- Key: BOOKKEEPER-151 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-151 Project: Bookkeeper Issue Type: Wish Components: hedwig-client, hedwig-server Environment: Ubuntu / Centos Reporter: Daniel Kim Labels: features, new I am using hedwig as a notification system for my webapp. However, the current version of hedwig does not have any api for deleting topics. Since I want to be able to manage the resources, I tried to delete a topic by removing the znodes associated with the topic. However, the hubs do not lose their ownership of their deleted topics until the redistribution period (e.g., lazy cleanup). The hubs will behave as if they still own the topic, which has no information in zookeeper server(s). I am hoping to see a hedwig-client api that does eager delete. As of now, the system is unthrottled and thus can grow without bound. This poses a threat where my resources can run out by malicious use cases, rogue programmers, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-148) Jenkins build is failing
[ https://issues.apache.org/jira/browse/BOOKKEEPER-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184105#comment-13184105 ] Flavio Junqueira commented on BOOKKEEPER-148: - Thanks, Ivan! Committed revision 1230070. Jenkins build is failing Key: BOOKKEEPER-148 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-148 Project: Bookkeeper Issue Type: Bug Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 4.1.0 Attachments: BOOKKEEPER-148.diff This is due to running out of DirectBufferMemory in TestFencing which doesn't get garbage collected as normal memory does. TestFencing creates too many BookKeeper client instances, and this is what exhausts the buffers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1343) getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182895#comment-13182895 ] Flavio Junqueira commented on ZOOKEEPER-1343: - Hi Alex, I'd like to suggest that we create a new jira for the issue you're raising, especially if it is happening in your branch, and not in trunk. We may want to look more carefully into it before rushing into a fix. Otherwise, your observations make sense. getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch -- Key: ZOOKEEPER-1343 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1343 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.0 Reporter: Flavio Junqueira Assignee: Flavio Junqueira Priority: Critical Fix For: 3.4.3, 3.5.0 Attachments: ZOOKEEPER-1343-3.4.patch, ZOOKEEPER-1343.patch, ZOOKEEPER-1343.patch, ZOOKEEPER-1343.patch The following block in Leader.getEpochToPropose: {noformat} if (lastAcceptedEpoch > epoch) { epoch = lastAcceptedEpoch+1; } {noformat} needs to be fixed, since it doesn't increment the epoch variable in the case epoch != -1 (initial value) and lastAcceptedEpoch is equal. The fix is trivial and corresponds to changing > to >=. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-150) Entry is lost when recovering a ledger with not enough bookies.
[ https://issues.apache.org/jira/browse/BOOKKEEPER-150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179063#comment-13179063 ] Flavio Junqueira commented on BOOKKEEPER-150: - I agree, it sounds wrong to close the ledger in that case. Entry is lost when recovering a ledger with not enough bookies. --- Key: BOOKKEEPER-150 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-150 Project: Bookkeeper Issue Type: Bug Components: bookkeeper-client Affects Versions: 4.0.0 Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.1.0 Attachments: BOOKKEEPER-150.patch Suppose a ledger is created with ensemble size 3 and quorum size 3. 3 entries are added to this ledger, with entry ids 0, 1, 2. This ledger is not closed. Then a bookie server goes down and the ledger is opened. It would be recovered in the following steps: 1) retrieve the LAC from the whole bookie ensemble to get maxAddConfirmed. Then maxAddPushed would be 2 and maxAddConfirmed would be 1, so lastAddConfirmed would be 1. 2) doRecovery reads lastAddConfirmed + 1 (2). It would return the right data since there are still 2 replicas. 3) doRecovery adds entry 2, but it would fail since there are not enough bookies to form a new ensemble. 4) The ledger will be closed with lastAddConfirmed (1), and entry 2 will be lost. This issue happened in the hub server: the old ledger will be recovered and closed when changing ownership, so published messages would be lost. We should not close the ledger when we encounter an exception during the recovery add; otherwise we would lose entries. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1343) getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177165#comment-13177165 ] Flavio Junqueira commented on ZOOKEEPER-1343: - The last patch I uploaded was for 3.4, not trunk. That's why QA failed. getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch -- Key: ZOOKEEPER-1343 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1343 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.0 Reporter: Flavio Junqueira Assignee: Flavio Junqueira Priority: Critical Fix For: 3.5.0, 3.4.3 Attachments: ZOOKEEPER-1343-3.4.patch, ZOOKEEPER-1343.patch, ZOOKEEPER-1343.patch, ZOOKEEPER-1343.patch The following block in Leader.getEpochToPropose: {noformat} if (lastAcceptedEpoch > epoch) { epoch = lastAcceptedEpoch+1; } {noformat} needs to be fixed, since it doesn't increment the epoch variable in the case where epoch != -1 (initial value) and lastAcceptedEpoch is equal to epoch. The fix is trivial and corresponds to changing > to >=. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1343) getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175410#comment-13175410 ] Flavio Junqueira commented on ZOOKEEPER-1343: - The problem described occurs when the last accepted epoch e' of a peer p' is one more than the last accepted epoch e of another peer p (e' = e + 1). If getEpochToPropose is called for p before being called for p', then the value of epoch will be e' instead of e' + 1. I can try to take a cut at a test. I have already checked that the modification doesn't break anything. getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch -- Key: ZOOKEEPER-1343 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1343 Project: ZooKeeper Issue Type: Bug Reporter: Flavio Junqueira Assignee: Flavio Junqueira The following block in Leader.getEpochToPropose: {noformat} if (lastAcceptedEpoch > epoch) { epoch = lastAcceptedEpoch+1; } {noformat} needs to be fixed, since it doesn't increment the epoch variable in the case where epoch != -1 (initial value) and lastAcceptedEpoch is equal to epoch. The fix is trivial and corresponds to changing > to >=. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1343) getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175691#comment-13175691 ] Flavio Junqueira commented on ZOOKEEPER-1343: - Leader initializes epoch to -1, so the first call always changes the value of epoch. The subsequent calls are the ones that can cause the problem. getEpochToPropose should check if lastAcceptedEpoch is greater or equal than epoch -- Key: ZOOKEEPER-1343 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1343 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.0 Reporter: Flavio Junqueira Assignee: Flavio Junqueira Fix For: 3.5.0 Attachments: ZOOKEEPER-1343.patch The following block in Leader.getEpochToPropose: {noformat} if (lastAcceptedEpoch > epoch) { epoch = lastAcceptedEpoch+1; } {noformat} needs to be fixed, since it doesn't increment the epoch variable in the case where epoch != -1 (initial value) and lastAcceptedEpoch is equal to epoch. The fix is trivial and corresponds to changing > to >=. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1332) Zookeeper data is not in sync with quorum in the mentioned scenario
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172140#comment-13172140 ] Flavio Junqueira commented on ZOOKEEPER-1332: - Amith, Please check ZOOKEEPER-1319 and strongly consider using 3.4.1. If you agree it is the same problem, please mark this issue as a duplicate and close it. Zookeeper data is not in sync with quorum in the mentioned scenario --- Key: ZOOKEEPER-1332 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1332 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.4.0 Environment: 3 zookeeper quorum Reporter: amith Fix For: 3.4.1 Please check the below mentioned scenario: 1. Configure 3 zookeeper servers in quorum 2. Start zk1 (F) and zk2 (L); from a java client create a node (client connects to zk2) 3. Stop zk2 (L) 4. Start zk3; now FLE is successful but zookeeper-3 does not have the node created In step 4 Zookeeper-3 is getting a diff from the leader 2011-12-19 20:15:59,379 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Environment@98] - Server environment:user.home=/root 2011-12-19 20:15:59,379 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Environment@98] - Server environment:user.dir=/home/amith/OpenSrc/zookeeper/zookeeper3/bin 2011-12-19 20:15:59,381 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:ZooKeeperServer@168] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 4 datadir ../dataDir/version-2 snapdir ../dataDir/version-2 2011-12-19 20:15:59,382 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 102 2011-12-19 20:15:59,403 [myid:3] - INFO [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Learner@322] - Getting a diff from the leader 0x1000a 2011-12-19 20:15:59,449 [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:Learner@372] - Got zxid 0x1000a expected 0x1 2011-12-19 20:15:59,450 [myid:3] - INFO 
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2183:FileTxnSnapLog@255] - Snapshotting: 1000a. But the diff does not contain all the required data! Here I think zookeeper-3 should get a snapshot from the leader, not a diff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
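The complaint above is about how the leader decides between sending a DIFF and a full snapshot. Roughly, the choice depends on whether the follower's last zxid falls inside the leader's in-memory committed-log window. The following is a simplified, hypothetical sketch of that decision (the real logic lives in LearnerHandler and is considerably more involved; `chooseSync` and its parameters are illustrative names):

```java
// Hypothetical, simplified sketch of the DIFF-vs-SNAP decision.
// The real LearnerHandler logic also replays proposals and handles edge cases.
public class SyncChoice {
    enum Mode { DIFF, TRUNC, SNAP }

    static Mode chooseSync(long peerLastZxid, long minCommittedLog, long maxCommittedLog) {
        if (peerLastZxid > maxCommittedLog) {
            return Mode.TRUNC;   // follower is ahead of the leader: truncate its log
        }
        if (peerLastZxid >= minCommittedLog) {
            return Mode.DIFF;    // follower is inside the window: send the missing txns
        }
        return Mode.SNAP;        // follower is too far behind: send a full snapshot
    }

    public static void main(String[] args) {
        // A follower whose last zxid (0x1) predates the leader's committed-log
        // window should receive a snapshot, not a diff.
        System.out.println(chooseSync(0x1L, 0x10005L, 0x1000aL)); // SNAP
    }
}
```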
[jira] [Commented] (ZOOKEEPER-1292) FLETest is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170259#comment-13170259 ] Flavio Junqueira commented on ZOOKEEPER-1292: - The test that failed is org.apache.zookeeper.test.LETest.testLE, not FLETest, the one being changed in this patch. FLETest is flaky Key: ZOOKEEPER-1292 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1292 Project: ZooKeeper Issue Type: Improvement Components: leaderElection Reporter: Flavio Junqueira Assignee: Flavio Junqueira Fix For: 3.5.0 Attachments: ZOOKEEPER-1292.patch, ZOOKEEPER-1292.patch, ZOOKEEPER-1292.patch testLE in FLETest is convoluted, difficult to read, and doesn't test FLE appropriately. The goal of this jira is to clean it up and propose a more reasonable test case. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1319) Missing data after restarting+expanding a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165084#comment-13165084 ] Flavio Junqueira commented on ZOOKEEPER-1319: - Since the double NEWLEADER is harmless, it doesn't sound like a big deal to me to postpone to a future release. It is mainly inefficient to have it that way. At the same time, the fix seems to be simple enough that we could consider it. I don't think it causes backward compatibility problems. In my understanding, all that happens in the Learner is executing this twice: {noformat} case Leader.NEWLEADER: // it will be NEWLEADER in v1.0 zk.takeSnapshot(); self.setCurrentEpoch(newEpoch); snapshotTaken = true; writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true); break; } {noformat} I believe this is idempotent. Missing data after restarting+expanding a cluster - Key: ZOOKEEPER-1319 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1319 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.0 Environment: Linux (Debian Squeeze) Reporter: Jeremy Stribling Assignee: Patrick Hunt Priority: Blocker Labels: cluster, data Fix For: 3.5.0, 3.4.1 Attachments: ZOOKEEPER-1319.patch, ZOOKEEPER-1319.patch, ZOOKEEPER-1319_trunk.patch, ZOOKEEPER-1319_trunk2.patch, logs.tgz I've been trying to update to ZK 3.4.0 and have had some issues where some data become inaccessible after adding a node to a cluster. My use case is a bit strange (as explained before on this list) in that I try to grow the cluster dynamically by having an external program automatically restart Zookeeper servers in a controlled way whenever the list of participating ZK servers needs to change. This used to work just fine in 3.3.3 (and before), so this represents a regression. The scenario I see is this: 1) Start up a 1-server ZK cluster (the server has ZK ID 0). 2) A client connects to the server, and makes a bunch of znodes, in particular a znode called /membership. 
3) Shut down the cluster. 4) Bring up a 2-server ZK cluster, including the original server 0 with its existing data, and a new server with ZK ID 1. 5) Node 0 has the highest zxid and is elected leader. 6) A client connecting to server 1 tries to get /membership and gets back a -101 error code (no such znode). 7) The same client then tries to create /membership and gets back a -110 error code (znode already exists). 8) Clients connecting to server 0 can successfully get /membership. I will attach a tarball with debug logs for both servers, annotating where steps #1 and #4 happen. You can see that the election involves a proposal for zxid 110 from server 0, but immediately following the election server 1 has these lines: 2011-12-05 17:18:48,308 9299 [QuorumPeer[myid=1]/127.0.0.1:2901] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x10001 expected 0x1 2011-12-05 17:18:48,313 9304 [SyncThread:1] INFO org.apache.zookeeper.server.persistence.FileTxnLog - Creating new log file: log.10001 Perhaps that's not relevant, but it struck me as odd. At the end of server 1's log you can see a repeated cycle of getData-create-getData as the client tries to make sense of the inconsistent responses. The other piece of information is that if I try to use the on-disk directories for either of the servers to start a new one-node ZK cluster, all the data are accessible. I haven't tried writing a program outside of my application to reproduce this, but I can do it very easily with some of my app's tests if anyone needs more information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1319) Missing data after restarting+expanding a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164612#comment-13164612 ] Flavio Junqueira commented on ZOOKEEPER-1319: - I didn't see lastProposed commented out in the patch of ZOOKEEPER-1136. Have I misunderstood your comment, Pat? Missing data after restarting+expanding a cluster - Key: ZOOKEEPER-1319 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1319 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.0 Environment: Linux (Debian Squeeze) Reporter: Jeremy Stribling Assignee: Patrick Hunt Priority: Blocker Labels: cluster, data Fix For: 3.4.1 Attachments: logs.tgz I've been trying to update to ZK 3.4.0 and have had some issues where some data become inaccessible after adding a node to a cluster. My use case is a bit strange (as explained before on this list) in that I try to grow the cluster dynamically by having an external program automatically restart Zookeeper servers in a controlled way whenever the list of participating ZK servers needs to change. This used to work just fine in 3.3.3 (and before), so this represents a regression. The scenario I see is this: 1) Start up a 1-server ZK cluster (the server has ZK ID 0). 2) A client connects to the server, and makes a bunch of znodes, in particular a znode called /membership. 3) Shut down the cluster. 4) Bring up a 2-server ZK cluster, including the original server 0 with its existing data, and a new server with ZK ID 1. 5) Node 0 has the highest zxid and is elected leader. 6) A client connecting to server 1 tries to get /membership and gets back a -101 error code (no such znode). 7) The same client then tries to create /membership and gets back a -110 error code (znode already exists). 8) Clients connecting to server 0 can successfully get /membership. I will attach a tarball with debug logs for both servers, annotating where steps #1 and #4 happen. 
You can see that the election involves a proposal for zxid 110 from server 0, but immediately following the election server 1 has these lines: 2011-12-05 17:18:48,308 9299 [QuorumPeer[myid=1]/127.0.0.1:2901] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x10001 expected 0x1 2011-12-05 17:18:48,313 9304 [SyncThread:1] INFO org.apache.zookeeper.server.persistence.FileTxnLog - Creating new log file: log.10001 Perhaps that's not relevant, but it struck me as odd. At the end of server 1's log you can see a repeated cycle of getData-create-getData as the client tries to make sense of the inconsistent responses. The other piece of information is that if I try to use the on-disk directories for either of the servers to start a new one-node ZK cluster, all the data are accessible. I haven't tried writing a program outside of my application to reproduce this, but I can do it very easily with some of my app's tests if anyone needs more information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1319) Missing data after restarting+expanding a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164948#comment-13164948 ] Flavio Junqueira commented on ZOOKEEPER-1319: - +1, looks good, Pat. About the double occurrence of NEWLEADER, it happens because we insert NEWLEADER in outstandingRequests in Leader.lead() and queue a NEWLEADER message in LearnerHandler.run(). When we execute LearnerHandler.startForwarding() from LearnerHandler.run(), we queue the packets in outstandingRequests, including NEWLEADER. It is not necessary to send it again in startForwarding(), but we do need it in outstandingRequests to collect acks. Since we have to add it to outstandingRequests, one simple way to avoid it is by performing a check like this in startForwarding: {noformat} if(outstandingProposals.get(zxid).packet.getType() != NEWLEADER){ handler.queuePacket(outstandingProposals.get(zxid).packet); } {noformat} I have verified that by including this check, I can remove the double occurrence of NEWLEADER in Pat's patch and the test passes. We may want to consider this check in some later release. Missing data after restarting+expanding a cluster - Key: ZOOKEEPER-1319 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1319 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.0 Environment: Linux (Debian Squeeze) Reporter: Jeremy Stribling Assignee: Patrick Hunt Priority: Blocker Labels: cluster, data Fix For: 3.5.0, 3.4.1 Attachments: ZOOKEEPER-1319.patch, ZOOKEEPER-1319.patch, ZOOKEEPER-1319_trunk.patch, logs.tgz I've been trying to update to ZK 3.4.0 and have had some issues where some data become inaccessible after adding a node to a cluster. My use case is a bit strange (as explained before on this list) in that I try to grow the cluster dynamically by having an external program automatically restart Zookeeper servers in a controlled way whenever the list of participating ZK servers needs to change. 
This used to work just fine in 3.3.3 (and before), so this represents a regression. The scenario I see is this: 1) Start up a 1-server ZK cluster (the server has ZK ID 0). 2) A client connects to the server, and makes a bunch of znodes, in particular a znode called /membership. 3) Shut down the cluster. 4) Bring up a 2-server ZK cluster, including the original server 0 with its existing data, and a new server with ZK ID 1. 5) Node 0 has the highest zxid and is elected leader. 6) A client connecting to server 1 tries to get /membership and gets back a -101 error code (no such znode). 7) The same client then tries to create /membership and gets back a -110 error code (znode already exists). 8) Clients connecting to server 0 can successfully get /membership. I will attach a tarball with debug logs for both servers, annotating where steps #1 and #4 happen. You can see that the election involves a proposal for zxid 110 from server 0, but immediately following the election server 1 has these lines: 2011-12-05 17:18:48,308 9299 [QuorumPeer[myid=1]/127.0.0.1:2901] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x10001 expected 0x1 2011-12-05 17:18:48,313 9304 [SyncThread:1] INFO org.apache.zookeeper.server.persistence.FileTxnLog - Creating new log file: log.10001 Perhaps that's not relevant, but it struck me as odd. At the end of server 1's log you can see a repeated cycle of getData-create-getData as the client tries to make sense of the inconsistent responses. The other piece of information is that if I try to use the on-disk directories for either of the servers to start a new one-node ZK cluster, all the data are accessible. I haven't tried writing a program outside of my application to reproduce this, but I can do it very easily with some of my app's tests if anyone needs more information. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-31) Need a project logo
[ https://issues.apache.org/jira/browse/BOOKKEEPER-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163771#comment-13163771 ] Flavio Junqueira commented on BOOKKEEPER-31: +1, they look good. The one with digits in white has the shades of the digits a little twisted, though. It doesn't matter that much if the background is black, but it wouldn't hurt to fix it, just in case we end up using it with some dark background that is not black. Need a project logo --- Key: BOOKKEEPER-31 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-31 Project: Bookkeeper Issue Type: Improvement Reporter: Benjamin Reed Assignee: Benjamin Reed Attachments: bk_1.jpg, bk_2.jpg, bk_3.jpg, bk_4.jpg, bookeper_black_sm.png, bookeper_blk.png, bookeper_white_sm.png, bookeper_wht.png we need a logo for the project something that looks good in the big and the small and is easily recognizable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-53) race condition of outstandingMsgSet@SubscribeResponseHandler
[ https://issues.apache.org/jira/browse/BOOKKEEPER-53?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13158553#comment-13158553 ] Flavio Junqueira commented on BOOKKEEPER-53: Thanks, Ben. For completeness, I checked the documentation, and it must be a Boolean: http://docs.oracle.com/javase/6/docs/api/java/util/Collections.html race condition of outstandingMsgSet@SubscribeResponseHandler -- Key: BOOKKEEPER-53 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-53 Project: Bookkeeper Issue Type: Bug Components: hedwig-client Affects Versions: 4.0.0 Reporter: xulei Assignee: Flavio Junqueira Fix For: 4.0.0 Attachments: BOOKKEEPER-53.patch outstandingMsgSet is a Set, so it is not thread-safe. The detail is as below: MessageConsumeRetryTask is In a timer, so in timer thread, when the timer is up, it will cause a outstandingMsgSet add operation: MessageConsumeRetryTask.run() - outstandingMsgSet.add(message) - outstandingMsgSet.add(message) At the same time, in other thread(maybe main thread), there may be other operations of this outstandingMsgSet: MessageConsumeCallback.operationFinished() - messageConsumed(Message message) - outstandingMsgSet.remove(message); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
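A common fix for this kind of race, assumed here for illustration rather than taken from the attached patch, is to wrap the set with `Collections.synchronizedSet` so that `add` from the timer thread and `remove` from the consume-callback thread are mutually exclusive:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch: guarding a shared set against concurrent add/remove from two
// threads, as with outstandingMsgSet. Illustrative only, not the real patch.
public class SyncSetDemo {
    public static void main(String[] args) throws InterruptedException {
        Set<Integer> outstanding = Collections.synchronizedSet(new HashSet<>());

        Thread timer = new Thread(() -> {       // stands in for MessageConsumeRetryTask
            for (int i = 0; i < 10_000; i++) outstanding.add(i);
        });
        Thread consumer = new Thread(() -> {    // stands in for messageConsumed()
            for (int i = 0; i < 10_000; i++) outstanding.remove(i);
        });
        timer.start();
        consumer.start();
        timer.join();
        consumer.join();

        // The wrapper serializes each add/remove, so the set's internal state
        // stays consistent; drain whatever the consumer raced past.
        for (int i = 0; i < 10_000; i++) outstanding.remove(i);
        System.out.println(outstanding.isEmpty()); // true
    }
}
```

An unsynchronized `HashSet` under the same workload can corrupt its internal table or throw, which is exactly the hazard the comment describes.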
[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
[ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156707#comment-13156707 ] Flavio Junqueira commented on BOOKKEEPER-112: - I'm not sure I have a good way to guarantee that it always happens this way, but my feeling is that we shouldn't try to recover a bookie that participated in a ledger ensemble for which the ledger is still open. One way is to check that all ledger fragments to recover are from ledgers that have been closed already. We can do it by checking the ledger metadata stored in zookeeper. Bookie recovery proceeds only if all ledgers are closed. Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail -- Key: BOOKKEEPER-112 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112 Project: Bookkeeper Issue Type: Bug Reporter: Flavio Junqueira Assignee: Ivan Kelly Fix For: 4.0.0 Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata. Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery() -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
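The proposed guard can be sketched as follows. Note that `LedgerInfo`, its fields, and `canRecoverBookie` are hypothetical names for illustration; in practice the closed flag would come from the ledger metadata stored in ZooKeeper, as the comment suggests.

```java
import java.util.List;

// Hypothetical sketch of the proposed guard: bookie recovery proceeds only
// if every ledger fragment on the failed bookie belongs to a closed ledger.
// LedgerInfo is illustrative, not the real BookKeeper metadata class.
public class RecoveryGuard {
    static class LedgerInfo {
        final long ledgerId;
        final boolean closed;   // would be read from ledger metadata in ZooKeeper
        LedgerInfo(long ledgerId, boolean closed) {
            this.ledgerId = ledgerId;
            this.closed = closed;
        }
    }

    static boolean canRecoverBookie(List<LedgerInfo> ledgersOnBookie) {
        // Proceed only when no open ledger still references the failed bookie.
        return ledgersOnBookie.stream().allMatch(l -> l.closed);
    }

    public static void main(String[] args) {
        List<LedgerInfo> ledgers = List.of(
                new LedgerInfo(1L, true),
                new LedgerInfo(2L, false));   // still open: recovery must wait
        System.out.println(canRecoverBookie(ledgers)); // false
    }
}
```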
[jira] [Commented] (BOOKKEEPER-31) Need a project logo
[ https://issues.apache.org/jira/browse/BOOKKEEPER-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155144#comment-13155144 ] Flavio Junqueira commented on BOOKKEEPER-31: Looks great to me. Can we also have a version for a black background (letters in white and perhaps different tones for shades)? Need a project logo --- Key: BOOKKEEPER-31 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-31 Project: Bookkeeper Issue Type: Improvement Reporter: Benjamin Reed Assignee: Benjamin Reed Attachments: bk_1.jpg, bk_2.jpg, bk_3.jpg, bk_4.jpg, bookeper_black_sm.png, bookeper_white_sm.png we need a logo for the project something that looks good in the big and the small and is easily recognizable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-31) Need a project logo
[ https://issues.apache.org/jira/browse/BOOKKEEPER-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155480#comment-13155480 ] Flavio Junqueira commented on BOOKKEEPER-31: I like black backgrounds ;-( I'm otherwise happy. Need a project logo --- Key: BOOKKEEPER-31 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-31 Project: Bookkeeper Issue Type: Improvement Reporter: Benjamin Reed Assignee: Benjamin Reed Attachments: bk_1.jpg, bk_2.jpg, bk_3.jpg, bk_4.jpg, bookeper_black_sm.png, bookeper_white_sm.png we need a logo for the project something that looks good in the big and the small and is easily recognizable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1304) publish and subscribe methods get ServiceDownException even when the hubs, bookies, and zookeepers are running
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152470#comment-13152470 ] Flavio Junqueira commented on ZOOKEEPER-1304: - Daniel, You may have missed that Bookkeeper is now a subproject of ZooKeeper (zookeeper.apache.org/bookkeeper) and Hedwig is part of the BookKeeper code base. publish and subscribe methods get ServiceDownException even when the hubs, bookies, and zookeepers are running -- Key: ZOOKEEPER-1304 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1304 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.5.0 Environment: CentOS 5.5 for all servers and workstations (however zookeeper, bookies, and hubs are all built in Ubuntu 11); OpenJDK Runtime Environment (IcedTea6 1.9.10) (rhel-1.23.1.9.10.el5_7-i386); OpenJDK Client VM (build 19.0-b09, mixed mode); Reporter: Daniel Kim Original Estimate: 336h Remaining Estimate: 336h Since I couldn't finish building all hedwig components in CentOS, I built it successfully in Ubuntu, then I deployed it to CentOS (no ubuntu image in my company's cloud). I configured zookeeper, bookies and hubs as described in the documentation. First, I copied TestPubSubClient.java's publish and subscribe tests into my own test code. I also had to create another object that extends ClientConfiguration. I named it HedwigConf, and overrode the getDefaultServerHedwigSocketAddress() method because the server was not on the same machine as the workstation. I targeted the right host and publish seemed to work. However, it sometimes throws ServiceDownException on publish. I checked the logs of the hubs. They seem to have connected ok with the bookies. There was no error or warning there. However, the problem seemed to exist in bookies and zookeeper. 
This was found in the zookeeper log: Got user-level KeeperException when processing sessionid:0x--- type:create cxid:0x5 zxid:0x29 txntype:-1 reqpath:n/a Error Path:/hedwig/standalone/topics Error:KeeperErrorCode = NoNode for /hedwig/standalone/topics. Normally this znode path is created automatically. Also, some bookies complained this: WARN [NIOServerFactory] org.apache.bookkeeper.proto.NIOServerFactory - Exception in server socket loop: /0:0:0:0:0:0:0:0 java.lang.NullPointerException. For some reason, this problem comes and goes. Sometimes everything just works and the new topic is saved in a new znode, and the message is saved in bookie(s). I spent hours trying to recreate this yesterday, but I couldn't. Now it is back again. Subscribe seems to have the similar issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1277) servers stop serving when lower 32bits of zxid roll over
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149973#comment-13149973 ] Flavio Junqueira commented on ZOOKEEPER-1277: - The scenario I have in mind to say this is incorrect is more or less the following: # Leader L is currently in epoch 3 and it moves to epoch 4 in the way this patch proposes by simply adding 2 to hzxid. The leader proposes a transaction with zxid 4,1, which is acknowledged by some follower F, but not a quorum; # Concurrently, a new leader L' arises and selects 4 as its epoch (it hasn't talked to L or F); # L' proposes a transaction with zxid 4,1, which is different from the transaction L proposed with the same zxid and this transaction is acknowledged by a quorum; # L eventually gives up on leadership after noticing that it is not supported by a quorum; # L' crashes; # A new leader arises and its highest zxid is 4,1. It doesn't have to synchronize with any of the followers because they all have highest zxid 4,1. We have servers that have different transaction values for the same zxid, which constitutes an inconsistent state. servers stop serving when lower 32bits of zxid roll over Key: ZOOKEEPER-1277 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1277 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.3.3 Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Blocker Fix For: 3.3.4 Attachments: ZOOKEEPER-1277_br33.patch When the lower 32bits of a zxid roll over (zxid is a 64 bit number, however the upper 32 are considered the epoch number) the epoch number (upper 32 bits) are incremented and the lower 32 start at 0 again. This should work fine, however in the current 3.3 branch the followers see this as a NEWLEADER message, which it's not, and effectively stop serving clients. Attached clients seem to eventually time out given that heartbeats (or any operation) are no longer processed. 
The follower doesn't recover from this. I've tested this out on 3.3 branch and confirmed this problem, however I haven't tried it on 3.4/3.5. It may not happen on the newer branches due to ZOOKEEPER-335, however there is certainly an issue with updating the acceptedEpoch files contained in the datadir. (I'll enter a separate jira for that) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
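The epoch/counter packing discussed above can be made concrete. This is the standard 64-bit zxid layout described in the issue (epoch in the upper 32 bits, per-epoch counter in the lower 32), shown as a standalone sketch rather than code from the patch:

```java
// A 64-bit zxid packs the epoch into the upper 32 bits and a per-epoch
// counter into the lower 32; rolling over the counter must bump the epoch.
public class ZxidDemo {
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xFFFFFFFFL);
    }
    static long epochOf(long zxid)   { return zxid >>> 32; }
    static long counterOf(long zxid) { return zxid & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        long zxid = makeZxid(4, 1);                  // the "4,1" notation above
        System.out.println(Long.toHexString(zxid));  // 400000001
        System.out.println(epochOf(zxid));           // 4
        System.out.println(counterOf(zxid));         // 1
    }
}
```

The scenario in the comment is precisely about two leaders independently producing the same packed value makeZxid(4, 1) for different transactions, which is why epoch selection must go through agreement rather than local arithmetic.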
[jira] [Commented] (ZOOKEEPER-1277) servers stop serving when lower 32bits of zxid roll over
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13150047#comment-13150047 ] Flavio Junqueira commented on ZOOKEEPER-1277: - For a quorum setup, it sounds like a good place would be in ProposalRequestProcessor.proposeRequest(). For standalone, it sounds like we should be doing something along the lines of what you proposed in your patch. servers stop serving when lower 32bits of zxid roll over Key: ZOOKEEPER-1277 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1277 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.3.3 Reporter: Patrick Hunt Assignee: Patrick Hunt Priority: Blocker Fix For: 3.3.4 Attachments: ZOOKEEPER-1277_br33.patch When the lower 32bits of a zxid roll over (zxid is a 64 bit number, however the upper 32 are considered the epoch number) the epoch number (upper 32 bits) are incremented and the lower 32 start at 0 again. This should work fine, however in the current 3.3 branch the followers see this as a NEWLEADER message, which it's not, and effectively stop serving clients. Attached clients seem to eventually time out given that heartbeats (or any operation) are no longer processed. The follower doesn't recover from this. I've tested this out on 3.3 branch and confirmed this problem, however I haven't tried it on 3.4/3.5. It may not happen on the newer branches due to ZOOKEEPER-335, however there is certainly an issue with updating the acceptedEpoch files contained in the datadir. (I'll enter a separate jira for that) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1264) FollowerResyncConcurrencyTest failing intermittently
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144758#comment-13144758 ] Flavio Junqueira commented on ZOOKEEPER-1264: - I'll have a look. FollowerResyncConcurrencyTest failing intermittently Key: ZOOKEEPER-1264 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1264 Project: ZooKeeper Issue Type: Bug Components: tests Affects Versions: 3.3.3, 3.4.0, 3.5.0 Reporter: Patrick Hunt Assignee: Camille Fournier Priority: Blocker Fix For: 3.3.4, 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1264-34-bad.patch, ZOOKEEPER-1264-branch34.patch, ZOOKEEPER-1264-merge.patch, ZOOKEEPER-1264.patch, ZOOKEEPER-1264.patch, ZOOKEEPER-1264.patch, ZOOKEEPER-1264.patch, ZOOKEEPER-1264_branch33.patch, ZOOKEEPER-1264_branch34.patch, ZOOKEEPER-1264unittest.patch, ZOOKEEPER-1264unittest.patch, followerresyncfailure_log.txt.gz, logs.zip, tmp.zip The FollowerResyncConcurrencyTest test is failing intermittently. Saw the following on 3.4:
{noformat}
junit.framework.AssertionFailedError: Should have same number of ephemerals in both followers expected:<11741> but was:<14001>
	at org.apache.zookeeper.test.FollowerResyncConcurrencyTest.verifyState(FollowerResyncConcurrencyTest.java:400)
	at org.apache.zookeeper.test.FollowerResyncConcurrencyTest.testResyncBySnapThenDiffAfterFollowerCrashes(FollowerResyncConcurrencyTest.java:196)
	at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
{noformat}
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144171#comment-13144171 ] Flavio Junqueira commented on ZOOKEEPER-1270: - This is the code snippet for the handling of UPTODATE in Learner.java: {noformat} case Leader.UPTODATE: if (!snapshotTaken) { zk.takeSnapshot(); } self.cnxnFactory.setZooKeeperServer(zk); break outerLoop; {noformat} If the snapshot were completing, then it should be setting the zookeeper server. Given that we see the Snapshotting message in the logs, I assume that the follower is receiving an UPTODATE message, but it is failing to complete the snapshot. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz, testEarlyLeaderAbandonment3.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144237#comment-13144237 ] Flavio Junqueira commented on ZOOKEEPER-1270: - Here is some progress. I was actually looking at the wrong snippet. The correct one was the NEWLEADER handler:
{noformat}
case Leader.NEWLEADER: // it will be NEWLEADER in v1.0
    zk.takeSnapshot();
    snapshotTaken = true;
    writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true);
    break;
}
{noformat}
We also take a snapshot here, and by looking at the stack trace that Pat posted, we see that the learner handlers are stuck in the loop right after receiving the ack, which essentially waits for the leader to start. By the same stack trace, the leader is not starting because it is waiting for the followers to acknowledge the NEWLEADER message... but the followers have acknowledged the NEWLEADER message, otherwise the learner handlers wouldn't be executing that loop (Line 450). Unless I'm missing something, the problem must be in Leader.processAck. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz, testEarlyLeaderAbandonment3.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144244#comment-13144244 ] Flavio Junqueira commented on ZOOKEEPER-1270: - Here are the two cases that would cause the ack not to be added without generating any log message:
{noformat}
if (outstandingProposals.size() == 0) {
    if (LOG.isDebugEnabled()) {
        LOG.debug("outstanding is 0");
    }
    return;
}
if (lastCommitted >= zxid) {
    if (LOG.isDebugEnabled()) {
        LOG.debug("proposal has already been committed, pzxid: "
                + lastCommitted + " zxid: 0x" + Long.toHexString(zxid));
    }
    // The proposal has already been committed
    return;
}
{noformat}
I'm trying to decide which one is likely to be causing it to return. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz, testEarlyLeaderAbandonment3.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144272#comment-13144272 ] Flavio Junqueira commented on ZOOKEEPER-1270: - My latest conjecture is that we are getting acks for NEWLEADER before we have a chance to get it into the outstandingProposals queue, which would cause the ack to be dropped and the behavior we are observing. I tried to move the code in lead that adds NEWLEADER to the queue to before the point where we start the cnxAcceptor, but it broke a lot of tests, so I don't think this is the right approach. Pat, if you want to run again, I would suggest to put two info messages for the cases I posted last in Leader.processAck() to determine exactly which case is being triggered and preventing the leader from collecting the ack. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz, testEarlyLeaderAbandonment3.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
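The conjecture above (an ACK arriving before the proposal has been registered, so the early returns in processAck silently drop it) can be modeled with a minimal sketch. `AckTracker` is a hypothetical stand-in for the Leader's bookkeeping, not the actual ZooKeeper code:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

// Minimal model of the ack-dropping race: if an ACK arrives before the
// proposal is registered in the outstanding map, processing returns early
// and the ack is lost, so a quorum is never observed.
class AckTracker {
    final ConcurrentSkipListMap<Long, Set<Long>> outstanding = new ConcurrentSkipListMap<>();
    long lastCommitted = 0;

    void propose(long zxid) {
        outstanding.put(zxid, new HashSet<>());
    }

    // Returns true iff the ack was actually counted.
    boolean processAck(long sid, long zxid) {
        if (outstanding.isEmpty()) return false;   // "outstanding is 0": dropped
        if (lastCommitted >= zxid) return false;   // already committed: dropped
        Set<Long> acks = outstanding.get(zxid);
        if (acks == null) return false;            // unknown proposal: dropped
        acks.add(sid);
        return true;
    }
}
```

With this model, an ack delivered before `propose()` is simply discarded, while the same ack delivered after registration is counted, which matches the behavior being debugged here.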
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144334#comment-13144334 ] Flavio Junqueira commented on ZOOKEEPER-1270: - It is ok to have the zero outstanding message, since the request might have been already acknowledged by a quorum. It is a problem if you get it and it hasn't been acknowledged by a quorum, which corresponds to the case I mentioned above. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz, testEarlyLeaderAbandonment3.txt.gz, testEarlyLeaderAbandonment4.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13144416#comment-13144416 ] Flavio Junqueira commented on ZOOKEEPER-1270: - Camille, It sounds right that we need multiple acks from the follower, but it sounds awkward that they are acknowledging the same message. Perhaps this is to guarantee a correct implementation and for backward compatibility? testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270.patch, ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz, testEarlyLeaderAbandonment3.txt.gz, testEarlyLeaderAbandonment4.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-105) A Bookkeeper can only open one LedgerHandle to a specific ledger at a time, if it expects them to work
[ https://issues.apache.org/jira/browse/BOOKKEEPER-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144031#comment-13144031 ] Flavio Junqueira commented on BOOKKEEPER-105: - Ivan, Is there a use case for this, or are you simply planning on preventing such cases from happening by failing one call to open? A Bookkeeper can only open one LedgerHandle to a specific ledger at a time, if it expects them to work -- Key: BOOKKEEPER-105 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-105 Project: Bookkeeper Issue Type: Bug Reporter: Ivan Kelly If you open two ledger handles pointing to the same ledger, using the same client, you will not be able to read from both. This is due to them sharing PerChannelBookieClient instances. PerChannelBookieClient has a member
{code}
ConcurrentHashMap<CompletionKey, ReadCompletion> readCompletions =
    new ConcurrentHashMap<CompletionKey, ReadCompletion>();
{code}
where CompletionKey is the ledgerId and entryId. If both LedgerHandles try to read the same entryId, they'll overwrite each other's entries in this hashmap.
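The collision described in the issue can be reproduced with a minimal model. The simplified `CompletionKey` and `ReadDemo` below are illustrative only, not the actual PerChannelBookieClient code:

```java
import java.util.concurrent.ConcurrentHashMap;

// Minimal model of the collision: the key is (ledgerId, entryId), so two
// LedgerHandles reading the same entry of the same ledger share one map slot
// and the second pending read overwrites the first one's completion callback.
class CompletionKey {
    final long ledgerId, entryId;
    CompletionKey(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }
    @Override public boolean equals(Object o) {
        if (!(o instanceof CompletionKey)) return false;
        CompletionKey k = (CompletionKey) o;
        return k.ledgerId == ledgerId && k.entryId == entryId;
    }
    @Override public int hashCode() {
        return Long.hashCode(ledgerId * 31 + entryId);
    }
}

class ReadDemo {
    static final ConcurrentHashMap<CompletionKey, String> readCompletions =
            new ConcurrentHashMap<>();

    public static void main(String[] args) {
        readCompletions.put(new CompletionKey(7, 0), "handle-A callback");
        // A second handle reads the same (ledger, entry): same key, so the
        // first handle's pending completion is clobbered.
        String prev = readCompletions.put(new CompletionKey(7, 0), "handle-B callback");
        System.out.println("overwrote: " + prev);
    }
}
```

Because both handles funnel through the same shared client instance, whichever read completes second finds its callback gone (or the wrong one), which is why only one open LedgerHandle per ledger behaves correctly.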
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143272#comment-13143272 ] Flavio Junqueira commented on ZOOKEEPER-1270: - For coordination purposes, I'm having a look at this issue. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143390#comment-13143390 ] Flavio Junqueira commented on ZOOKEEPER-1270: - Here is my progress so far. Mahadev is right, it sounds like we are not setting the zkServer of ServerCnxnFactory, which we do both in Learner and Leader by calling self.cnxnFactory.setZooKeeperServer(zk). We should also be seeing a message like:
{noformat}
Have quorum of supporters; starting up and setting last processed zxid:
{noformat}
in the log file Pat posted, but we aren't, which implies that the leader establishment procedure is not completing. Is this a timing issue? I'm skeptical about it being a timing issue because we wait 10 seconds for the waitForAll call to complete, but I'm not sure whether that is completely unrealistic or not, assuming that the jenkins machine is overloaded. Is there anything I'm missing here? I'll keep investigating... testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143400#comment-13143400 ] Flavio Junqueira commented on ZOOKEEPER-1270: - I forgot to mention that I haven't been able to reproduce it, and I have been running for the past few hours. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1158) C# client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143532#comment-13143532 ] Flavio Junqueira commented on ZOOKEEPER-1158: - Joshua, This issue is marked for 3.5.0, so it should go into the next release, not the one we are working on currently. C# client - Key: ZOOKEEPER-1158 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1158 Project: ZooKeeper Issue Type: Improvement Reporter: Eric Hauser Assignee: Eric Hauser Fix For: 3.5.0 Attachments: ZOOKEEPER-1158-09032011.patch, ZOOKEEPER-1158-09082011-2.patch, log4net.dll, nunit.framework.dll Native C# client for ZooKeeper. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1270) testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13143567#comment-13143567 ] Flavio Junqueira commented on ZOOKEEPER-1270: - I haven't been able to reproduce the problem, but I'll leave it running in a loop. On your comment about ClientBase, Camille, I was wondering if you have any other insight. testEarlyLeaderAbandonment failing intermittently, quorum formed, no serving. - Key: ZOOKEEPER-1270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1270 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Patrick Hunt Priority: Blocker Fix For: 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1270tests.patch, ZOOKEEPER-1270tests2.patch, testEarlyLeaderAbandonment.txt.gz, testEarlyLeaderAbandonment2.txt.gz Looks pretty serious - quorum is formed but no clients can attach. Will attach logs momentarily. This test was introduced in the following commit (all three jira commit at once): ZOOKEEPER-335. zookeeper servers should commit the new leader txn to their logs. ZOOKEEPER-1081. modify leader/follower code to correctly deal with new leader ZOOKEEPER-1082. modify leader election to correctly take into account current -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-81) disk space of garbage collected entry logger files isn't reclaimed until process quit
[ https://issues.apache.org/jira/browse/BOOKKEEPER-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142962#comment-13142962 ] Flavio Junqueira commented on BOOKKEEPER-81: This is mostly good, Sijie. Just a couple of quick comments:
# Since we have been working hard to clean up the API, I was wondering if fileChannel() really needs to be public;
# I don't think we have guidelines for writing log messages, but I think it would be nice to make messages more concise in general and to try to pick some key words that help us spot problems more easily. For example, I would rather say "Exception while closing..." than "Trying to close...". This is really a small issue, though.
disk space of garbage collected entry logger files isn't reclaimed until process quit - Key: BOOKKEEPER-81 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-81 Project: Bookkeeper Issue Type: Bug Affects Versions: 3.4.0 Reporter: Sijie Guo Assignee: Sijie Guo Fix For: 4.0.0 Attachments: bookkeeper-81.patch Disk space of garbage collected entry logger files isn't reclaimed until the process quits. This is caused by the entry logger not closing the file channel of garbage collected files: the process keeps a reference to each such file, and the filesystem only reclaims the space when the process quits.
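The fix discussed in this issue amounts to closing the open channel of a garbage-collected entry log before deleting it, so the filesystem can actually reclaim the space. A hedged sketch follows; `EntryLogReclaim` and `removeEntryLog` are hypothetical names, not the real EntryLogger API:

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;

// Sketch: on POSIX filesystems a deleted file's blocks are only freed once the
// last open descriptor is closed, so the channel must be closed before (or at
// latest, alongside) deleting the garbage-collected entry log.
class EntryLogReclaim {
    static void removeEntryLog(File logFile, FileChannel channel) {
        try {
            channel.close(); // release the open handle first
        } catch (Exception e) {
            System.err.println("Exception while closing " + logFile + ": " + e);
        }
        if (!logFile.delete()) {
            System.err.println("Could not delete entry log " + logFile);
        }
    }
}
```

Without the `close()`, the `delete()` still succeeds from the application's point of view, but the space stays allocated until the bookie process exits, which is exactly the symptom in this issue's title.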
[jira] [Commented] (BOOKKEEPER-83) Added versioning and flags to the bookie protocol
[ https://issues.apache.org/jira/browse/BOOKKEEPER-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13136965#comment-13136965 ] Flavio Junqueira commented on BOOKKEEPER-83: +1, looks good. Added versioning and flags to the bookie protocol - Key: BOOKKEEPER-83 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-83 Project: Bookkeeper Issue Type: Improvement Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 4.0.0 Attachments: BOOKKEEPER-83.diff There is no concept of versions in the BookKeeper protocol at the moment. This patch addresses that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-82) support journal rolling
[ https://issues.apache.org/jira/browse/BOOKKEEPER-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129292#comment-13129292 ] Flavio Junqueira commented on BOOKKEEPER-82: Sijie, I'll review more carefully and give you comments. In the meanwhile, would you mind putting your patch on the review board and perhaps writing a test for this feature? support journal rolling --- Key: BOOKKEEPER-82 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-82 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-server Affects Versions: 3.4.0 Reporter: Sijie Guo Assignee: Sijie Guo Attachments: bookkeeper-82.patch now bookkeeper is writing a single journal file, so the journal file has no chance to be garbage collected and the disk space keeps growing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (BOOKKEEPER-36) Client backpressure
[ https://issues.apache.org/jira/browse/BOOKKEEPER-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124515#comment-13124515 ] Flavio Junqueira commented on BOOKKEEPER-36: We might want to do it by bytes instead of number of requests. Also, throttling is currently done per ledger handle and not overall. Client backpressure --- Key: BOOKKEEPER-36 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-36 Project: Bookkeeper Issue Type: Improvement Components: bookkeeper-client Reporter: Flavio Junqueira Assignee: Flavio Junqueira Priority: Critical The way we currently throttle on the client is by counting the number of outstanding operations on LedgerHandle, and having the application select what an appropriate value is. This is not a good way of doing it because the application has to guess what a good value is. We need to implement some form of backpressure instead, to make sure we throttle only when the system is saturated.
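Throttling by bytes rather than by request count, as suggested in the comment above, could look roughly like the sketch below: a single semaphore shared across all ledger handles of one client, whose permits represent in-flight bytes. All names here are hypothetical:

```java
import java.util.concurrent.Semaphore;

// Sketch of byte-based client throttling: acquire permits equal to the entry
// size before sending, release them when the add/read completes. Because the
// semaphore is shared, the limit is overall, not per ledger handle.
class ByteThrottle {
    private final Semaphore inFlightBytes;

    ByteThrottle(int maxInFlightBytes) {
        inFlightBytes = new Semaphore(maxInFlightBytes);
    }

    void beforeSend(int entrySize) throws InterruptedException {
        inFlightBytes.acquire(entrySize); // blocks the caller when saturated
    }

    void onComplete(int entrySize) {
        inFlightBytes.release(entrySize);
    }

    int available() {
        return inFlightBytes.availablePermits();
    }
}
```

Compared with counting outstanding operations, this naturally accounts for large entries costing more than small ones; true backpressure (throttling only when the system is saturated, as the issue asks for) would additionally need feedback from the bookies rather than a fixed budget.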
[jira] [Commented] (ZOOKEEPER-1215) C client persisted cache
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123025#comment-13123025 ] Flavio Junqueira commented on ZOOKEEPER-1215: - Thanks for your proposal, Marc. I have a few comments about it. This proposal states that the goal is to reduce the traffic to zookeeper, but I'm not sure how you achieve it. You seem to assume that applications execute gets without setting watches for znodes they access frequently. I would think that any application carefully designed will set watches on znodes for which they need to know of changes. In that case, your proposal becomes a facility to help the application to manage. Is it the case? Does this comment make sense to you? Do you see it differently or have a use case? Assuming that it still makes sense, can't you implement it as a layer on top without modifying the API? C client persisted cache Key: ZOOKEEPER-1215 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1215 Project: ZooKeeper Issue Type: New Feature Components: c client Reporter: Marc Celani Assignee: Marc Celani Motivation:
1. Reduce the impact of client restarts on zookeeper by implementing a persisted cache, and only fetching deltas on restart
2. Reduce unnecessary calls to zookeeper.
3. Improve performance of gets by caching on the client
4. Allow for larger caches than in memory caches.
Behavior Change: Zookeeper clients will now have the option to specify a folder path where they can cache zookeeper gets. If they do choose to cache results, the zookeeper library will check the persisted cache before actually sending a request to zookeeper. Watches will automatically be placed on all gets in order to invalidate the cache. Alternatively, we can add a cache flag to the get API - thoughts? On reconnect or restart, zookeeper clients will check the version number of each entry in its persisted cache, and will invalidate any old entries.
While checking version numbers, zookeeper clients will also place a watch on those files. With regard to watches, client watch handlers will not fire until the invalidation step is completed, which may slow down client watch handling. Since setting up watches on all files is necessary on initialization, initialization will likely slow down as well. API Change: The zookeeper library will expose a new init interface that specifies a folder path for the cache. A new get API will specify whether or not to use the cache, and whether or not stale data is safe to return if the connection is down. Design: The zookeeper handler structure will now include a cache_root_path (possibly null) string used to cache all gets, as well as a bool for whether or not it is okay to serve stale data. Old API calls will default to a null path (which signifies no cache) and to not serving stale data. The cache will be located at cache_root_path. All files will be placed at cache_root_path/file_path. The cache will be an incomplete copy of everything that is in zookeeper, but everything in the cache will have the same relative path from cache_root_path that it has as a path in zookeeper. Each file in the cache will include the Stat structure and the file contents. zoo_get will check the zookeeper handler to determine whether or not it has a cache. If it does, it will first look up the file at the persisted cache path with the get path appended. If the file exists and is not invalidated, the zookeeper client will read it and return its value. If the file does not exist or is invalidated, the zookeeper library will perform the same get as is currently designed. After getting the results, the library will place the value in the persisted cache for subsequent reads. zoo_set will automatically invalidate the path in the cache. If caching is requested, then on each zoo_get that goes through to zookeeper, a watch will be placed on the path.
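The path mapping and lookup described above can be sketched as follows. This is only an illustration of the proposal's cache-before-zoo_get flow, not code from the patch; the function names (cache_path_for, cache_try_get) are hypothetical.

```c
#include <stdio.h>

/* Hypothetical sketch: a cached znode keeps the same relative path under
 * cache_root_path that it has in ZooKeeper, i.e. cache_root_path/file_path. */
static int cache_path_for(const char *cache_root, const char *znode_path,
                          char *out, size_t outlen) {
    int n = snprintf(out, outlen, "%s%s", cache_root, znode_path);
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}

/* A cached read: serve from the local file if present; on a miss (or an
 * invalidated entry) the caller would fall through to a real zoo_get(). */
static int cache_try_get(const char *cache_root, const char *znode_path,
                         char *buf, size_t buflen) {
    char path[1024];
    if (cache_path_for(cache_root, znode_path, path, sizeof(path)) != 0)
        return -1;
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;                       /* miss: go to ZooKeeper */
    size_t n = fread(buf, 1, buflen - 1, f);
    buf[n] = '\0';
    fclose(f);
    return 0;                            /* hit: value served locally */
}
```

A real implementation would also read the stored Stat structure and an invalidation marker before serving the file, which this sketch omits.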
A cache watch handler will handle all watch events by invalidating the cache entry and placing another watch on it. Client watch handlers will handle the watch event after the cache watch handler. The cache watch handler will not call zoo_get, because it is assumed that the client watch handlers will call zoo_get if they need the fresh data as soon as it is invalidated (which is why the cache watch handler must be executed first). All updates to the cache will be done on a separate thread, but will be queued in order to maintain consistency in the cache. In addition, no client watch handlers will be fired until the cache watch handler completes its invalidation write, in order to ensure that client calls to zoo_get in the watch event handler are done after
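The ordering constraint above (cache invalidation strictly before any client handler) can be sketched in miniature. All names here are illustrative mock-ups, not the proposal's or the C client's actual dispatch code:

```c
#include <stddef.h>

/* Watch handlers take the triggering znode path plus an opaque context. */
typedef void (*watch_fn)(const char *path, void *ctx);

/* The cache handler must run before the client handler on every event. */
struct watch_chain {
    watch_fn cache_handler;
    watch_fn client_handler;
};

static int invalidated;       /* stands in for the on-disk invalidation flag */
static int client_observed;   /* what the client handler saw when it ran */

static void cache_watch(const char *path, void *ctx) {
    (void)path; (void)ctx;
    invalidated = 1;          /* mark the entry stale; a real implementation
                                 would also re-register the watch here */
}

static void client_watch(const char *path, void *ctx) {
    (void)path; (void)ctx;
    client_observed = invalidated;  /* a zoo_get here cannot see stale data */
}

static void dispatch_event(struct watch_chain *c, const char *path) {
    c->cache_handler(path, NULL);        /* invalidation completes first... */
    if (c->client_handler)
        c->client_handler(path, NULL);   /* ...then the client is notified */
}
```

In the real client the two handlers would run on the event thread with cache writes queued to a separate thread, as the proposal describes; the point of the sketch is only the strict ordering.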
[jira] [Commented] (ZOOKEEPER-1197) Incorrect socket handling of 4 letter words for NIO
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121451#comment-13121451 ] Flavio Junqueira commented on ZOOKEEPER-1197: - Camille, is setting the socket to linger an option here? Incorrect socket handling of 4 letter words for NIO --- Key: ZOOKEEPER-1197 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1197 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.3.3, 3.4.0 Reporter: Camille Fournier Assignee: Camille Fournier Priority: Blocker Fix For: 3.3.4, 3.4.0, 3.5.0 Attachments: ZOOKEEPER-1197.patch When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key. Instead of cancelling the selector key for 4-letter-words, we should flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do. See ZOOKEEPER-737 for more details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
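For reference, the SO_LINGER option Flavio asks about makes close() wait until queued output is sent (or a timeout expires) instead of discarding unsent response data. A generic POSIX sketch, not ZooKeeper code (the NIO server would use the Java equivalent, Socket#setSoLinger(true, secs)):

```c
#include <string.h>
#include <sys/socket.h>

/* Enable SO_LINGER: close() blocks until the send buffer drains or
 * timeout_secs elapses, rather than dropping pending 4lw output. */
static int enable_linger(int fd, int timeout_secs) {
    struct linger lg;
    memset(&lg, 0, sizeof(lg));
    lg.l_onoff = 1;              /* linger on close */
    lg.l_linger = timeout_secs;  /* seconds to wait for unsent data */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

Whether lingering actually helps here depends on where the close originates; the committed fix took the flag-the-connection approach described above instead.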