[ https://issues.apache.org/jira/browse/ZOOKEEPER-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Shraer updated ZOOKEEPER-1355:
----------------------------------------
    Attachment: ZOOKEEPER-1355-ver6.patch

Hi, thanks for the comments! I removed getCurrentHost, added a JavaDoc for updateServerList, and added a paragraph to the documentation. I can do the C client changes (although, after looking at the code, this scares me a bit :) ) sometime next week or the following one. I would also prefer that to be in a separate Jira.

To clarify the rules some more, let me give you a few examples (this also appears, at different levels of detail, in the javadoc and documentation I added to the code).

First, if the host to which the client is currently connected is not in the new list, updateServerList will always cause the connection to be dropped. Otherwise, the decision depends on whether the number of servers has increased or decreased, and by how much.

Suppose the previous connection string contained 3 hosts and the new list contains these 3 hosts plus 2 more. Then, in expectation, 40% of the clients connected to each of the 3 hosts will move to one of the new hosts in order to balance the load. The algorithm causes the client to drop its connection to its current host with probability 0.4 (= 1 - 3/5, rule 1) and, in that case, to connect to one of the 2 new hosts, chosen at random.

Another example: suppose we have 5 hosts and update the list to remove 2 of them. The clients connected to the 3 remaining hosts stay connected (rule 3), whereas all clients connected to the 2 removed hosts need to move to one of the 3 remaining hosts, chosen at random. In this case the formula in rule 4 simply gives 1 (3(5-3)/(3*2)) since no new servers were added, so clients connected to the removed hosts have no choice but to connect to one of the 3 old servers. These rules also take into account the case where servers are both added and removed at the same time, which is why rule 4 doesn't always give probability 1.
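
The two examples above can be checked with a short sketch. It is written in Python for brevity and is not the code from the patch; the function name `drop_probability` and the host names are made up for illustration. It covers only the cases described in this message (current host removed, hosts only added, hosts only removed), and the general `1 - old/new` expression is an assumption extrapolated from the 3-to-5 example. The mixed add-and-remove case handled by rule 4 is deliberately left out.

```python
def drop_probability(old_hosts, new_hosts, current_host):
    """Probability that a client drops its current connection.

    Illustrative only: handles the cases spelled out in the message,
    not the mixed add-and-remove case covered by rule 4.
    """
    # The current host was removed from the list: always reconnect.
    if current_host not in new_hosts:
        return 1.0
    # The list grew: move with probability 1 - old/new (assumed general
    # form of the "1 - 3/5" example, rule 1).
    if len(new_hosts) > len(old_hosts):
        return 1.0 - len(old_hosts) / len(new_hosts)
    # The list shrank or kept its size and the current host survived:
    # stay connected (rule 3).
    return 0.0


three = ["a", "b", "c"]
five = ["a", "b", "c", "d", "e"]

p_grow = drop_probability(three, five, "a")        # 3 hosts -> 5 hosts
p_stay = drop_probability(five, three, "a")        # 5 -> 3, host kept
p_removed = drop_probability(five, three, "d")     # 5 -> 3, host removed
```

With these inputs `p_grow` is 0.4 up to floating-point rounding, `p_stay` is 0.0 and `p_removed` is 1.0, matching the two examples.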

If the connection is dropped, the client moves to a special mode in which it chooses a new server to connect to using the probabilistic algorithm, and not just round robin. In the first example, each client decides to disconnect from its current host with probability 0.4, but once the decision is made, it will try to connect to a random NEW server, and only if it cannot connect to any of the new servers will it try to connect to the old ones. After finding a server, or trying all servers in the new list and failing to connect, the client moves back to the normal mode of operation, where it will pick an arbitrary server from the connectString and attempt to connect to it.

> Add zk.updateServerList(newServerList)
> ---------------------------------------
>
>                 Key: ZOOKEEPER-1355
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1355
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: java client
>            Reporter: Alexander Shraer
>            Assignee: Alexander Shraer
>             Fix For: 3.5.0
>
>         Attachments: ZOOKEEPER-1355-ver2.patch, ZOOKEEPER-1355-ver4.patch,
> ZOOKEEPER-1355-ver5.patch, ZOOKEEPER-1355-ver6.patch,
> ZOOKEEPER=1355-ver3.patch, ZOOOKEEPER-1355-test.patch,
> ZOOOKEEPER-1355-ver1.patch, ZOOOKEEPER-1355.patch,
> loadbalancing-more-details.pdf, loadbalancing.pdf
>
>
> When the set of servers changes, we would like to update the server list
> stored by clients without restarting the clients.
> Moreover, assuming that the number of clients per server is the same (in
> expectation) in the old configuration (as guaranteed by the current list
> shuffling for example), we would like to re-balance client connections across
> the new set of servers in a way that a) the number of clients per server is
> the same for all servers (in expectation) and b) there is no
> excessive/unnecessary client migration.
> It is simple to achieve (a) without (b) - just re-shuffle the new list of
> servers at every client. But this would create unnecessary migration, which
> we'd like to avoid.
> We propose a simple probabilistic migration scheme that achieves (a) and (b)
> - each client locally decides whether and where to migrate when the list of
> servers changes. The attached document describes the scheme and shows an
> evaluation of it in ZooKeeper. We also implemented re-balancing through a
> consistent-hashing scheme and show a comparison. We derived the probabilistic
> migration rules from a simple formula that we can also provide, if someone's
> interested in the proof.
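
As a rough check of goals (a) and (b) in the quoted description, the sketch below simulates the 3-to-5 growth example and compares it with re-shuffling the list at every client. It is a toy Python model, not the evaluation from the attached document: the helper names, the host names, the client count and the seed are all assumptions, and connection failures are ignored.

```python
import random
from collections import Counter


def rebalance(assignment, old_hosts, new_hosts, rng):
    """Add-only case: each client moves with probability 1 - old/new,
    and a moving client picks one of the newly added hosts at random."""
    added = [h for h in new_hosts if h not in old_hosts]
    p_drop = 1.0 - len(old_hosts) / len(new_hosts)
    result = []
    for host in assignment:
        if rng.random() < p_drop:
            result.append(rng.choice(added))
        else:
            result.append(host)
    return result


def reshuffle(assignment, new_hosts, rng):
    """Baseline: every client picks a host uniformly from the new list."""
    return [rng.choice(new_hosts) for _ in assignment]


def moved_fraction(before, after):
    """Fraction of clients whose host changed."""
    return sum(b != a for b, a in zip(before, after)) / len(before)


def simulate(num_clients=30000, seed=1):
    rng = random.Random(seed)
    old_hosts = ["a", "b", "c"]
    new_hosts = old_hosts + ["d", "e"]
    # Start from a perfectly balanced old configuration.
    before = [old_hosts[i % len(old_hosts)] for i in range(num_clients)]
    probabilistic = rebalance(before, old_hosts, new_hosts, rng)
    naive = reshuffle(before, new_hosts, rng)
    return (Counter(probabilistic),
            moved_fraction(before, probabilistic),
            moved_fraction(before, naive))


load, moved, moved_naive = simulate()
```

In this model each of the 5 hosts should end up with roughly a fifth of the clients under either strategy, but the probabilistic rule moves only about 40% of clients, whereas re-shuffling moves about 80% (a client keeps its host only when its uniform pick happens to land on it).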