tdunning commented on a change in pull request #730: Zookeeper-3188: Improve
resilience to network
URL: https://github.com/apache/zookeeper/pull/730#discussion_r270258630
##########
File path:
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java
##########
@@ -306,6 +280,81 @@ protected void connectToLeader(InetSocketAddress addr, String hostname)
leaderOs = BinaryOutputArchive.getArchive(bufferedOutput);
}
+ class LeaderConnector implements Runnable {
+
+ private AtomicReference<Socket> socket;
+ private InetSocketAddress address;
+ private CountDownLatch latch;
+
+ LeaderConnector(InetSocketAddress address, AtomicReference<Socket> socket, CountDownLatch latch) {
+ this.address = address;
+ this.socket = socket;
+ this.latch = latch;
+ }
+
+ @Override
+ public void run() {
+ try {
+ Thread.currentThread().setName("LeaderConnector-" + address);
+ Socket sock = connectToLeader();
+
+ if (sock != null && sock.isConnected() && !socket.compareAndSet(null, sock)) {
+ LOG.info("Connection to the leader is already established, close the redundant connection");
+ sock.close();
+ }
+
+ } catch (Exception e) {
+ LOG.error("Failed connect to {}", address, e);
+ } finally {
+ latch.countDown();
+ }
+ }
+
+ private Socket connectToLeader() throws IOException, X509Exception, InterruptedException {
+ Socket sock = createSocket();
+
+ int initLimitTime = self.tickTime * self.initLimit;
+ int remainingInitLimitTime;
+ long startNanoTime = nanoTime();
+
+ for (int tries = 0; tries < 5 && socket.get() == null; tries++) {
Review comment:
The current design, in which all retries are hidden inside the connection
logic, was specifically intended to make multipath networking transparent to
the higher-level retry logic. The rationale was that this would substantially
reduce the probability of introducing bugs into commonly used logic (i.e.
connect / reconnect) for the sake of optimizing a rarely used capability
(reconnect in the presence of redundant network options).
We felt that the benefit of exposing this to the higher-level logic (very
small, rarely applicable) was enormously outweighed by the risk of touching
that logic (small, but in a commonly used path).
I don't see any reason to rethink that. Do you have something in mind?
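
To make the shape of that design concrete, here is a minimal sketch of
retries hidden behind a single connect call. It is not the code in this PR:
`HiddenRetryDemo`, `Dialer`, and `connectAny` are hypothetical names, and the
dialer is a stand-in so the example runs without a network. The only things
borrowed from the diff are the `AtomicReference` "first winner" check and the
`CountDownLatch`.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class HiddenRetryDemo {

    /** Hypothetical stand-in for opening one socket; not a ZooKeeper API. */
    interface Dialer<C> {
        C dial(String address) throws IOException;
    }

    /**
     * Tries every address in parallel, retrying each up to maxTries times.
     * Returns the first connection that wins, or null if every path fails.
     * The caller sees one call and one result, never the per-path retries.
     */
    static <C> C connectAny(List<String> addresses, Dialer<C> dialer, int maxTries) {
        AtomicReference<C> winner = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(addresses.size());
        for (String address : addresses) {
            Thread t = new Thread(() -> {
                try {
                    // Stop retrying as soon as any path has produced a connection.
                    for (int tries = 0; tries < maxTries && winner.get() == null; tries++) {
                        try {
                            C conn = dialer.dial(address);
                            // First success wins; real code would close a losing socket.
                            winner.compareAndSet(null, conn);
                            return;
                        } catch (IOException e) {
                            // Swallowed on purpose: per-path failures stay inside this layer.
                        }
                    }
                } finally {
                    done.countDown();
                }
            }, "connector-" + address);
            t.setDaemon(true);
            t.start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return winner.get();
    }

    public static void main(String[] args) {
        // Simulated network: any address starting with "bad" is unreachable.
        Dialer<String> dialer = address -> {
            if (address.startsWith("bad")) {
                throw new IOException("unreachable: " + address);
            }
            return "conn:" + address;
        };
        // Prints "conn:good-path:2888"; the failing path is invisible to the caller.
        System.out.println(connectAny(List.of("bad-path:2888", "good-path:2888"), dialer, 5));
    }
}
```

The higher-level connect / reconnect code would only ever see the return
value of `connectAny`, which is the transparency described above.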