Hello,
Before I open a new JIRA for this, I wanted to bring it up here first.
The issue I'm seeing is that read requests fail when connecting to a
read-only ensemble.
I wrote a test for this using TestingCluster, but it requires setting
iptables rules in order to mimic a real (remote) read-only ensemble. The
test does the following:
1) start a TestingCluster
2) stop 2 of the 3 nodes
3) the user has to run the iptables commands (printed in the logs)
4) a read request is issued every 3 seconds
Apart from the first one, the read requests block for approx. 2 minutes,
then they all throw an exception (ConnectionLoss).
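To put a number on how long each read blocks, one option is to run the call on a worker thread and wait for it with a timeout. The sketch below uses only java.util.concurrent; the class name `TimedCall` and the method `timeCall` are hypothetical helpers, not part of Curator. In the test, a call such as `() -> client.getData().forPath("/")` could be passed in.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not a Curator API): measures how long a blocking call takes.
final class TimedCall {

    /**
     * Runs the call on a daemon worker thread.
     * Returns the elapsed time in milliseconds, or -1 if the call did not
     * finish within timeoutMs. An exception thrown by the call itself is
     * rethrown wrapped in an ExecutionException.
     */
    static long timeCall(Callable<?> call, long timeoutMs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        long start = System.nanoTime();
        Future<?> future = executor.submit(call);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker if it is still blocked
            return -1;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A call that returns immediately.
        long fast = timeCall(() -> "ok", 1000);
        System.out.println("fast call completed: " + (fast >= 0));

        // A call that blocks longer than the timeout, standing in for a stuck read.
        long slow = timeCall(() -> {
            Thread.sleep(5000);
            return null;
        }, 100);
        System.out.println("slow call timed out: " + (slow == -1));
    }
}
```

This only bounds how long the caller waits; it does not change how Curator or ZooKeeper handle the underlying connection.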
I was wondering whether other users are using Curator with a read-only ensemble?
Thanks
Benjamin
2016-11-23 11:45:26 WARN TestCuratorRetry:53 - sudo /sbin/iptables -A INPUT -p tcp --destination-port 52485 -j DROP
2016-11-23 11:45:26 WARN TestCuratorRetry:53 - sudo /sbin/iptables -A INPUT -p tcp --destination-port 39258 -j DROP
2016-11-23 11:45:26 WARN TestCuratorRetry:27 - Waiting for iptables commands (30sec)
2016-11-23 11:45:56 WARN TestCuratorRetry:29 - Resuming...
2016-11-23 11:45:56 INFO o.a.c.f.i.CuratorFrameworkImpl:282 - Starting
2016-11-23 11:45:56 INFO o.a.c.f.i.CuratorFrameworkImpl:321 - Default schema
2016-11-23 11:45:56 WARN TestCuratorRetry:61 - Request 1
2016-11-23 11:45:56 INFO o.a.c.f.s.ConnectionStateManager:236 - State change: READ_ONLY
2016-11-23 11:45:56 WARN TestCuratorRetry:40 - Connection state changed: READ_ONLY
2016-11-23 11:45:56 WARN TestCuratorRetry:65 - Request 1 completed successfully.
2016-11-23 11:45:59 WARN TestCuratorRetry:61 - Request 2
2016-11-23 11:46:02 WARN TestCuratorRetry:61 - Request 3
2016-11-23 11:46:05 WARN TestCuratorRetry:61 - Request 4
(...)
2016-11-23 11:47:59 WARN TestCuratorRetry:61 - Request 42
2016-11-23 11:48:02 WARN TestCuratorRetry:61 - Request 43
2016-11-23 11:48:03 INFO o.a.c.f.s.ConnectionStateManager:236 - State change: SUSPENDED
2016-11-23 11:48:03 WARN TestCuratorRetry:40 - Connection state changed: SUSPENDED
2016-11-23 11:48:04 ERROR o.a.c.f.i.CuratorFrameworkImpl:653 - Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.5.2.20160915-unstable.jar:3.5.2-alpha-1753710]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:814) [curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:943) [curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:895) [curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:70) [curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:314) [curator-framework-3.2.1.jar:3.2.1]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
2016-11-23 11:48:05 ERROR TestCuratorRetry:67 - Request 34 failed
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.5.2.20160915-unstable.jar:3.5.2-alpha-1753710]
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.5.2.20160915-unstable.jar:3.5.2-alpha-1753710]
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) ~[zookeeper-3.5.2.20160915-unstable.jar:3.5.2-alpha-1753710]
    at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:313) ~[curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302) ~[curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:67) ~[curator-client-3.2.1.jar:?]
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100) ~[curator-client-3.2.1.jar:?]
    at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:299) ~[curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:290) ~[curator-framework-3.2.1.jar:3.2.1]
    at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34) ~[curator-framework-3.2.1.jar:3.2.1]
    at TestCuratorRetry$2.run(TestCuratorRetry.java:64) [bin/:?]
import java.io.IOException;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;
import org.apache.curator.retry.RetryNTimes;
import org.apache.curator.test.TestingCluster;
import org.apache.curator.test.TestingZooKeeperServer;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
public class TestCuratorRetry {
    static Logger logger;
    static CuratorFramework client;

    public static void main(String[] args) throws Exception {
        System.setProperty("log4j.configurationFile", "/usr/local/apps/log4j2-default-console.json");
        logger = LogManager.getLogger(TestCuratorRetry.class);
        // Enable ZooKeeper read-only mode before the test servers are created.
        System.setProperty("readonlymode.enabled", "true");

        TestingCluster cluster = new TestingCluster(3);
        cluster.start();
        Thread.sleep(3000);

        // Stop 2 of the 3 nodes; each call logs the iptables command for that node's port.
        stopServer(cluster.getServers().get(1));
        stopServer(cluster.getServers().get(2));

        // Give the user time to run the logged iptables commands by hand.
        logger.warn("Waiting for iptables commands (30sec)");
        Thread.sleep(30000);
        logger.warn("Resuming...");

        CuratorFrameworkFactory.Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
                .connectString(cluster.getConnectString())
                .sessionTimeoutMs(5000)
                .connectionTimeoutMs(5000)
                .retryPolicy(new RetryNTimes(1, 1000))
                .canBeReadOnly(true);
        client = curatorClientBuilder.build();
        client.getConnectionStateListenable().addListener(new ConnectionStateListener() {
            @Override
            public void stateChanged(CuratorFramework client, ConnectionState newState) {
                logger.warn("Connection state changed: " + newState.name());
            }
        });
        client.start();

        // Issue one read request every 3 seconds, each on its own thread.
        int i = 1;
        while (true) {
            request(i++);
            Thread.sleep(3000);
        }
    }

    // Logs the iptables command the user has to run for this server's port, then stops the server.
    private static void stopServer(TestingZooKeeperServer server) throws IOException {
        logger.warn("sudo /sbin/iptables -A INPUT -p tcp --destination-port "
                + server.getInstanceSpec().getPort() + " -j DROP");
        server.stop();
    }

    // Reads "/" in a separate thread so that a blocked request does not delay the next one.
    private static void request(final int i) {
        Thread t = new Thread() {
            @Override
            public void run() {
                logger.warn("Request " + i);
                try {
                    client.getData().forPath("/");
                    logger.warn("Request " + i + " completed successfully.");
                } catch (Throwable th) {
                    logger.error("Request " + i + " failed", th);
                }
            }
        };
        t.start();
    }
}
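
A note for anyone reproducing this: rules appended with `iptables -A` stay in place after the test exits, so they have to be deleted by hand. The sketch below builds the append command that the test logs together with the matching delete command (`-D` with the same rule specification). The class `IptablesRules` and its methods are hypothetical helpers, and the ports are the ones from the log above; a different run will use different ports.

```java
// Hypothetical helper: builds the iptables commands for a given server port.
final class IptablesRules {

    private static String rule(String action, int port) {
        return "sudo /sbin/iptables " + action
                + " INPUT -p tcp --destination-port " + port + " -j DROP";
    }

    /** Command that drops incoming TCP traffic to the port (same text the test logs). */
    static String block(int port) {
        return rule("-A", port);
    }

    /** Command that deletes the rule added by block(port). */
    static String unblock(int port) {
        return rule("-D", port);
    }

    public static void main(String[] args) {
        // Ports taken from the log output above; they are assigned per run.
        int[] ports = {52485, 39258};
        for (int port : ports) {
            System.out.println(unblock(port));
        }
    }
}
```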