[
https://issues.apache.org/jira/browse/ZOOKEEPER-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185285#comment-15185285
]
Steve Rowe commented on ZOOKEEPER-2383:
---------------------------------------
This program triggers the problem for me roughly 10% of the time with ZK 3.4.8.
Note that if I don't use a separate thread to start ZooKeeperServer, the
connection always comes in after the server has had a chance to register itself
with JMX, so the failure never occurs
(imports omitted - attaching full file here in a sec):
{code:java|title=TestZkStandaloneJMXRegistrationRaceConcurrent.java}
public class TestZkStandaloneJMXRegistrationRaceConcurrent {
  public static void main(String[] args) throws IOException,
      InterruptedException, KeeperException {
    // Starts the standalone server on its own thread so that startup can
    // overlap with the client connection made from main().
    class ServerThread extends Thread {
      private ZooKeeperServer server;
      private ServerCnxnFactory cnxnFactory;
      @Override public void run() {
        try {
          File tempDir =
              Files.createTempDirectory(FileSystems.getDefault().getPath("."),"test").toFile();
          FileTxnSnapLog txnSnapLog = new FileTxnSnapLog(tempDir, tempDir);
          server = new ZooKeeperServer
              (txnSnapLog, 2000, 2000, 4000, null, new ZKDatabase(txnSnapLog));
          cnxnFactory = ServerCnxnFactory.createFactory(55555, -1);
          // Calls ZooKeeperServer.startup(), which registers the server with JMX.
          cnxnFactory.startup(server);
        } catch (IOException e) {
          throw new RuntimeException(e);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
      public void shutdown() throws IOException, InterruptedException {
        cnxnFactory.shutdown();
        cnxnFactory.join();
        server.shutdown();
      }
    }
    ServerThread serverThread = new ServerThread();
    serverThread.setDaemon(true);
    serverThread.start();
    // Short head start only; the client may still connect before JMX registration.
    Thread.sleep(3);
    ZooKeeper zk = new ZooKeeper("127.0.0.1:55555", 45000, new Watcher() {
      public void process(WatchedEvent event) {} });
    zk.create("/testing123", new byte[]{}, Ids.OPEN_ACL_UNSAFE,
        CreateMode.EPHEMERAL);
    serverThread.shutdown();
    serverThread.join();
  }
}
{code}
Here's an excerpt from a log exhibiting the failure - I'll also attach the full
log (I've added some logging to ZK 3.4.8 - I'll attach a patch showing those
additions here in a minute):
{noformat}
2016-03-08 11:32:08,414 [myid:] - WARN [SyncThread:0:MBeanRegistry@100] - bean 'Connections/127.0.0.1/0x153571244a70000' with parent 'StandaloneServer_port55555' has null path.
java.lang.Throwable:
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:98)
    at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
    at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,414 [myid:] - WARN [Thread-0:MBeanRegistry@118] - registered bean 'StandaloneServer_port55555' with parent 'null' at path '/'
java.lang.Throwable:
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
    at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:385)
    at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:418)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:119)
    at TestZkStandaloneJMXRegistrationRaceConcurrent$1ServerThread.run(TestZkStandaloneJMXRegistrationRaceConcurrent.java:29)
2016-03-08 11:32:08,415 [myid:] - ERROR [SyncThread:0:ZooKeeperCriticalThread@49] - Severe unrecoverable error, from thread : SyncThread:0
java.lang.AssertionError
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:104)
    at org.apache.zookeeper.server.ServerCnxnFactory.registerConnection(ServerCnxnFactory.java:147)
    at org.apache.zookeeper.server.ZooKeeperServer.finishSessionInit(ZooKeeperServer.java:613)
    at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:181)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:200)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
2016-03-08 11:32:08,416 [myid:] - WARN [Thread-0:MBeanRegistry@118] - registered bean 'InMemoryDataTree' with parent 'StandaloneServer_port55555' at path '/StandaloneServer_port55555'
java.lang.Throwable:
    at org.apache.zookeeper.jmx.MBeanRegistry.register(MBeanRegistry.java:116)
    at org.apache.zookeeper.server.ZooKeeperServer.registerJMX(ZooKeeperServer.java:389)
    at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:418)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:119)
    at TestZkStandaloneJMXRegistrationRaceConcurrent$1ServerThread.run(TestZkStandaloneJMXRegistrationRaceConcurrent.java:29)
{noformat}
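For anyone who wants to see the ordering problem without a ZooKeeper checkout, here is a minimal, self-contained sketch of it. It is a toy model under stated assumptions, not ZooKeeper's code: `ToyRegistry`, `childFirstFails`, and `gatedChildPath` are hypothetical names, and an `IllegalStateException` stands in for the assertion seen in the log. It shows that registering a child whose parent has no path yet fails, and that gating the "connection" on a latch released after the parent registers avoids the failure.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Toy illustration of the registration ordering; NOT ZooKeeper's MBeanRegistry. */
final class RegistrationOrderSketch {
    static final String SERVER = "StandaloneServer_port55555";
    static final String CONN = "Connections/127.0.0.1/0x1"; // made-up session id

    /** Hypothetical registry mapping each bean name to the path it was registered at. */
    static final class ToyRegistry {
        private final Map<String, String> beanToPath = new ConcurrentHashMap<>();

        /** Registers a bean; throws if a non-null parent has not been registered yet. */
        void register(String bean, String parent) {
            String path = "/";
            if (parent != null) {
                String parentPath = beanToPath.get(parent);
                if (parentPath == null) {
                    // Stand-in for the failing check in the real registry.
                    throw new IllegalStateException("parent '" + parent + "' has null path");
                }
                path = parentPath.equals("/") ? "/" + parent : parentPath + "/" + parent;
            }
            beanToPath.put(bean, path);
        }

        String pathOf(String bean) {
            return beanToPath.get(bean);
        }
    }

    /** Registers the connection before the server, as in the failing interleaving. */
    static boolean childFirstFails() {
        ToyRegistry registry = new ToyRegistry();
        try {
            registry.register(CONN, SERVER);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Holds the connection back until the server thread has registered itself. */
    static String gatedChildPath() {
        ToyRegistry registry = new ToyRegistry();
        CountDownLatch serverRegistered = new CountDownLatch(1);
        Thread server = new Thread(() -> {
            registry.register(SERVER, null);
            serverRegistered.countDown();
        });
        server.start();
        try {
            serverRegistered.await();
            registry.register(CONN, SERVER);
            server.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.pathOf(CONN);
    }

    public static void main(String[] args) {
        System.out.println("child-first fails: " + childFirstFails());
        System.out.println("gated child path: " + gatedChildPath());
    }
}
```

The latch is only one way to force the ordering in a test harness; it says nothing about how the server itself should be fixed.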
> Startup race in ZooKeeperServer
> -------------------------------
>
> Key: ZOOKEEPER-2383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2383
> Project: ZooKeeper
> Issue Type: Bug
> Components: jmx, server
> Affects Versions: 3.4.8
> Reporter: Steve Rowe
>
> In attempting to upgrade Solr's ZooKeeper dependency from 3.4.6 to 3.4.8
> (SOLR-8724) I ran into test failures where attempts to create a node in a
> newly started standalone ZooKeeperServer were failing because of an assertion
> in MBeanRegistry.
> ZooKeeperServer.startup() first sets up its request processor chain then
> registers itself in JMX, but if a connection comes in before the server's JMX
> registration happens, registration of the connection will fail because it
> trips the assertion that (effectively) its parent (the server) has already
> registered itself.
> {code:java|title=ZooKeeperServer.java}
>     public synchronized void startup() {
>         if (sessionTracker == null) {
>             createSessionTracker();
>         }
>         startSessionTracker();
>         setupRequestProcessors();
>         registerJMX();
>         state = State.RUNNING;
>         notifyAll();
>     }
> {code}
> {code:java|title=MBeanRegistry.java}
>     public void register(ZKMBeanInfo bean, ZKMBeanInfo parent)
>         throws JMException
>     {
>         assert bean != null;
>         String path = null;
>         if (parent != null) {
>             path = mapBean2Path.get(parent);
>             assert path != null;
>         }
> {code}
> This problem appears to be new with ZK 3.4.8 - AFAIK Solr never had this
> issue with ZK 3.4.6.