[ https://issues.apache.org/jira/browse/MESOS-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752939#comment-13752939 ]
Benjamin Mahler commented on MESOS-670:
---------------------------------------
Confirmed this is the commit:
{noformat}
commit eb1cd4a7c0ad4310f090d4f0643cf4059ac5246b
Author: Benjamin Mahler <[email protected]>
Date: Mon Aug 26 18:26:07 2013 -0700
Upgraded ZooKeeper from 3.3.4 to 3.3.6.
From: Vinson Lee <[email protected]>
Review: https://reviews.apache.org/r/13598
{noformat}
A 'make clean' appears to be required for the build to pick up the ZK change
and expose the test failures, which is why I didn't catch this when running
'make check' prior to committing. That is likely also why Vinson didn't notice.
I've looked through the ZK code, and it appears to be broken in 3.3.6 if one
makes the following sequence of calls:
ZooKeeperServer.startup() -> ZooKeeperServer.shutdown() -> ZooKeeperServer.startup()
In 3.3.6:
{code}
372     public void startup() {
373         if (sessionTracker == null) {
374             createSessionTracker(); // Creates a new session tracker.
375         }
376         startSessionTracker(); // Calls Thread.start() on the session tracker, unconditionally! This throws a java.lang.IllegalThreadStateException if the tracker thread was already started.
377         setupRequestProcessors();
378
379         registerJMX();
380
381         synchronized (this) {
382             running = true;
383             notifyAll();
384         }
385     }
{code}
In 3.3.4:
{code}
370     public void startup() {
371         createSessionTracker(); // Creates a new session tracker and starts it.
372         setupRequestProcessors();
373
374         registerJMX();
375
376         synchronized (this) {
377             running = true;
378             notifyAll();
379         }
380     }
{code}
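The failure mode can be reproduced outside ZooKeeper. The following is only a minimal sketch, not ZooKeeper code: the class names (RestartSketch, Tracker, Server) are hypothetical, and it assumes that shutdown() stops the tracker thread but leaves the reference non-null, which is what the stack trace suggests but which I have not confirmed against the 3.3.6 shutdown path. It mirrors the 3.3.6 startup() shape: create the tracker only if it is null, then call Thread.start() unconditionally.

```java
// Hypothetical stand-ins for ZooKeeperServer and its session tracker.
final class RestartSketch {

    // A thread that idles until it is told to stop.
    static final class Tracker extends Thread {
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return; // Interrupted by shutdown(); exit the thread.
                }
            }
        }

        void shutdown() {
            running = false;
            interrupt();
        }
    }

    // Mirrors the 3.3.6 startup() shape shown above.
    static final class Server {
        private Tracker tracker;

        void startup() {
            if (tracker == null) {
                tracker = new Tracker();
            }
            tracker.start(); // Unconditional, as in 3.3.6.
        }

        // Assumption: the tracker is stopped but the field is not reset to null.
        void shutdown() {
            if (tracker == null) {
                return;
            }
            tracker.shutdown();
            try {
                tracker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Returns true if startup() -> shutdown() -> startup() throws
    // IllegalThreadStateException on the second startup().
    static boolean secondStartupThrows() {
        Server server = new Server();
        server.startup();
        server.shutdown();
        try {
            server.startup();
        } catch (IllegalThreadStateException e) {
            return true;
        }
        server.shutdown(); // Clean up if the restart unexpectedly succeeded.
        return false;
    }

    public static void main(String[] args) {
        System.out.println("second startup() throws: " + secondStartupThrows());
    }
}
```

A Java thread can never be started more than once, even after it has terminated, so under the stated assumption the second startup() fails. The 3.3.4 shape avoids this because it creates and starts a fresh tracker on every startup().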
It's difficult to tell from the documentation whether we're using the API
correctly or whether this is a bug accidentally introduced when they pulled
the change into the 3.3.x branch.
This is the ZK commit:
{noformat}
➜ zookeeper-3.3.6 svn log --revision 1239983
------------------------------------------------------------------------
r1239983 | mahadev | 2012-02-02 18:41:08 -0800 (Thu, 02 Feb 2012) | 1 line
ZOOKEEPER-1367. Data inconsistencies and unexpired ephemeral nodes after
cluster restart. (Benjamin Reed via mahadev)
------------------------------------------------------------------------
{noformat}
See ZOOKEEPER-1367.
> GroupTest.GroupJoinWithDisconnect fails on master.
> --------------------------------------------------
>
> Key: MESOS-670
> URL: https://issues.apache.org/jira/browse/MESOS-670
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Benjamin Mahler
> Assignee: Benjamin Mahler
>
> [ RUN ] GroupTest.GroupJoinWithDisconnect
> 2013-08-28 14:15:21,348:40067(0x11c447000):ZOO_ERROR@handle_socket_error_msg@1579: Socket [127.0.0.1:64547] zk retcode=-4, errno=61(Connection refused): server refused to accept the client
> Exception in thread "AWT-AppKit" java.lang.IllegalThreadStateException
>     at java.lang.Thread.start(Thread.java:656)
>     at org.apache.zookeeper.server.ZooKeeperServer.startSessionTracker(ZooKeeperServer.java:402)
>     at org.apache.zookeeper.server.ZooKeeperServer.startup(ZooKeeperServer.java:376)
>     at org.apache.zookeeper.server.NIOServerCnxn$Factory.startup(NIOServerCnxn.java:161)
> Caught a JVM exception, not propagating
> I committed this patch from Vinson Lee:
> https://reviews.apache.org/r/13598/
> It appears this has possibly affected the ZK tests.
> There appears to be a code change between 3.3.4 and 3.3.6 relevant to this
> issue:
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.zookeeper/zookeeper/3.3.4/org/apache/zookeeper/server/ZooKeeperServer.java#370
> vs
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.zookeeper/zookeeper/3.3.6/org/apache/zookeeper/server/ZooKeeperServer.java#372
> I'll dig a little further; hopefully I can avoid needing to revert this
> commit.