Benno Evers created MESOS-9467:
----------------------------------

             Summary: 
ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
 is flaky
                 Key: MESOS-9467
                 URL: https://issues.apache.org/jira/browse/MESOS-9467
             Project: Mesos
          Issue Type: Bug
         Environment: Mac OSX with ssl enabled
            Reporter: Benno Evers


The following error was observed in an internal CI run:
{noformat}
../../src/tests/master_contender_detector_tests.cpp:872: Failure
Failed to wait 15secs for detected
{noformat}

Full log:
{noformat}
[ RUN      ] 
ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
I1210 16:18:13.068011 338650560 zookeeper_test_server.cpp:156] Started 
ZooKeeperTestServer on port 54990
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-10 16:18:13,068:28813(0x70000e2f6000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:54990 sessionTimeout=10000 
watcher=0x116d03e00 sessionId=0 sessionPasswd=<null> context=0x7fd3883958d0 
flags=0
2018-12-10 16:18:13,068:28813(0x70000ed1d000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
I1210 16:18:13.069262 236376064 contender.cpp:152] Joining the ZK group
2018-12-10 16:18:13,070:28813(0x70000ed1d000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:54990], 
sessionId=0x1679aa0ddc90000, negotiated timeout=10000
I1210 16:18:13.070789 239058944 group.cpp:341] Group process 
(zookeeper-group(28)@10.0.49.4:54069) connected to ZooKeeper
I1210 16:18:13.070853 239058944 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (1, 0, 0)
I1210 16:18:13.070868 239058944 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
I1210 16:18:13.073835 235839488 contender.cpp:268] New candidate (id='0') has 
entered the contest for leadership
I1210 16:18:13.074319 237985792 detector.cpp:152] Detected a new leader: 
(id='0')
I1210 16:18:13.074406 237449216 group.cpp:700] Trying to get 
'/mesos/json.info_0000000000' in ZooKeeper
I1210 16:18:13.075139 239058944 zookeeper.cpp:262] A new leading master 
(UPID=@0.152.150.128:10000) is detected
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-10 16:18:13,075:28813(0x70000e273000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:54990 sessionTimeout=10000 
watcher=0x116d03e00 sessionId=0 sessionPasswd=<null> context=0x7fd3886b40e0 
flags=0
2018-12-10 16:18:13,075:28813(0x70000f944000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
I1210 16:18:13.076236 238522368 contender.cpp:152] Joining the ZK group
2018-12-10 16:18:13,077:28813(0x70000f944000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:54990], 
sessionId=0x1679aa0ddc90001, negotiated timeout=10000
I1210 16:18:13.077278 239058944 group.cpp:341] Group process 
(zookeeper-group(29)@10.0.49.4:54069) connected to ZooKeeper
I1210 16:18:13.077311 239058944 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (1, 0, 0)
I1210 16:18:13.077327 239058944 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
I1210 16:18:13.079026 236912640 contender.cpp:268] New candidate (id='1') has 
entered the contest for leadership
I1210 16:18:13.079605 237449216 detector.cpp:152] Detected a new leader: 
(id='0')
I1210 16:18:13.079732 236912640 group.cpp:700] Trying to get 
'/mesos/json.info_0000000000' in ZooKeeper
I1210 16:18:13.080291 237449216 zookeeper.cpp:262] A new leading master 
(UPID=@0.152.150.128:10000) is detected
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-10 16:18:13,080:28813(0x70000e3fc000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:54990 sessionTimeout=10000 
watcher=0x116d03e00 sessionId=0 sessionPasswd=<null> context=0x7fd388623260 
flags=0
2018-12-10 16:18:13,081:28813(0x70000fa4a000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
2018-12-10 16:18:13,082:28813(0x70000fa4a000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:54990], 
sessionId=0x1679aa0ddc90002, negotiated timeout=10000
I1210 16:18:13.082370 239058944 group.cpp:341] Group process 
(zookeeper-group(30)@10.0.49.4:54069) connected to ZooKeeper
I1210 16:18:13.082406 239058944 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (0, 0, 0)
I1210 16:18:13.082422 239058944 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
I1210 16:18:13.083561 235302912 detector.cpp:152] Detected a new leader: 
(id='0')
I1210 16:18:13.083673 237985792 group.cpp:700] Trying to get 
'/mesos/json.info_0000000000' in ZooKeeper
I1210 16:18:13.084331 235302912 zookeeper.cpp:262] A new leading master 
(UPID=@0.152.150.128:10000) is detected
2018-12-10 
16:18:13,085:28813(0x70000fa4a000):ZOO_ERROR@handle_socket_error_msg@1782: 
Socket [127.0.0.1:54990] zk retcode=-4, errno=64(Host is down): failed while 
receiving a server response
I1210 16:18:13.085988 236376064 group.cpp:452] Lost connection to ZooKeeper, 
attempting to reconnect ...
2018-12-10 
16:18:13,086:28813(0x70000ed1d000):ZOO_ERROR@handle_socket_error_msg@1782: 
Socket [127.0.0.1:54990] zk retcode=-4, errno=64(Host is down): failed while 
receiving a server response
I1210 16:18:13.086808 236912640 group.cpp:452] Lost connection to ZooKeeper, 
attempting to reconnect ...
I1210 16:18:13.087200 236376064 detector.cpp:140] The current leader (id=0) is 
lost
I1210 16:18:13.087239 236376064 detector.cpp:152] Detected a new leader: 
(id='1')
I1210 16:18:13.087339 237985792 group.cpp:700] Trying to get 
'/mesos/json.info_0000000001' in ZooKeeper
I1210 16:18:13.087929 236376064 zookeeper.cpp:262] A new leading master 
(UPID=@0.152.150.129:10001) is detected
W1210 16:18:23.086897 239058944 group.cpp:495] Timed out waiting to connect to 
ZooKeeper. Forcing ZooKeeper session (sessionId=1679aa0ddc90002) expiration
I1210 16:18:23.087049 239058944 group.cpp:511] ZooKeeper session expired
I1210 16:18:23.087095 235302912 detector.cpp:140] The current leader (id=0) is 
lost
I1210 16:18:23.087132 235302912 detector.cpp:152] Detected a new leader: None
2018-12-10 16:18:23,087:28813(0x70000e3fc000):ZOO_INFO@zookeeper_close@2579: 
Freeing zookeeper resources for sessionId=0x1679aa0ddc90002

2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-10 16:18:23,087:28813(0x70000e067000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:54990 sessionTimeout=10000 
watcher=0x116d03e00 sessionId=0 sessionPasswd=<null> context=0x7fd388623260 
flags=0
2018-12-10 16:18:23,087:28813(0x70000f4b5000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
W1210 16:18:23.087962 235302912 group.cpp:495] Timed out waiting to connect to 
ZooKeeper. Forcing ZooKeeper session (sessionId=1679aa0ddc90000) expiration
I1210 16:18:23.088090 235302912 group.cpp:511] ZooKeeper session expired
I1210 16:18:23.088133 237449216 detector.cpp:140] The current leader (id=0) is 
lost
I1210 16:18:23.088150 237449216 detector.cpp:152] Detected a new leader: None
I1210 16:18:23.088217 237985792 contender.cpp:217] Membership cancelled: 0
2018-12-10 16:18:23,088:28813(0x70000e067000):ZOO_INFO@zookeeper_close@2579: 
Freeing zookeeper resources for sessionId=0x1679aa0ddc90000

2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-10 16:18:23,088:28813(0x70000e16d000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:54990 sessionTimeout=10000 
watcher=0x116d03e00 sessionId=0 sessionPasswd=<null> context=0x7fd3885d9a10 
flags=0
2018-12-10 16:18:23,088:28813(0x70000ed1d000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
2018-12-10 16:18:23,165:28813(0x70000f944000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 6746ms
2018-12-10 
16:18:23,166:28813(0x70000f944000):ZOO_ERROR@handle_socket_error_msg@1702: 
Socket [127.0.0.1:54990] zk retcode=-7, errno=60(Operation timed out): 
connection to 127.0.0.1:54990 timed out (exceeded timeout by 3412ms)
2018-12-10 16:18:23,166:28813(0x70000f944000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 6746ms
I1210 16:18:23.166213 236376064 group.cpp:452] Lost connection to ZooKeeper, 
attempting to reconnect ...
2018-12-10 16:18:25,268:28813(0x70000f4b5000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:54990], 
sessionId=0x1679aa0ddc90003, negotiated timeout=10000
I1210 16:18:25.268710 237449216 group.cpp:341] Group process 
(zookeeper-group(30)@10.0.49.4:54069) connected to ZooKeeper
I1210 16:18:25.268859 237449216 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (0, 0, 0)
I1210 16:18:25.268879 237449216 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
2018-12-10 16:18:25,269:28813(0x70000ed1d000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:54990], 
sessionId=0x1679aa0ddc90004, negotiated timeout=10000
I1210 16:18:25.269650 239058944 group.cpp:341] Group process 
(zookeeper-group(28)@10.0.49.4:54069) connected to ZooKeeper
I1210 16:18:25.269680 239058944 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (0, 0, 0)
I1210 16:18:25.269692 239058944 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
2018-12-10 16:18:26,509:28813(0x70000f944000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 10088ms
2018-12-10 16:18:26,509:28813(0x70000f944000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
2018-12-10 
16:18:26,510:28813(0x70000f944000):ZOO_ERROR@handle_socket_error_msg@1800: 
Socket [127.0.0.1:54990] zk retcode=-112, errno=70(Stale NFS file handle): 
sessionId=0x1679aa0ddc90001 has expired.
I1210 16:18:26.510257 237985792 group.cpp:511] ZooKeeper session expired
I1210 16:18:26.510342 236376064 detector.cpp:140] The current leader (id=1) is 
lost
I1210 16:18:26.510363 236376064 detector.cpp:152] Detected a new leader: None
I1210 16:18:26.510382 235839488 contender.cpp:217] Membership cancelled: 1
2018-12-10 16:18:26,510:28813(0x70000e2f6000):ZOO_INFO@zookeeper_close@2579: 
Freeing zookeeper resources for sessionId=0x1679aa0ddc90001

2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@765: Client 
environment:os.arch=18.2.0
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 
2018; root:xnu-4903.231.4~2/RELEASE_X86_64
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-12-10 16:18:26,510:28813(0x70000e3fc000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:54990 sessionTimeout=10000 
watcher=0x116d03e00 sessionId=0 sessionPasswd=<null> context=0x7fd3885d9bc0 
flags=0
2018-12-10 16:18:26,511:28813(0x70000f944000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:54990]
2018-12-10 16:18:26,512:28813(0x70000f944000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:54990], 
sessionId=0x1679aa0ddc90005, negotiated timeout=10000
I1210 16:18:26.512154 239058944 group.cpp:341] Group process 
(zookeeper-group(29)@10.0.49.4:54069) connected to ZooKeeper
I1210 16:18:26.512214 239058944 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (0, 0, 0)
I1210 16:18:26.512230 239058944 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
2018-12-10 16:18:29,859:28813(0x70000f944000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 13ms
../../src/tests/master_contender_detector_tests.cpp:872: Failure
Failed to wait 15secs for detected
2018-12-10 16:18:38,091:28813(0x1142f65c0):ZOO_INFO@zookeeper_close@2562: 
Closing zookeeper sessionId=0x1679aa0ddc90003 to [127.0.0.1:54990]

I1210 16:18:38.091266 338650560 contender.cpp:207] Now cancelling the 
membership: 1
2018-12-10 16:18:38,091:28813(0x1142f65c0):ZOO_INFO@zookeeper_close@2562: 
Closing zookeeper sessionId=0x1679aa0ddc90005 to [127.0.0.1:54990]

I1210 16:18:38.091816 338650560 contender.cpp:207] Now cancelling the 
membership: 0
2018-12-10 16:18:38,092:28813(0x1142f65c0):ZOO_INFO@zookeeper_close@2562: 
Closing zookeeper sessionId=0x1679aa0ddc90004 to [127.0.0.1:54990]

I1210 16:18:38.092182 338650560 zookeeper_test_server.cpp:116] Shutting down 
ZooKeeperTestServer on port 54990
[  FAILED  ] 
ZooKeeperMasterContenderDetectorTest.MasterDetectorExpireSlaveZKSessionNewMaster
 (25032 ms)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to