Hello guys,

I'm using Aurora 0.9 and tried to upgrade to Mesos 0.24. Right after the upgrade I started to get these messages in the Aurora leader, and it crashed. Every new leader crashed in the same way. Mesos was upgraded in a rolling fashion, one node at a time, and it looked healthy. Marathon was even able to register itself and start jobs, but Aurora never managed to register. Here's a sample of the log I got on each leader; see the 'Failed to parse data' message at the end.
I saw this comment in the Mesos upgrade notes [1]: "Master now publishes its information in ZooKeeper in JSON (instead of protobuf). Make sure schedulers are linked against >= 0.23.0 libmesos before upgrading the master." So I was wondering whether this combination is supported or not.

2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@716: Client environment:host.name=11f23e5685b3
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@724: Client environment:os.arch=3.19.0-28-generic
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@725: Client environment:os.version=#30~14.04.1-Ubuntu SMP Tue Sep 1 09:32:55 UTC 2015
I0925 18:49:37.923105 871 sched.cpp:157] Version: 0.22.0
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@log_env@753: Client environment:user.dir=/
2015-09-25 18:49:37,923:1(0x7fd9dc6b4700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=192.168.255.31:2181,192.168.255.32:2181,192.168.255.33:2181,192.168.255.34:2181,192.168.255.35:2181 sessionTimeout=10000 watcher=0x7fd9e6d88cd0 sessionId=0 sessionPasswd=<null> context=0x7fd9a8000b70 flags=0
I0925 18:49:37.923 THREAD800 org.apache.aurora.scheduler.mesos.SchedulerDriverService.startUp: Driver started with code DRIVER_RUNNING
2015-09-25 18:49:37,923:1(0x7fd9c9333700):ZOO_INFO@check_events@1703: initiated connection to server [192.168.255.32:2181]
I0925 18:49:37.924 THREAD133 org.apache.aurora.scheduler.SchedulerLifecycle$DefaultDelayedActions.onRegistrationTimeout: Giving up on registration in (1, mins)
2015-09-25 18:49:37,930:1(0x7fd9c9333700):ZOO_INFO@check_events@1750: session establishment complete on server [192.168.255.32:2181], sessionId=0x250038b61690f12, negotiated timeout=10000
I0925 18:49:37.930192 224 group.cpp:313] Group process (group(3)@10.224.255.23:8083) connected to ZooKeeper
I0925 18:49:37.930253 224 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0925 18:49:37.930297 224 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
I0925 18:49:37.930974 224 group.cpp:717] Found non-sequence node 'log_replicas' at '/mesos' in ZooKeeper
I0925 18:49:37.931046 224 detector.cpp:138] Detected a new leader: (id='2513')
I0925 18:49:37.931131 224 group.cpp:659] Trying to get '/mesos/json.info_0000002513' in ZooKeeper
Failed to detect a master: Failed to parse data of unknown label 'json.info'

[1] http://mesos.apache.org/documentation/latest/upgrades/
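
As a rough way to check a scheduler log against the upgrade note quoted above, here is a minimal Python sketch. It assumes that the `sched.cpp ... Version:` line in the log reports the libmesos version the scheduler is linked against, and the function names (`libmesos_version`, `can_read_json_master_info`) are hypothetical helpers made up for illustration, not part of Aurora or Mesos.

```python
import re

# Minimum libmesos version, taken from the upgrade note quoted in the post.
REQUIRED = (0, 23, 0)

# Matches the version banner, e.g. "sched.cpp:157] Version: 0.22.0".
# Assumption: this line reports the libmesos the scheduler is linked against.
VERSION_RE = re.compile(r"sched\.cpp:\d+\] Version: (\d+)\.(\d+)\.(\d+)")


def libmesos_version(log_text):
    """Return the version found in the log as a tuple of ints, or None."""
    match = VERSION_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def can_read_json_master_info(log_text):
    """True if the logged version meets the >= 0.23.0 requirement."""
    version = libmesos_version(log_text)
    return version is not None and version >= REQUIRED


# The version line from the log in the post.
sample = "I0925 18:49:37.923105 871 sched.cpp:157] Version: 0.22.0"
print(libmesos_version(sample))
print(can_read_json_master_info(sample))
```

Run against the log in the post, the check reports a version below 0.23.0, which would be consistent with the 'json.info' parse failure if the assumption about the version line holds.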
