> I0412 11:01:50.586612 3732 recover.cpp:578] Successfully joined the Paxos group

According to this, master1 should be able to connect to ZooKeeper successfully.

> root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
> I0413 03:12:54.532676 1711 group.cpp:519] ZooKeeper session expired
> I0413 03:12:58.757953 1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
> W0413 03:13:04.539577 1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration

Could you check whether slave1 can connect to ZooKeeper?

On Wed, Apr 13, 2016 at 11:49 AM, <[email protected]> wrote:

> I checked the ZooKeeper status by running these commands:
>
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.52 2181 | grep Mode
> Mode: follower
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.53 2181 | grep Mode
> Mode: leader
> root@master1:/home/ubuntu# echo stat | nc 30.30.30.54 2181 | grep Mode
> Mode: follower
>
> It seems to be working fine. Is there another way to check the health status?
>
> *From:* Abhishek Amralkar [mailto:[email protected]]
> *Sent:* 13 April 2016 09:10
> *To:* [email protected]
> *Subject:* Re: Slaves not getting registered
>
> Have you checked whether your ZooKeeper cluster is healthy and accessible from the Mesos masters?
>
> W0413 03:12:24.512336 1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:34.519641 1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:44.521181 1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:54.532501 1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> It seems the Mesos masters are not able to communicate with ZooKeeper.
>
> -Abhishek
>
> On 13-Apr-2016, at 9:06 AM, [email protected] wrote:
>
> Hi,
>
> I have been following the document from the digitalocean (mesos-doc-link <https://www.digitalocean.com/community/tutorials/how-to-configure-a-production-ready-mesosphere-cluster-on-ubuntu-14-04>) where I have set 3 masters and one slave. Below are the log details:
>
> root@master1:/var/log/mesos# tail -f mesos-master.INFO
> I0412 11:01:50.579818 3736 recover.cpp:193] Received a recover response from a replica in VOTING status
> I0412 11:01:50.579903 3736 recover.cpp:564] Updating replica status to RECOVERING
> I0412 11:01:50.583102 3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 3.154399ms
> I0412 11:01:50.583137 3736 replica.cpp:320] Persisted replica status to RECOVERING
> I0412 11:01:50.583176 3736 recover.cpp:543] Starting catch-up from position 1 to 2
> I0412 11:01:50.583732 3736 recover.cpp:564] Updating replica status to VOTING
> I0412 11:01:50.586318 3736 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 2.540703ms
> I0412 11:01:50.586484 3736 replica.cpp:320] Persisted replica status to VOTING
> I0412 11:01:50.586612 3732 recover.cpp:578] Successfully joined the Paxos group
> I0412 11:01:50.586745 3731 recover.cpp:462] Recover process terminated
>
> root@master1:/var/log/mesos# tail -f mesos-master.WARNING
> Log file created at: 2016/04/12 11:01:49
> Running on machine: master1
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> W0412 11:01:49.024226 3712 authenticator.cpp:511] No credentials provided, authentication requests will be refused
>
> root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
> tail: cannot open ‘mesos-master.master1.invalid-user.log.INFO.20160412-11014’ for reading: No such file or directory
> root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-11014
> mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
> mesos-master.master1.invalid-user.log.INFO.20160412-110148.3712
> root@master1:/var/log/mesos# tail -f mesos-master.master1.invalid-user.log.INFO.20160412-110143.3651
> I0412 11:01:46.424433 3676 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (5)@30.30.30.53:5050
> I0412 11:01:47.068586 3675 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (8)@30.30.30.53:5050
> I0412 11:01:47.592926 3677 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (11)@30.30.30.53:5050
> I0412 11:01:48.188248 3680 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (14)@30.30.30.53:5050
> I0412 11:01:48.887104 3678 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
> I0412 11:01:48.887177 3674 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
> I0412 11:01:48.887229 3677 group.cpp:460] Lost connection to ZooKeeper, attempting to reconnect ...
> I0412 11:01:48.919545 3675 group.cpp:519] ZooKeeper session expired
> I0412 11:01:48.919848 3680 detector.cpp:154] Detected a new leader: None
> I0412 11:01:48.919922 3680 master.cpp:1710] The newly elected leader is None
>
> root@slave1:/var/log/mesos# tail -f mesos-slave.slave1.invalid-user.log.INFO.20160412-110554.1696
> I0413 03:12:54.532676 1711 group.cpp:519] ZooKeeper session expired
> I0413 03:12:58.757953 1715 slave.cpp:4304] Current disk usage 6.44%. Max allowed age: 5.848917453828577days
> W0413 03:13:04.539577 1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> I0413 03:13:04.539798 1715 group.cpp:519] ZooKeeper session expired
> W0413 03:13:14.542245 1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> I0413 03:13:14.542434 1713 group.cpp:519] ZooKeeper session expired
>
> root@slave1:/var/log/mesos# tail -f mesos-slave.WARNING
> W0413 03:12:24.512336 1715 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:34.519641 1710 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:44.521181 1713 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
> W0413 03:12:54.532501 1711 group.cpp:503] Timed out waiting to connect to ZooKeeper. Forcing ZooKeeper session (sessionId=0) expiration
>
> Thank you.
>
> *From:* June Taylor [mailto:[email protected] <[email protected]>]
> *Sent:* 12 April 2016 18:06
> *To:* [email protected]
> *Subject:* Re: Slaves not getting registered
>
> Try looking in /var/log/mesos/ at these files: mesos-slave.WARNING, mesos-slave.INFO, mesos-slave.ERROR
>
> Thanks,
> June Taylor
> System Administrator, Minnesota Population Center
> University of Minnesota
>
> On Tue, Apr 12, 2016 at 4:36 AM, Dick Davies <[email protected]> wrote:
>
> > There's no mention of a slave there, have a look at the logs on the slaves filesystem and see if it is giving any errors.
> >
> > On 12 April 2016 at 10:17, <[email protected]> wrote:
> > > The GUI log shows like this:
> > >
> > > I0412 08:45:51.379609 3616 master.cpp:3673] Processing DECLINE call for offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O282 ] for framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at [email protected]:42208
> > > I0412 08:45:54.637461 3612 http.cpp:501] HTTP GET for /master/state.json from 10.211.203.147:59463 with User-Agent='Mozilla/5.0 (Windows NT 6.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> > > I0412 08:45:57.376288 3619 master.cpp:5350] Sending 1 offers to framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at [email protected]:42208
> > > I0412 08:45:57.385325 3613 master.cpp:3673] Processing DECLINE call for offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O283 ] for framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at [email protected]:42208
> > > I0412 08:46:03.383728 3614 master.cpp:5350] Sending 1 offers to framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at [email protected]:42208
> > > I0412 08:46:03.396531 3612 master.cpp:3673] Processing DECLINE call for offers: [ 74f33592-fc48-4066-a59c-977818b4c13c-O284 ] for framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at [email protected]:42208
> > > I0412 08:46:04.665582 3612 http.cpp:501] HTTP GET for /master/state.json from 10.211.203.147:59464 with User-Agent='Mozilla/5.0 (Windows NT 6.0; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0'
> > > I0412 08:46:09.389493 3616 master.cpp:5350] Sending 1 offers to framework 74f33592-fc48-4066-a59c-977818b4c13c-0001 (chronos-2.4.0) at [email protected]:42208
> > >
> > > Is there a way to find out, through the CLI or GUI, how many masters are present in the environment?
> > >
> > > From: haosdent [mailto:[email protected]]
> > > Sent: 12 April 2016 13:37
> > > To: user <[email protected]>
> > > Subject: Re: Slaves not getting registered
> > >
> > > > but am unable to get it registered.
> > >
> > > Hi, @aishwarya. Could you post the master and slave logs to provide more details? Usually this is caused by a network problem.
> > >
> > > On Tue, Apr 12, 2016 at 4:02 PM, <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I’m unable to get the slave registered with the master node. I’ve configured both the master and slave machines but am unable to get it registered.
> > > >
> > > > Thank you.
> > > >
> > > > ________________________________
> > > >
> > > > This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Where allowed by local law, electronic communications with Accenture and its affiliates, including e-mail and instant messaging (including content), may be scanned by our systems for the purposes of information security and assessment of internal compliance with Accenture policy.
> > > > ______________________________________________________________________________________
> > > >
> > > > www.accenture.com
> > >
> > > --
> > > Best Regards,
> > > Haosdent Huang

--
Best Regards,
Haosdent Huang
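
As a follow-up to the suggestion above to check ZooKeeper connectivity from slave1: here is a minimal sketch of the same `echo stat | nc <host> 2181 | grep Mode` check written in Python, so it can be run on slave1 even if `nc` is not installed. It assumes the three ZooKeeper addresses quoted in the thread and that the ensemble accepts the `stat` four-letter command (on newer ZooKeeper releases this may need to be whitelisted). The function names are hypothetical and not part of Mesos or ZooKeeper.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a ZooKeeper four-letter command (e.g. 'stat') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        # ZooKeeper closes the connection after answering, so read until EOF.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Return the value of the 'Mode:' line from 'stat' output, or None if absent."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def check_ensemble(hosts, port=2181):
    """Map each host to its reported mode, or to an 'unreachable: ...' message."""
    results = {}
    for host in hosts:
        try:
            results[host] = parse_mode(four_letter_word(host, port, "stat"))
        except OSError as exc:  # covers refused connections and timeouts
            results[host] = "unreachable: {}".format(exc)
    return results
```

Calling `check_ensemble(["30.30.30.52", "30.30.30.53", "30.30.30.54"])` on slave1 and printing the result should show either a mode (leader or follower) or an "unreachable" message per host. An "unreachable" entry from slave1 for hosts that answer from master1 would point to a network or firewall problem between the slave and ZooKeeper rather than to ZooKeeper itself.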

