What do the master and slave logs say?
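
While you are gathering those, it may be worth confirming from the slave that the ZooKeeper named in /etc/mesos/zk answers at all. Below is a minimal sketch in Python 3 using only the standard library. The helper names `parse_zk_url` and `zk_is_ok` are made up for illustration and are not part of Mesos. It assumes a single-host zk:// URL like the one you posted, and that the server accepts ZooKeeper's `ruok` four-letter command (newer ZooKeeper releases can restrict these).

```python
import socket
from urllib.parse import urlparse


def parse_zk_url(zk_url):
    """Split a single-host zk:// URL into (host, port, znode path)."""
    parsed = urlparse(zk_url)
    if parsed.scheme != "zk" or parsed.hostname is None:
        raise ValueError("expected zk://host:port/path, got %r" % zk_url)
    # Fall back to 2181, ZooKeeper's default client port, if none is given.
    return parsed.hostname, parsed.port or 2181, parsed.path


def zk_is_ok(host, port, timeout=3.0):
    """Send the 'ruok' four-letter command; True only if the reply is 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            return sock.recv(16) == b"imok"
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

On the slave, something like `zk_is_ok(*parse_zk_url("zk://10.1.100.116:2181/mesos")[:2])` would then report whether ZooKeeper is reachable. A `True` result only rules out basic connectivity; it says nothing about whether the slave can reach the master itself.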

On Mon, Aug 25, 2014 at 9:03 AM, Frank Hinek <[email protected]> wrote:

> I was able to get a single node environment setup on Ubuntu 14.04.1
> following this guide: http://mesosphere.io/learn/install_ubuntu_debian/
>
> The single slave registered with the master via the local Zookeeper and I
> could run basic commands by posting to Marathon.
>
> I then tried to build a multi node cluster following this guide:
> http://mesosphere.io/docs/mesosphere/getting-started/cloud-install/
>
> The guide walks you through using the Mesosphere packages to install
> Mesos, Marathon, and Zookeeper on one node that will be the master, and
> just Mesos on the slave.  You then disable automatic start of: mesos-slave on
> the master, mesos-master on the slave, and zookeeper on the slave.  It ends
> up looking like:
>
> NODE 1 (MASTER):
> - IP Address: 10.1.100.116
> - mesos-master
> - marathon
> - zookeeper
>
> NODE 2 (SLAVE):
> - IP Address: 10.1.100.117
> - mesos-slave
>
> The issue I’m running into is that the slave is rarely able to register
> with the master through Zookeeper.  I can never run any jobs from
> marathon (just trying a simple sleep 5 command).  Even when the slave does
> register the Mesos UI shows 1 “Deactivated” slave — it never goes active.
>
> Here are the values I have for /etc/mesos/zk:
>
> MASTER: zk://10.1.100.116:2181/mesos
> SLAVE: zk://10.1.100.116:2181/mesos
>
> Any ideas of what to troubleshoot?  Would greatly appreciate pointers.
>
> Environment details:
> - Ubuntu Server 14.04.1 running as VMs on ESXi 5.5U1
> - Mesos: 0.20.0
> - Marathon 0.6.1
>
> There are no apparent connectivity issues, and I’m not having any problems
> with other VMs on the ESXi host.  All VM to VM communication is on the same
> VLAN and within the same host.
>
> Zookeeper log on master (slave briefly registered so I tried to run a
> sleep 5 command from marathon and then the slave disconnected):
>
> 2014-08-25 11:50:34,976 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket
> connection from /10.1.100.117:45778
> 2014-08-25 11:50:34,977 - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@793] - Connection request from old
> client /10.1.100.117:45778; will be dropped if server is in r-o mode
> 2014-08-25 11:50:34,977 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:2181:ZooKeeperServer@839] - Client attempting to
> establish new session at /10.1.100.117:45778
> 2014-08-25 11:50:34,978 - INFO  [SyncThread:0:ZooKeeperServer@595] -
> Established session 0x1480b22f7f0000c with negotiated timeout 10000 for
> client /10.1.100.117:45778
> 2014-08-25 11:51:05,724 - INFO  [ProcessThread(sid:0
> cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
> when processing sessionid:0x1480b22f7f00001 type:create cxid:0x53faafa9
> zxid:0x49 txntype:-1 reqpath:n/a Error Path:/marathon Error:KeeperErrorCode
> = NodeExists for /marathon
> 2014-08-25 11:51:05,724 - INFO  [ProcessThread(sid:0
> cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
> when processing sessionid:0x1480b22f7f00001 type:create cxid:0x53faafaa
> zxid:0x4a txntype:-1 reqpath:n/a Error Path:/marathon/state
> Error:KeeperErrorCode = NodeExists for /marathon/state
> 2014-08-25 11:51:09,145 - INFO  [ProcessThread(sid:0
> cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
> when processing sessionid:0x1480b22f7f00001 type:create cxid:0x53faafb5
> zxid:0x4d txntype:-1 reqpath:n/a Error Path:/marathon Error:KeeperErrorCode
> = NodeExists for /marathon
> 2014-08-25 11:51:09,146 - INFO  [ProcessThread(sid:0
> cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException
> when processing sessionid:0x1480b22f7f00001 type:create cxid:0x53faafb6
> zxid:0x4e txntype:-1 reqpath:n/a Error Path:/marathon/state
> Error:KeeperErrorCode = NodeExists for /marathon/state
>
>
