I'm not sure about #3. I have seen things go awry when restarting the whole cluster. When upgrading from Mesos 0.23.0 to 0.24.1, I restarted all of the mesos-masters, waited a few moments for a leader to be elected, then restarted the slaves. When I went back to look at Marathon, all of the tasks were being redeployed, as though they had all been killed off for some reason. That wasn't what I expected, since the upgrade was supposed to be as simple as install and restart. Perhaps you're experiencing a similar issue?
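
A minimal sketch of the "wait for a leader, then restart the slaves" step described above, rather than waiting a fixed few moments. It assumes the master serves its state as JSON at `/master/state.json` with a `leader` field; both the path and the field name are assumptions to check against your Mesos version, and `fetch_state` and `wait_for_leader` are hypothetical helper names.

```python
import json
import time
import urllib.request


def fetch_state(master_url):
    """Fetch the master's state as a dict.

    The /master/state.json path is an assumption; verify it for your version.
    """
    with urllib.request.urlopen(master_url + "/master/state.json", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_leader(fetch, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll fetch() until it reports a non-empty 'leader'.

    Returns the leader string, or None if no leader shows up in time.
    """
    for _ in range(attempts):
        try:
            leader = fetch().get("leader")
        except (OSError, ValueError):
            # Master not reachable yet, or the response was not valid JSON.
            leader = None
        if leader:
            return leader
        sleep(delay)
    return None
```

Something like `wait_for_leader(lambda: fetch_state("http://10.1.0.72:5050"))` could then gate the slave restarts; port 5050 is the usual master default but is also an assumption here. This only confirms that a leader exists. It does not explain why Marathon redeployed the tasks.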

On Fri, Oct 9, 2015 at 8:25 AM, Badal Naik <[email protected]> wrote:

> Any idea about #1 ?
>
> Any one has experienced #3 ?
>
> On 09-Oct-2015, at 5:53 pm, craig w <[email protected]> wrote:
>
> With regards to item #2, I saw the same issue. it's been fixed in mesos
> 0.25 (release candidates are out now), see
> https://issues.apache.org/jira/browse/MESOS-3282.
>
> On Fri, Oct 9, 2015 at 8:16 AM, Badal Naik <[email protected]> wrote:
>
>> Hello Mesos-Users,
>>
>> I have set up 3 node Mesos cluster with ubuntu 14.04. i have started
>> zookeeper,Mesos and marathon. Every thing working fine except three things.
>>
>> 1) When i restart the whole cluster mesos does not show completed tasks.
>> is it expected behaviour? if not what i should do?
>>
>> 2) in mesos web ui i’m not able to see
>> staged/started/finished/killed/failed/lost task numbers even when tasks are
>> running.
>>
>> 3) Every zookeeper instance throws this exception regularly:
>>
>> 2015-10-09 17:27:26,302 [myid:3] - WARN [SendWorker:1:QuorumCnxManager$SendWorker@679] - Interrupted while waiting for message on queue
>> java.lang.InterruptedException
>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>>     at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:831)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:62)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:667)
>>
>> *Here is my Mesos-master configuration:*
>>
>> mesos master --ip=10.1.0.72 --work_dir=/var/lib/mesos-master --zk=file:///etc/mesos/conf/zk --quorum=file:///etc/mesos/conf/quorum
>> Where
>> zk=zk://zoo.service.consul:2181/mesos
>> quorum=2
>>
>> *Mesos-Slave Configuration:*
>>
>> mesos slave --work_dir=/var/lib/mesos-slave --ip=10.1.0.72 --hostname=10.1.0.72 --strict=false --master=file:///etc/mesos/conf/master FrameworkInfo.checkpoint=True
>>
>> *Marathon Configuration:*
>>
>> java -jar /opt/marathon.jar --master zk://zoo.service.consul:2181/mesos --zk zk://zoo.service.consul:2181/marathon --ha --hostname 10.1.0.72 --checkpoint
>>
>> *Zookeeper configs with java version *"1.8.0_45"*:*
>>
>> dataDir=/var/lib/zookeeper
>> clientPort=2181
>> tickTime=2000
>> initLimit=10
>> syncLimit=20
>>
>> autopurge.purgeInterval=0
>>
>> zookeeper.connection.timeout.ms=6000
>> server.1=10.1.0.70:2888:3888
>> server.2=10.1.0.71:2888:3888
>> server.3=10.1.0.72:2888:3888
>>
>> And different *myid* has been given.
>>
>> Can Anyone Help!!!
>>
>
> --
>
> https://github.com/mindscratch
> https://www.google.com/+CraigWickesser
> https://twitter.com/mind_scratch
> https://twitter.com/craig_links

--
https://github.com/mindscratch
https://www.google.com/+CraigWickesser
https://twitter.com/mind_scratch
https://twitter.com/craig_links

