Got it. I built Mesos from source, so I needed to initialize the log-dir manually 
once.
Now things are working. Thank you.

-----Original Message-----
From: "Badal Naik" <[email protected]>
Sent: 09/10/2015 06:38 PM
To: "[email protected]" <[email protected]>
Subject: Re: After restarting cluster task disappeared

No, I have not deleted anything.
I am just restarting the physical nodes, and afterwards I am not able to see the 
old completed tasks. Also, after many restarts Marathon and Mesos stay out of 
sync, and they only sync again when I restart the Marathon service.




On 09-Oct-2015, at 6:32 pm, haosdent <[email protected]> wrote:


For #1, did you delete anything in your work_dir or ZooKeeper?
For #3, is this ZooKeeper issue related to yours?
http://stackoverflow.com/questions/15842553/zookeeper-network-ensemble-does-not-start-appropiately


On Fri, Oct 9, 2015 at 8:30 PM, craig w <[email protected]> wrote:

I'm not sure about #3. I have seen things go awry when restarting the whole 
cluster. When doing an upgrade from mesos 0.23.0 to 0.24.1, I restarted all of 
the mesos-masters. Waited a few moments for a leader to be elected, then 
restarted the slaves. When I went back to look at Marathon all of the tasks 
were being redeployed, as though they had all been killed off for some reason. 
That wasn't what I expected to happen since the upgrade was supposed to be as 
simple as install and restart. Perhaps you're experiencing a similar issue?



On Fri, Oct 9, 2015 at 8:25 AM, Badal Naik <[email protected]> wrote:

Any idea about #1?

Has anyone experienced #3?


On 09-Oct-2015, at 5:53 pm, craig w <[email protected]> wrote:


With regard to item #2, I saw the same issue. It's been fixed in Mesos 0.25 
(release candidates are out now), see 
https://issues.apache.org/jira/browse/MESOS-3282.



On Fri, Oct 9, 2015 at 8:16 AM, Badal Naik <[email protected]> wrote:

Hello Mesos-Users,


I have set up a 3-node Mesos cluster on Ubuntu 14.04. I have started 
ZooKeeper, Mesos and Marathon. Everything is working fine except three things.


1) When I restart the whole cluster, Mesos does not show completed tasks. Is 
this expected behaviour? If not, what should I do?


2) In the Mesos web UI I'm not able to see the 
staged/started/finished/killed/failed/lost task numbers, even when tasks are 
running.


3) Every zookeeper instance throws this exception regularly:


2015-10-09 17:27:26,302 [myid:3] - WARN  [SendWorker:1:QuorumCnxManager$SendWorker@679] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:831)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:62)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:667)
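
One way to see whether the ensemble still has a leader and followers while this warning is being logged is to ask each server for its role with ZooKeeper's `srvr` four-letter command. The following is only a sketch: `parse_mode` and `check_ensemble` are hypothetical helper names, and it assumes `nc` is installed and that port 2181 is reachable from the machine running it.

```shell
#!/bin/sh
# Sketch: report each ZooKeeper server's role (leader/follower/standalone).
# parse_mode and check_ensemble are hypothetical helpers, not ZooKeeper tools.

# Extract the role from `srvr` output read on stdin (the "Mode: ..." line).
parse_mode() {
    sed -n 's/^Mode: //p'
}

# Query every host given as an argument on client port 2181.
# Assumes nc is available; a server that does not answer is reported as such.
check_ensemble() {
    for host in "$@"; do
        mode=$(printf 'srvr' | nc -w 2 "$host" 2181 2>/dev/null | parse_mode)
        echo "$host: ${mode:-no response}"
    done
}

# Usage (addresses taken from the server.N lines in this thread):
#   check_ensemble 10.1.0.70 10.1.0.71 10.1.0.72
```

A working three-server ensemble would be expected to report one leader and two followers.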








Here is my Mesos-master configuration:


mesos master --ip=10.1.0.72  --work_dir=/var/lib/mesos-master 
--zk=file:///etc/mesos/conf/zk --quorum=file:///etc/mesos/conf/quorum
Where zk=zk://zoo.service.consul:2181/mesos
              quorum=2
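
As a rough sketch of how the two `file://` flag files above could be produced: the script below writes the ZooKeeper URL quoted in this message and derives the quorum as a strict majority of the masters. `CONF_DIR` and `MASTERS` are hypothetical variables introduced for illustration; the script writes to a scratch directory by default rather than `/etc/mesos/conf`, and it assumes one master per node.

```shell
#!/bin/sh
# Sketch: create the files that --zk=file://... and --quorum=file://... read.
# CONF_DIR and MASTERS are hypothetical; the thread uses /etc/mesos/conf.
CONF_DIR="${CONF_DIR:-./mesos-conf}"
MASTERS="${MASTERS:-3}"   # assumption: one master on each of the 3 nodes

mkdir -p "$CONF_DIR"

# ZooKeeper URL from the thread. Written without a trailing newline so the
# result does not depend on how the flag value is trimmed when it is read.
printf '%s' 'zk://zoo.service.consul:2181/mesos' > "$CONF_DIR/zk"

# The quorum should be a strict majority of the masters: floor(N/2) + 1.
QUORUM=$(( MASTERS / 2 + 1 ))
printf '%s' "$QUORUM" > "$CONF_DIR/quorum"

echo "zk=$(cat "$CONF_DIR/zk") quorum=$(cat "$CONF_DIR/quorum")"
```

With three masters this gives quorum=2, which matches the value used here.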
              




Mesos-Slave Configuration:


mesos slave --work_dir=/var/lib/mesos-slave --ip=10.1.0.72 --hostname=10.1.0.72 
--strict=false  --master=file:///etc/mesos/conf/master 
FrameworkInfo.checkpoint=True
      




Marathon Configuration:


java -jar /opt/marathon.jar  --master zk://zoo.service.consul:2181/mesos  --zk 
zk://zoo.service.consul:2181/marathon  --ha --hostname 10.1.0.72  --checkpoint








Zookeeper configs with java version "1.8.0_45":






dataDir=/var/lib/zookeeper
clientPort=2181
tickTime=2000
initLimit=10
syncLimit=20
autopurge.purgeInterval=0
zookeeper.connection.timeout.ms=6000
server.1=10.1.0.70:2888:3888
server.2=10.1.0.71:2888:3888
server.3=10.1.0.72:2888:3888


Each server has been given a different myid.
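
For reference, ZooKeeper expects a file named `myid` in each server's `dataDir`, containing the number N from that host's `server.N` line. The sketch below derives that number from the config; `myid_for` is a hypothetical helper, and the config and data directory are temporary scratch files, not the real `/var/lib/zookeeper`.

```shell
#!/bin/sh
# Sketch: look up a host's myid from the server.N lines of a zoo.cfg.
# myid_for is a hypothetical helper, not part of ZooKeeper.
myid_for() {
    # $1 = path to zoo.cfg, $2 = this host's IP address
    awk -F= -v ip="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, parts, ":")          # parts[1] is the address
            if (parts[1] == ip) {
                sub(/^server\./, "", $1)   # keep only the number N
                print $1
            }
        }' "$1"
}

# Scratch copy of the server list from this thread.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
dataDir=/var/lib/zookeeper
server.1=10.1.0.70:2888:3888
server.2=10.1.0.71:2888:3888
server.3=10.1.0.72:2888:3888
EOF

# Write the myid file into a scratch directory standing in for dataDir.
DATA_DIR=$(mktemp -d)
myid_for "$CFG" 10.1.0.72 > "$DATA_DIR/myid"
cat "$DATA_DIR/myid"
```

The id written for 10.1.0.72 agrees with the `[myid:3]` tag in the log line quoted above.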




Can anyone help?







-- 

https://github.com/mindscratch
https://www.google.com/+CraigWickesser
https://twitter.com/mind_scratch
https://twitter.com/craig_links








-- 

Best Regards,

Haosdent Huang
