I don't remember the exact conditions, but I faced a similar issue in my deployments, and it was fixed when I moved to Mesos 0.26.0. Upgrade Marathon to a compatible version as well.
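In case it helps to confirm the mismatch before deciding on an upgrade, here is a rough Python sketch of the comparison described in this thread: containers that "docker ps" shows running versus containers Mesos still knows about. It rests on assumptions: `find_untracked` and its parameters are hypothetical names of my own, it assumes that containers started by the Docker containerizer have names beginning with "mesos-", and it assumes you have already collected the container IDs Mesos reports by whatever means your version offers.

```python
def find_untracked(docker_names, known_container_ids, prefix="mesos-"):
    """Return names of Mesos-launched containers that match no known container ID.

    docker_names: names as listed in the NAMES column of `docker ps`.
    known_container_ids: container IDs that Mesos currently reports (assumed
        to be gathered separately, e.g. from the agent's state output).
    prefix: assumed naming prefix used by the Docker containerizer.
    """
    known = set(known_container_ids)
    untracked = []
    for name in docker_names:
        if not name.startswith(prefix):
            # Not started by the Docker containerizer (under the naming assumption).
            continue
        if not any(container_id in name for container_id in known):
            # Running in Docker, but Mesos has no record of it.
            untracked.append(name)
    return sorted(untracked)


if __name__ == "__main__":
    # Made-up names purely for illustration.
    running = ["mesos-abc", "mesos-def", "redis"]
    print(find_untracked(running, ["abc"]))
```

An empty result would suggest Mesos still tracks everything it launched; a non-empty one matches the symptom Paul describes.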

On Wed, Aug 10, 2016 at 9:30 AM, Paul Bell <[email protected]> wrote:

> Hi Jeff,
>
> Thanks for your reply.
>
> Yeah... that thought occurred to me late last night. But customer is
> sensitive to too much churn, so it wouldn't be my first choice. If I knew
> with certainty that such a problem existed in the versions they are running
> AND that more recent versions fixed it, then I'd do my best to compel the
> upgrade.
>
> Docker version is also old, 1.6.2.
>
> -Paul
>
> On Wed, Aug 10, 2016 at 9:18 AM, Jeff Schroeder <[email protected]> wrote:
>
>> Have you considered upgrading Mesos and Marathon? Those are quite old
>> versions of both with some fairly glaring problems with the docker
>> containerizer if memory serves. Also what version of docker?
>>
>>
>> On Wednesday, August 10, 2016, Paul Bell <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> One of our customers has twice encountered a problem wherein Mesos &
>>> Marathon appear to lose track of the application containers that they
>>> started.
>>>
>>> Platform & version info:
>>>
>>> Ubuntu 14.04 (running under VMware)
>>> Mesos (master & agent): 0.23.0
>>> ZK: 3.4.5--1
>>> Marathon: 0.10.0
>>>
>>> The phenomenon:
>>>
>>> When I log into either the Mesos or Marathon UIs I see no evidence of
>>> *any* tasks, active or completed. Yet, in the Linux shell, a "docker ps"
>>> command shows the containers up & running.
>>>
>>> I've seen some confusing appearances before, but never this. For
>>> example, I've seen what might be described as the *reverse* of the
>>> above phenomenon. I mean the case where a customer power cycles the VM. In
>>> such a case you typically see in Marathon's UI the (mere) appearance of the
>>> containers up & running, but a "docker ps" command shows no containers
>>> running. As folks on this list have explained to me, this is the result of
>>> "stale state" and after 10 minutes (by default), Mesos figures out that the
>>> supposedly active tasks aren't there and restarts them.
>>>
>>> But that's not the case here. I am hard-pressed to understand what
>>> conditions/causes might lead to Mesos & Marathon becoming unaware of
>>> containers that they started.
>>>
>>> I would be very grateful if someone could help me understand what's
>>> going on here (so would our customer!).
>>>
>>> Thanks.
>>>
>>> -Paul
>>>
>>
>> --
>> Text by Jeff, typos by iPhone
>>
>

--
ever tried. ever failed. no matter. try again. fail again. fail better.
-- Samuel Beckett

