The fact that the slave is retrying means either that the TASK_FAILED status update hasn't reached the master or the scheduler, or that the acknowledgement hasn't reached the slave. Since the master hasn't released the resources for the task (from what you say), I imagine it's the former.
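
To make the two cases concrete, here is a toy model of the retry loop. It is only a sketch, not Mesos code: the function and parameter names are made up for illustration, and the retry cap exists only so the simulation terminates (a real slave keeps retrying).

```python
# Toy model only: not Mesos code. All names here are hypothetical.

def forward_until_acked(update_reaches_scheduler, ack_reaches_slave, max_attempts=5):
    """Simulate a slave re-sending one status update until it is acknowledged.

    Both arguments are callables taking the attempt number and returning
    True if that leg of the round trip succeeds on that attempt.
    Returns (attempts_made, acknowledged).
    """
    for attempt in range(1, max_attempts + 1):
        # Leg 1: the status update travels slave -> master -> scheduler.
        if not update_reaches_scheduler(attempt):
            continue  # update lost, so no ack can come back; retry
        # Leg 2: the acknowledgement travels back to the slave.
        if ack_reaches_slave(attempt):
            return attempt, True  # acknowledged; the slave stops retrying
    return max_attempts, False  # never acknowledged; the slave would keep going


# Case A: the update never gets through (the "former" case).
print(forward_until_acked(lambda n: False, lambda n: True))    # (5, False)
# Case B: the update gets through, but acks are dropped until attempt 3.
print(forward_until_acked(lambda n: True, lambda n: n >= 3))   # (3, True)
```

From the slave's side both cases look identical (it just keeps re-sending), which is why the master's view is needed to tell them apart.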

What do master logs say?

On Wed, Nov 25, 2015 at 2:01 AM, James Vanns <[email protected]> wrote:

> Er, I could. At the moment it's pretty huge so maybe I'll just try and
> trim it down a bit. I've noticed that Chronos does the same, actually.
> There is a task that is 'active' and still holding onto resources yet it
> has already completed unsuccessfully with TASK_FAILED (16hrs ago!) state.
> Attached is a log of the events from the mesos slave that executed this
> particular Chronos task (before it continues to forward the same state over
> and over). Note that the last pair of lines is repeated ad-infinitum. I can
> confirm that this Chronos framework with the same ID is still running.
>
> Sorry to switch frameworks suddenly - this was simpler because it was one
> task instead of 100s.
>
> Jim
>
> On 24 November 2015 at 17:57, Vinod Kone <[email protected]> wrote:
>
>> Can you paste the logs?
>>
>> On Tue, Nov 24, 2015 at 2:16 AM, James Vanns <[email protected]>
>> wrote:
>>
>>> Hi again list.
>>>
>>> Mesos 0.24
>>> C++ Framework (still using the Protobufs based comms, not REST)
>>>
>>> My framework appears to be holding onto offers (somehow) from tasks that
>>> are finished!? I don't understand why. The task comprises of a shell
>>> command that executes within a docker container.
>>> The return code to the OS from the shell command is indeed zero for
>>> success, which Mesos honours and transitions to TASK_FINISHED state.
>>> However, using the UI these still register as 'active' (though acknowledged
>>> as FINISHED) and thus the resources are not yet freed.
>>>
>>> Any pointers appreciated!
>>>
>>> Cheers,
>>>
>>> Jim
>>>
>>> --
>>> Senior Code Pig
>>> Industrial Light & Magic
>
> --
> Senior Code Pig
> Industrial Light & Magic

