Logs please.

On Wed, Oct 21, 2015 at 12:44 PM, John Omernik <[email protected]> wrote:

> I am running 0.24.
>
> I am running some tasks in Marathon, and when they hit an OOM condition a
> task is killed, which is expected. Then I get a bunch of errors like
> "Failed to read" for 'memory.limit_in_bytes', 'memory.max_usage_in_bytes',
> and 'memory.stat'.
>
> In addition, the task tries to restart but keeps failing.
>
> A few notes: when the task fails, the sandbox becomes unavailable, making
> troubleshooting difficult. When this has occurred before, it seemed the
> only way to get things working was to stop the slave, clear out the tmp
> directory, and start it again. I'd like to understand why my task won't get
> moving again.
>
> There are also lots of errors related to "failed to clean up isolator" and
> invalid cgroups; I can get specific logs if people think they're needed. I am
> thinking it's related to checkpointing or something like that? I.e., an
> executor hit the OOM, got killed, and is trying to start back up, but
> something isn't right?
>
> I know this is a jumbled, unorganized question; I can provide logs if needed.
>
>
>
