Re: [foreman-dev] Jenkins slaves not responding

Dominic Cleal Thu, 01 Sep 2016 01:40:43 -0700

On 01/09/16 09:00, Timo Goebel wrote:
> Dominic,
> 
>> On 31.08.2016, at 09:43, Dominic Cleal <[email protected]> wrote:
>>
>> The slave that I attempted this on crashed within a day, so they'll
>> remain on two slots each.
> 
> Do you know why these crashes happen? Are there any monitoring graphs that 
> show CPU/memory usage over time? The issue sounds like an OOM problem, judging 
> from what I read here. Do you see anything related in the system's logs?


No, I don't know precisely, and the slave that crashed yesterday doesn't
have the Rackspace monitoring agent installed to collect memory data.

slave09, one of those that crashed on the morning of 16/08, was showing
high memory and swap usage on 15/08
(http://paste.fedoraproject.org/418780/47271889/), so memory exhaustion
is quite likely. I don't have any data about what processes were using
the memory in this instance.

slave10, which crashed yesterday, only logged hanging tasks
(https://paste.fedoraproject.org/418772/72718379/), which I've also seen
on consoles and in logs from other crashed systems. There's no way to
send a sysrq through the provided console, and nothing like kdump or a
watchdog is configured on those systems.

-- 
Dominic Cleal
[email protected]

