Re: [Jenkins] poor handling of offline slaves

Nicolas Lalevée Fri, 01 Jun 2012 05:54:44 -0700

On 1 June 2012, at 10:45, Kristian Waagan wrote:

> On 01.06.12 10:35, Nicolas Lalevée wrote:
>> 
>> On 1 June 2012, at 10:03, Kristian Waagan wrote:
>> 
>>> On 01.06.12 05:46, Gavin McDonald wrote:
>>>>>>  If many projects are configured to run on multiple operating systems, of
>>>>>>  which two have only one slave (Windows and Solaris), these projects may
>>>>>>  cause jobs to pile up on Linux. Maybe there are other mechanisms in
>>>>>>  place to deal with this, I don't know.
>>>> Not sure what you mean; jobs run independently of each other on multiple
>>>> slaves.
>>>> 
>>> 
>>> From what I could see, jobs configured to run on multiple slaves using the 
>>> "Configuration Matrix" plugin/feature will hang on to the current slave 
>>> while waiting for the next one. For instance, commons-vfs-trunk had been 
>>> running for five days and was occupying one executor on ubuntuX while 
>>> waiting for windows1 to become available. The timeout was set to 188 
>>> minutes, so waiting for the next slave doesn't seem to count as being stuck.
>>> 
>>> The two other jobs I mentioned are also using the Configuration Matrix 
>>> feature.
>>> 
>>> Of course, this will only be a problem if the system is overloaded, or a 
>>> slave, or group of slaves, is offline for a longer period of time and 
>>> these jobs eat up the executor slots on the healthy slaves.
>> 
>> A "Matrix" job does not actually consume an executor. It only triggers jobs 
>> and monitors them. Notice how Jenkins displays them while they are 
>> running: they are not in the first two boxes of a slave (the executor 
>> slots), they are in an extra one.
> 
> Ah, I see.
> Thanks for that explanation, Nicolas.
> 
> That only leaves why the job doesn't time out, but maybe that's as designed 
> too?


I don't know.
I think they should time out too, so the job maintainers get notified.

Nicolas
