On Tue, Mar 20, 2018 at 10:11 AM, Barak Korren <[email protected]> wrote:
> On 20 March 2018 at 09:17, Yedidyah Bar David <[email protected]> wrote:
>> On Mon, Mar 19, 2018 at 6:56 PM, Dominik Holler <[email protected]> wrote:
>>> Thanks, Gal. I expect the problem is fixed, until something eats
>>> all the space in /dev/shm again.
>>> But since /dev/shm usage is now logged in the output, we will be able
>>> to detect the problem instantly next time.
>>>
>>> From my point of view it would be good to know why /dev/shm was full,
>>> to prevent this situation in the future.
>>
>> Gal already wrote below - it was because some build failed to clean up
>> after itself.
>>
>> I don't know about this specific case, but I was told that I am
>> personally causing such issues by using the 'cancel' button, so I
>> sadly stopped. Sadly, because our CI system is quite loaded, and when I
>> know that a build is useless, I wish to kill it and save some load...
>>
>> Back to your point, perhaps we should make jobs check /dev/shm when
>> they _start_, and either alert/fail/whatever if it's not almost empty,
>> or, if we know what we are doing, just remove stuff there. That might
>> be much easier than fixing things to clean up at the end, and/or
>> debugging why that cleanup failed.
>
> Sure thing - patches to:
>
> [jenkins repo]/jobs/confs/shell-scripts/cleanup_slave.sh
>
> are welcome; we often find interesting stuff to add there...
>
> If constrained for time, please turn this comment into an orderly RFE
> in Jira...
Searched for '/dev/shm' and found way too many places to analyze them
all and add something to cleanup_slave to cover them all. Pushed this
for now: https://gerrit.ovirt.org/89215

> --
> Barak Korren
> RHV DevOps team, RHCE, RHCi
> Red Hat EMEA
> redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted

--
Didi
_______________________________________________
Infra mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/infra
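[For context, the start-of-job check discussed above could be sketched roughly
like this. This is only an illustrative sketch, not what cleanup_slave.sh or
the pushed Gerrit patch actually does: the 90% threshold, the SHM_DIR
variable, and the commented-out cleanup policy are all assumptions.]

```shell
#!/bin/bash
# Hypothetical start-of-job check: warn if /dev/shm looks almost full,
# which would suggest a previous build failed to clean up after itself.
# SHM_DIR and the 90% threshold are illustrative assumptions.

SHM_DIR=${SHM_DIR:-/dev/shm}

shm_usage_pct() {
    # Print how full $SHM_DIR is, as a bare number (e.g. "42").
    # df -P is POSIX; field 5 of the data row is the use percentage.
    df -P "$SHM_DIR" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

usage=$(shm_usage_pct)
echo "$SHM_DIR is ${usage:-?}% full"

if [ "${usage:-0}" -ge 90 ]; then
    echo "WARNING: $SHM_DIR is almost full; leftovers from a previous build?"
    ls -lA "$SHM_DIR"
    # Policy choice - either fail fast so the problem stays visible:
    #   exit 1
    # or, if we know what we are doing, just remove the leftovers:
    #   rm -rf "${SHM_DIR:?}"/*
fi
```

Failing fast makes a leaky build visible to whoever debugs it; deleting
silently keeps the slave usable but hides which job leaked.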
