Yes, this is fixed in commit c8f34560c87c in 14.03.11, which has not been released yet. However, it is straightforward to backport.

On 11/18/2014 07:52 AM, Trey Dockendorf wrote:
As far as I've experienced, those error messages are harmless.  We
updated from 14.03.6 to 14.03.10, and one motivation was that those errors
no longer appear in user job output.  It may have been 14.03.9 that solved
the issue of the error being written to user output. I still see it in my
slurmd logs, but it does not appear to be causing problems.  This is on
14.03.10 with CentOS 6.5.
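
To get a rough idea of how often the message shows up per job in a slurmd log, something like the following could be used. This is only a minimal sketch: the pattern is modeled on the log line quoted further down in this thread, it assumes each message sits on a single line in the real log (the mail client wrapped it), and the function name `count_cgroup_destroy_errors` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the slurmd.log message quoted in this thread.
# Assumption: in the real log the whole message is on one line.
PATTERN = re.compile(
    r"\[(?P<ts>[^\]]+)\] \[(?P<job>\d+)\] _slurm_cgroup_destroy: "
    r"problem deleting step cgroup path (?P<path>\S+): (?P<err>.+)$"
)


def count_cgroup_destroy_errors(lines):
    """Return a Counter mapping job id (as a string) to matching messages."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line.strip())
        if match:
            counts[match.group("job")] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the log line from this thread, joined back onto one line.
    sample = [
        "[2014-11-18T03:32:15.316] [8990773] _slurm_cgroup_destroy: problem "
        "deleting step cgroup path "
        "/dev/cgroup/freezer/slurm/uid_126634/job_8990773/step_batch: "
        "Device or resource busy",
    ]
    for job, n in count_cgroup_destroy_errors(sample).items():
        print(job, n)
```

Since the function accepts any iterable of lines, an open file object can be passed in directly; the location of the log depends on the site's `SlurmdLogFile` setting in slurm.conf.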

- Trey

On Nov 18, 2014 3:38 AM, "Bjørn-Helge Mevik" <[email protected]> wrote:


    We are running slurm 14.03.7, and using cgroups to limit memory usage.

    We see this type of message in slurmd.log after each job:

    [2014-11-18T03:32:15.316] [8990773] _slurm_cgroup_destroy: problem
    deleting step cgroup path
    /dev/cgroup/freezer/slurm/uid_126634/job_8990773/step_batch: Device
    or resource busy

    Recently, one of our users reported seeing the following message in the
    slurm-NNN.out file:

    slurmstepd: _slurm_cgroup_destroy: problem deleting step cgroup path
    /dev/cgroup/freezer/slurm/uid_168662/job_8985631/step_batch: Device or
    resource busy

    When checking afterwards, it seems the .../uid_NNN directories are
    removed.

    Does anyone know what these messages mean?  Should we just ignore them?

    --
    Regards,
    Bjørn-Helge Mevik, dr. scient,
    Department for Research Computing, University of Oslo


--

Thanks,
      /David/Bigagli

www.schedmd.com
