Well... now I've changed cgroup.conf to:

###
# Slurm cgroup support configuration file
###
CgroupAutomount=no
CgroupMountpoint=/sys/fs/cgroup
#CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
#TaskAffinity=no

but it still doesn't work. In fact, I'm running Slurm inside a Docker container. Could that cause problems when using Slurm with cgroups?

Sumin Han
Undergraduate '13, School of Computing
Korea Advanced Institute of Science and Technology
Daehak-ro 291, Yuseong-gu, Daejeon
Republic of Korea 305-701
Tel. +82-10-2075-6911
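For the Docker question above: whether the cgroup plugins can work depends largely on how the container was started. Below is a minimal sketch, assuming Docker on a systemd host and a systemd-capable image; the container name slurm-node and the centos:7 image are illustrative assumptions, not details from this thread. It launches the container with the host cgroup tree visible and then checks for the freezer controller that slurmd's proctrack/cgroup plugin needs.

    # start the container with the host cgroup hierarchy bind-mounted;
    # --privileged is a blunt but simple way to let slurmd create cgroups
    docker run -d --name slurm-node --privileged \
        -v /sys/fs/cgroup:/sys/fs/cgroup \
        centos:7 /usr/sbin/init

    # inside the container: is the freezer controller mounted and accessible?
    docker exec slurm-node grep freezer /proc/mounts
    docker exec slurm-node ls /sys/fs/cgroup/freezer

If the freezer controller is missing or not writable inside the container, slurmd's cgroup plugins will fail to initialize, much like the errors quoted further down in this thread.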
2017-08-02 13:34 GMT+09:00 Lachlan Musicman <data...@gmail.com>:

> You will see here
>
> https://groups.google.com/forum/#!msg/slurm-devel/lKX8st9aztI/dF5Kvz4gDAAJ
>
> that you need to set
>
> CgroupAutomount=no
>
> in cgroup.conf if you are running a system using systemd.
>
> Cheers
> L.
>
> ------
> "The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic
> civics is the insistence that we cannot ignore the truth, nor should we
> panic about it. It is a shared consciousness that our institutions have
> failed and our ecosystem is collapsing, yet we are still here — and we are
> creative agents who can shape our destinies. Apocalyptic civics is the
> conviction that the only way out is through, and the only way through is
> together."
>
> *Greg Bloom* @greggish https://twitter.com/greggish/status/873177525903609857
>
> On 2 August 2017 at 14:27, 한수민 <hsm6...@gmail.com> wrote:
>
>> My slurmd.log says:
>>
>> [2017-08-02T04:25:45.453] debug2: _file_read_content: unable to open '/sys/fs/cgroup/freezer//release_agent' for reading : No such file or directory
>> [2017-08-02T04:25:45.453] debug2: xcgroup_get_param: unable to get parameter 'release_agent' for '/sys/fs/cgroup/freezer/'
>> [2017-08-02T04:25:45.453] error: unable to mount freezer cgroup namespace: Device or resource busy
>> [2017-08-02T04:25:45.453] error: unable to create freezer cgroup namespace
>> [2017-08-02T04:25:45.453] error: Couldn't load specified plugin name for proctrack/cgroup: Plugin init() callback failed
>> [2017-08-02T04:25:45.453] error: cannot create proctrack context for proctrack/cgroup
>> [2017-08-02T04:25:45.453] error: slurmd initialization failed
>>
>> hmm...
>>
>> Sumin Han
>>
>> 2017-08-02 13:05 GMT+09:00 Lachlan Musicman <data...@gmail.com>:
>>
>>>> [root@n6 /]# si
>>>>
>>>> PARTITION  NODES  NODES(A/I/O/T)  S:C:T  MEMORY  TMP_DISK  TIMELIMIT  AVAIL_FEATURES  NODELIST
>>>> debug*     6      0/6/0/6         1:4:2  7785    113264    infinite   (null)          c[1-6]
>>>>
>>>> (for a moment)
>>>>
>>>> [root@n6 /]# si
>>>>
>>>> PARTITION  NODES  NODES(A/I/O/T)  S:C:T  MEMORY  TMP_DISK  TIMELIMIT  AVAIL_FEATURES  NODELIST
>>>> debug*     6      0/0/6/6         1:4:2  7785    113264    infinite   (null)          c[1-6]
>>>
>>> 0/0/6/6 means your nodes are dying.
>>>
>>> You need to look into /var/log/slurm/slurmd.log (or wherever you put the slurmd logs on the machines, as dictated by SlurmdLogFile=) on each of the nodes.
>>>
>>> I would predict that there is something wrong with your cgroup.conf.
>>>
>>> Try:
>>>
>>> - confirming that the /etc/slurm/cgroup directory exists on all nodes (as per your cgroup.conf)
>>> - commenting out everything in cgroup.conf except CgroupAutomount=yes and ConstrainCores=yes
>>>
>>> Cheers
>>> L.
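A sketch of the stripped-down cgroup.conf suggested above, together with the commands typically used afterwards to restart slurmd and return the downed nodes to service. The node list c[1-6] comes from the si output above; the log path and the use of systemd are assumptions.

    # /etc/slurm/cgroup.conf - everything else commented out
    CgroupAutomount=yes
    ConstrainCores=yes

    # on each compute node: restart slurmd and watch the log named by SlurmdLogFile=
    systemctl restart slurmd
    tail -f /var/log/slurm/slurmd.log

    # once slurmd stays up, clear the DOWN state from the controller
    scontrol update NodeName=c[1-6] State=RESUME
    sinfo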
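For the "unable to mount freezer cgroup namespace: Device or resource busy" error quoted earlier, a few generic diagnostics (not commands given in the thread) that can be run on a failing node:

    # is the freezer controller already mounted by systemd (or by Docker)?
    grep cgroup /proc/mounts

    # do the paths referenced in cgroup.conf actually exist?
    ls -ld /etc/slurm/cgroup /sys/fs/cgroup/freezer

    # run slurmd in the foreground with extra verbosity to watch the
    # proctrack/cgroup plugin initialize (or fail)
    slurmd -D -vvv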