Hello Hardik,

Are you sure that this is Slurm-related?
I think it might actually be something else.

If you want to be sure, check the Slurm logs and see whether the shutdown is
actually initiated by Slurm.

I believe that something else is initiating the shutdown.
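
As a rough way to do that log check, here is a minimal Python sketch that
pulls out the lines mentioning a shutdown-like event from any log file. The
keyword list and the function names are only my assumptions for illustration,
not anything Slurm defines, so adjust them to what your logs actually contain.

```python
import re

# Hypothetical keywords to look for; tune these to your own logs.
KEYWORDS = ("sigterm", "shutdown", "reboot")


def find_matching_lines(lines, keywords=KEYWORDS):
    """Return the lines that contain any of the keywords, ignoring case."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_file(path, keywords=KEYWORDS):
    """Read a log file and return the matching lines from it."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as fh:
        return find_matching_lines(fh, keywords)
```

You would call scan_file() on the files that SlurmctldLogFile and
SlurmdLogFile point to in your slurm.conf, and on the system log of the
affected node, and then compare the timestamps. Separately, "sinfo -R" and
"scontrol show node node19" report the reason Slurm recorded for a node
being down.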

Cheers,

On 14 April 2017 at 15:48, Hardik Kothari <[email protected]>
wrote:

> Dear all,
>
> I am handling a small cluster in our institute.
>
> I have noticed that when a job uses more resources than are available, the
> system receives a SIGTERM and the node then goes into the down state.
>
> Apr 14 02:01:17 node19 journal: Runtime journal is using 8.0M (max 3.1G,
> leaving 4.0G of free 31.3G, current limit 3.1G).
> Apr 14 02:01:17 node19 journal: Runtime journal is using 8.0M (max 3.1G,
> leaving 4.0G of free 31.3G, current limit 3.1G).
> Apr 14 02:01:17 node19 systemd-journald: Received SIGTERM
>
> I have to put nodes abc in the idle state each time a user crosses this
> limit.
> Is there a way to handle this problem directly within Slurm, so that a
> node does not go into the down state?
>
> Thanks,
> Hardik
>



-- 

Andrea Del Monaco
Internal Engineer


Mob: +31 64 166 4003
Skype: delmonaco.andrea
[email protected]

ClusterVision BV
Gyroscoopweg 56
1042 AC Amsterdam
The Netherlands
Tel: +31 20 407 7550
Fax: +31 84 759 8389
www.clustervision.com
