Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread Tina Friedrich
Hi David, it's also not actually a problem if the slurm.conf is not exactly the same immediately on boot - really

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread David Simpson
Hi David, it's also not actually a problem if the slurm.conf is not exactly the same immediately on boot - really. Unless there's changes that are very fundamental, nothing bad will happen if they pick

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread Hermann Schwärzler

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread Tina Friedrich

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-24 Thread David Simpson

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-23 Thread Brian Andrus
David, for monitoring, I use a combination of netdata+prometheus. Data is gathered whenever the nodes are up and stored for history. Yes, when the nodes are powered down, there are empty gaps, but those are interpreted as the node being powered down. For the config, I have no access to DNS for co

Re: [slurm-users] monitoring and update regime for Power Saving nodes

2022-02-23 Thread Bjørn-Helge Mevik
David Simpson writes:
> * When you want to make changes to slurm.conf (or anything else) to
> a node which is down due to power saving (during a
> maintenance/reservation) what is your approach? Do you end up with 2
> slurm.confs (one for power saving and one that keeps everything up, to
> work

[slurm-users] monitoring and update regime for Power Saving nodes

2022-02-23 Thread David Simpson
Hi all,
Interested to know what common approaches were to:
* Monitoring of power saving nodes (e.g. health of the node), when potentially the monitoring system will see it go up and down. Do you limit to BMC only monitoring/health?
* When you want to make changes to slurm.conf (or anyt