[slurm-dev] Re: Send notification email

2016-10-06 Thread Christopher Samuel
On 06/10/16 03:07, Fanny Pagés Díaz wrote:
> Oct 5 11:34:52 compute-0-3 postfix/smtp[6469]: connect to 10.8.52.254[10.8.52.254]:25: Connection refused
So you are blocked from connecting to the mail server you are trying to talk to on port 25/tcp (SMTP) - you need to get that opened up.
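A quick way to confirm this kind of block from the compute node is to try opening the TCP port directly. The following is only a sketch: the function name smtp_port_status is made up for illustration, and it assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the node. It sends no mail; it only tests whether the connection can be opened.

```shell
#!/bin/bash
# Report whether a TCP connection to host:port can be opened.
# Prints "open" on success, "closed" if the connection is refused,
# filtered, or times out. smtp_port_status is a hypothetical helper name.
smtp_port_status() {
    local host=$1
    local port=${2:-25}
    # Give up after 5 seconds so a silently dropped connection does not hang.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "open"
    else
        echo "closed"
    fi
}

# Example, using the relay address from the log line above:
# smtp_port_status 10.8.52.254 25
```

If this prints "closed" on the compute node but "open" on a host that can send mail, a firewall rule or the mail server's access settings are the likely place to look.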

[slurm-dev] Re: Best way to control synchronized clocks in cluster?

2016-10-06 Thread Christopher Samuel
On 07/10/16 01:17, Per Lönnborg wrote:
> But what is the preferred way to check that the compute nodes on our have correct time, and if not, see to it that Slurm doesn't allocate these nodes to perform tasks?
We run NTP everywhere - we have to because GPFS depends on correct clocks as well

[slurm-dev] Slurm User Group 2016 presentations online

2016-10-06 Thread Tim Wickberg
Many thanks to all the attendees, and especially to all those who presented at the Slurm User Group 2016 meeting in Athens. PDFs of the presentations are online now at http://slurm.schedmd.com/publications.html For those of you who will be at SC16 in Salt Lake City - we hope to see you stop

[slurm-dev] Gres function node_config_load() clarification

2016-10-06 Thread Daniel Letai
In the gres.conf man page it's mentioned that "If generic resource counts are set by the gres plugin function node_config_load(), this file may be optional." When looking at http://slurm.schedmd.com/gres_plugins.html I can't figure out from the description for node_config_load() how to remove

[slurm-dev] Re: Best way to control synchronized clocks in cluster?

2016-10-06 Thread Daniel Letai
One simple thing to do is enable http://slurm.schedmd.com/slurm.conf.html#OPT_HealthCheckProgram and use a simple script along the lines of:
    #!/bin/bash
    ntpdate -u ntpsserver.cluster.local ; rc=$?
    [[ rc -ne 0 ]] && scontrol update NodeName=$HOSTNAME State=drain
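A variant of the same idea checks the clock offset without stepping the clock, and drains the node only when the offset is too large. This is a sketch under several assumptions, none of which come from the thread: ntpd is running on the node, ntpq reports the system offset in milliseconds as offset=<value>, and the 100 ms limit and the names MAX_OFFSET_MS, offset_exceeds and check_clock are purely illustrative. A Reason= is included because scontrol generally expects one when a node is drained.

```shell
#!/bin/bash
# Hypothetical limit in milliseconds; adjust for the site.
MAX_OFFSET_MS=100

# Succeeds (exit 0) when the absolute offset is larger than the limit.
# Both arguments are in milliseconds; awk handles the floating point.
offset_exceeds() {
    awk -v o="$1" -v t="$2" 'BEGIN { o += 0; t += 0; if (o < 0) o = -o; exit !(o > t) }'
}

check_clock() {
    local offset
    # Assumption: local ntpd answers and prints a line containing offset=<ms>.
    offset=$(ntpq -c "rv 0 offset" 2>/dev/null | sed -n 's/.*offset=\([-0-9.]*\).*/\1/p')
    # Drain the node if no offset could be read or if it is over the limit.
    if [ -z "$offset" ] || offset_exceeds "$offset" "$MAX_OFFSET_MS"; then
        scontrol update NodeName="$(hostname -s)" State=DRAIN Reason="clock check failed"
    fi
}

# check_clock   # uncomment when installing the script as HealthCheckProgram
```

Treating an unreadable offset as a failure is a design choice: it drains nodes whose NTP daemon has died, at the cost of also draining them during a brief daemon restart.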

[slurm-dev] Re: Send notification email

2016-10-06 Thread John Hearns
Fanny, Is there a reason why you are choosing to use ssmtp rather than Postfix? I ask because I know a little about Postfix, but nothing about ssmtp! Please look at this page on diagnostics for email: https://www.port25.com/how-to-check-an-smtp-connection-with-a-manual-telnet-session-2/ You could

[slurm-dev] Best way to control synchronized clocks in cluster?

2016-10-06 Thread Per Lönnborg
Hi, as a sysadmin, I know the importance of keeping correct time on "things". We use (of course) ntp for that. But what is the preferred way to check that the compute nodes on our have correct time, and if not, see to it that Slurm doesn't allocate these nodes to perform tasks? For about a year

[slurm-dev] Accounting missunderstood

2016-10-06 Thread Pardo Diaz, Alfonso
Hello, We have a problem or a misunderstanding with CPU and node accounting. Our nodes have 8 CPUs (hyper-threading is not enabled) and are exclusive (oversubscribe=no and ExclusiveUser=YES) by default. When a user submits a job with "-N 1" and a duration of 1 hour, this node will be used