Slurm version 16.05.6 also includes a new node_features/knl_generic
plugin, which can allow regular users to modify NUMA and MCDRAM modes of
KNL nodes. For more information see:
http://slurm.schedmd.com/intel_knl.html
On 2016-10-27 16:36, Danny Auble wrote:
Slurm version 16.05.6 is now
On 28 October 2016 at 09:20, Christopher Samuel wrote:
>
> On 28/10/16 08:44, Lachlan Musicman wrote:
>
> > So I checked the system, noticed that one node was drained, resumed it.
> > Then I tried both
> >
> > scontrol requeue 230591
> > scontrol resume 230591
>
> What
Slurm version 16.05.6 is now available and includes around 40 bug fixes
developed over the past month.
We have also made the third pre-release of version 17.02, which is under
development and scheduled for release in February 2017.
Slurm downloads are available from
On 28/10/16 08:44, Lachlan Musicman wrote:
> So I checked the system, noticed that one node was drained, resumed it.
> Then I tried both
>
> scontrol requeue 230591
> scontrol resume 230591
What happens if you "scontrol hold" it first before "scontrol release"'ing it?
--
Christopher Samuel
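
The hold-then-release sequence suggested above could be tried roughly as follows. This is a minimal sketch, not a confirmed fix: the job ID is the one quoted in the thread, the function name cycle_job is made up for illustration, and the script only calls scontrol when it is actually installed.

```shell
#!/bin/sh
# Job ID taken from the thread; replace with your own.
JOBID=230591

# cycle_job is a hypothetical helper name, not part of Slurm.
# It holds the job, releases it, then prints its state and reason.
cycle_job() {
    scontrol hold "$1" &&
    scontrol release "$1" &&
    scontrol show job "$1" | grep -E 'JobState|Reason'
}

# Only talk to Slurm if the client tools are present on this host.
if command -v scontrol >/dev/null 2>&1; then
    cycle_job "$JOBID"
else
    echo "scontrol not found; run this on a Slurm host" >&2
fi
```

If the job is still stuck afterwards, the Reason field printed by the last command is the first thing to look at.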
Morning,
Yesterday we had some internal network issues that caused havoc on our
system. By the end of the day everything was ok on the whole.
This morning I came in to see one job on the queue (which was otherwise
relatively quiet) with the error message/Nodelist Reason (launch failed
requeued
Hi Ole,
I don’t see a reason for a firewall to exist on a compute node. Is it a
requirement on your new cluster? If not, disable it. I don’t read Moe’s
statement as saying that you can’t have a firewall, just that if there is one,
you should open it up to allow all Slurm communication.
Best,
You might want to check out my Wiki-page for setting up Slurm on CentOS
7.2: https://wiki.fysik.dtu.dk/niflheim/SLURM.
Perhaps you'll solve the problem using this information?
On 10/27/2016 04:14 PM, Mikhail Kuzminsky wrote:
I have worked with PBS and SGE; I am now a beginner with Slurm, and installed
In the process of developing our new cluster using Slurm, I've been
bitten by the firewall settings on the compute nodes preventing MPI jobs
from spawning tasks on remote nodes.
I now believe that Slurm actually has a requirement that compute nodes
must have their Linux firewall disabled.
On 10/27/2016 09:42 AM, Loris Bennett wrote:
So is restarting slurmctld the only way to let it pick up changes in slurm.conf?
No. You can also do
scontrol reconfigure
This does not restart slurmctld.
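
As a rough illustration of the reconfigure route, the sketch below asks the daemons to re-read slurm.conf and then prints one running setting so the change can be checked. The function name reload_and_check and the choice of OverTimeLimit as the parameter to inspect are assumptions made for the example. Whether a given change takes effect without restarting a daemon depends on the parameter.

```shell
#!/bin/sh
# Parameter to inspect after the reload; chosen only as an example.
PARAM="OverTimeLimit"

# reload_and_check is a hypothetical helper name, not part of Slurm.
# It requests a re-read of slurm.conf, then prints the running value
# of the given parameter from the controller's view of the config.
reload_and_check() {
    scontrol reconfigure &&
    scontrol show config | grep -i "^$1"
}

# Only talk to Slurm if the client tools are present on this host.
if command -v scontrol >/dev/null 2>&1; then
    reload_and_check "$PARAM"
else
    echo "scontrol not found; run this on a Slurm host" >&2
fi
```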
Question: How are the slurmd daemons notified about the changes in
slurm.conf? Will
Tuo Chen Peng writes:
> I thought the ‘scontrol update’ command was for letting slurmctld pick up any
> change in slurm.conf.
>
> But after reading the manual again, it seems this command instead changes
> settings at runtime, rather than reading any change from
Baker D.J. writes:
> Hello,
>
> Looking at the Slurm documentation I see that it is possible to handle basic
> license management (this is the link http://slurm.schedmd.com/licenses.html).
> In other words, software licenses can be treated as a resource, however things
Hi Benjamin
Thank you for your response.
In fact, I forgot that I had set OverTimeLimit to 10 minutes, which is how long
a job can exceed its time limit before being cancelled. That is why the job ran
beyond the time limit.
Thank you again and regards,
Hamza
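
To make the arithmetic concrete: with OverTimeLimit set, a job may run for its own time limit plus that many minutes before it is cancelled. The 10-minute value is from the message above; the 60-minute job limit is a made-up figure for illustration.

```shell
#!/bin/sh
# Hypothetical job time limit in minutes (not from the thread).
TIME_LIMIT_MIN=60
# OverTimeLimit in minutes, as stated in the message.
OVER_TIME_LIMIT_MIN=10

# Latest point at which the job is cancelled, in minutes of run time.
EFFECTIVE=$((TIME_LIMIT_MIN + OVER_TIME_LIMIT_MIN))
echo "job may run up to ${EFFECTIVE} minutes before being cancelled"
```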
On 26 October 2016 at 23:11, Benjamin Redling