I actually did mean node below. The job launched on one node and then, with
no user input, later appeared to be running (or trying to run) on a different
node. This is rare but happens from time to time. I'm not sure if this is the
default scheduling algorithm trying to make things fit better.
Cheers,
~Mike C.
-----Original Message-----
From: Marcin Stolarek
[http://lists.schedmd.com/cgi-bin/dada/mail.cgi/r/slurmdev/284278332309/]
Sent: Friday, March 22, 2013 1:43 AM
To: slurm-dev
Subject: [slurm-dev] Re: node switching / selection
2013/3/22 Michael Colonno <[email protected]>
>
>
> Hi Folks ~
Hi,
>
>
> A couple (hopefully) simple questions; I can't find anything that
> obviously / easily solves these in the man pages. I have a fairly ordinary
> deployment in which scheduling is done by core so some high-memory systems
> can be shared.
>
> - Users have observed that sometimes jobs are being moved from one
> node to another while running. This makes the particular tool being used
> unhappy. Is there a way to prevent this either with a flag or config file
> entry?
By node do you mean CPU?
If so, using ProctrackType=proctrack/cgroup (check the man page for cgroup.conf)
should solve your problem. If you are using a non-cgroup-aware kernel (for
instance RHEL 5), you can use the cpuset SPANK plugin.
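A minimal sketch of what that could look like (assuming a cgroup-capable
kernel; check the exact parameter names against the slurm.conf and cgroup.conf
man pages for your Slurm version):

    # slurm.conf -- track and contain job steps with cgroups
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup

    # cgroup.conf -- keep tasks on the cores they were allocated
    CgroupAutomount=yes
    ConstrainCores=yes
    TaskAffinity=yes

With ConstrainCores/TaskAffinity the tasks of a job stay bound to the cores
they were given at launch, so they should not appear to migrate mid-run.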
> - When scheduling by core the default behavior seems to be to fill up
> the first node with tasks, then move to the second, etc. Since memory is
> being shared between tasks it would be preferable to select a node on which
> no other jobs (or the minimum number of other jobs) are running before piling
> onto a node already running a job(s). How can I tell SLURM the equivalent of
> "pick an unused node first if available".
I'm not sure if that's possible. Do we have the possibility of changing the
node allocation algorithm in Slurm (like in Moab/Maui)?
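One thing that may be worth checking (I'm not sure whether the release you are
running supports it): newer slurm.conf man pages describe a "least loaded
nodes" option (CR_LLN) for the cons_res plugin, roughly along these lines:

    # slurm.conf -- sketch only; verify CR_LLN exists in your Slurm version
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory,CR_LLN

With CR_LLN the controller is supposed to place new jobs on the least loaded
nodes first rather than packing the first node. If that option is not
available in your version, I don't know of a config-only way to change the
allocation algorithm.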
cheers,
marcin