Is it possible the job runs on several nodes, say -N 3, and then one node is lost, so it ends up running on only 2 nodes? Such a job should have been submitted with --no-kill.
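Something like this (an untested sketch; the script name is just a placeholder):

    # Ask for 3 nodes, but don't let Slurm kill the whole job if one of
    # them fails; tasks that were on the failed node are still lost, the
    # allocation itself survives on the remaining nodes.
    sbatch -N 3 --no-kill my_mpi_job.sh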
/David

On Fri, Mar 22, 2013 at 4:06 PM, Michael Colonno <[email protected]> wrote:

> Actually did mean node below. The job launched on a node and then, with no
> user input, later appeared to be running (or trying to run) on a different
> node. This is rare but happens from time to time. I'm not sure if this is
> the default scheduling algorithm trying to make things fit better.
>
> Cheers,
> ~Mike C.
>
> -----Original Message-----
> From: Marcin Stolarek [http://lists.schedmd.com/cgi-bin/dada/mail.cgi/r/slurmdev/520036565912/]
> Sent: Friday, March 22, 2013 1:43 AM
> To: slurm-dev
> Subject: [slurm-dev] Re: node switching / selection
>
> 2013/3/22 Michael Colonno <[email protected]>
>
> > Hi Folks ~
>
> Hi,
>
> > A couple of (hopefully) simple questions; I can't find anything that
> > obviously / easily solves these in the man pages. I have a fairly
> > ordinary deployment in which scheduling is done by core so some
> > high-memory systems can be shared.
> >
> > - Users have observed that sometimes jobs are being moved from one node
> > to another while running. This makes the particular tool being used
> > unhappy. Is there a way to prevent this either with a flag or config
> > file entry?
>
> By node do you mean CPU? If so, using ProctrackType=proctrack/cgroup
> (check the man page for cgroup.conf) should solve your problem; if you are
> using a non-cgroup-aware kernel (for instance RHEL 5) you can use the
> cpuset SPANK plugin.
>
> > - When scheduling by core the default behavior seems to be to fill up
> > the first node with tasks, then move to the second, etc. Since memory is
> > being shared between tasks it would be preferable to select a node on
> > which no other jobs (or the minimum number of other jobs) are running
> > before piling onto a node already running a job(s). How can I tell SLURM
> > the equivalent of "pick an unused node first if available"?
>
> I'm not sure if it's possible. Do we have the possibility of changing the
> node allocation algorithm in slurm (like in moab/maui)?
>
> cheers,
> marcin
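For reference, the cgroup-based process tracking Marcin suggests is enabled
through slurm.conf and cgroup.conf, roughly like this (a sketch only; exact
option names and defaults vary by Slurm version, and ConstrainCores is the
knob that actually confines tasks to their allocated cores):

    # slurm.conf -- track and contain job steps with cgroups
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup

    # cgroup.conf -- mount the cgroup hierarchy if needed and pin each
    # task to the cores it was allocated
    CgroupAutomount=yes
    ConstrainCores=yes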
