That was certainly my opinion as well, and I would not have believed it 
had the users not shown me two different squeue outputs showing the job 
running on a certain node at one point and then on a different node later. 
They claim that nothing was done to change the job after launching it. 
Unfortunately I didn't have any accounting method set up at the time, so I 
will have to wait for it to happen again (if it does) to look at the 
detailed history. I will update the thread if new data emerges. 
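
For what it's worth, something like the following in slurm.conf should be 
enough to capture per-job node history going forward (the storage path and 
job id below are just examples):

    # slurm.conf -- minimal job accounting so node history is recorded
    JobAcctGatherType=jobacct_gather/linux
    AccountingStorageType=accounting_storage/filetxt
    AccountingStorageLoc=/var/log/slurm/job_accounting

    # per-job node history can then be pulled with sacct:
    sacct -j <jobid> --format=JobID,NodeList,Start,End,State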

        Thanks,
        ~Mike C.  

-----Original Message-----
From: Moe Jette 
Sent: Friday, March 22, 2013 12:46 PM
To: slurm-dev; Michael Colonno
Subject: Re: [slurm-dev] Re: node switching / selection

Slurm does not move jobs between nodes without someone explicitly removing 
nodes and adding other nodes to the job allocation.
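
For example, removing a node from (or otherwise changing) a running job's 
allocation is an explicit scontrol operation along these lines (job id and 
node names are made up):

    scontrol update JobId=1234 NodeList=node[02-03]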


Quoting Michael Colonno <[email protected]>:

>             This was seen with -N 1, restricting to one node. I'm not 
> even certain what to call this feature / issue.
>
>
>
>             Thanks,
>
>             ~Mike C.
>
>
>
> From: David Bigagli
> Sent: Friday, March 22, 2013 9:42 AM
> To: slurm-dev
> Subject: [slurm-dev] Re: node switching / selection
>
>
>
> Is it possible the job runs on several nodes, say -N 3, then one node 
> is lost, so it ends up running on 2 nodes only? Such a job should have 
> been submitted with --no-kill.
>
>
>
> /David
>
> On Fri, Mar 22, 2013 at 4:06 PM, Michael Colonno  
> <[email protected]> wrote:
>
>
>         I actually did mean node below. The job launched on a node and  
> then, with no user input, later appeared to be running (or trying to  
> run) on a different node. This is rare but happens from time to time.  
> I'm not sure if this is the default scheduling algorithm trying to  
> make things fit better.
>
>         Cheers,
>         ~Mike C.
>
>
> -----Original Message-----
> From: Marcin Stolarek
> Sent: Friday, March 22, 2013 1:43 AM
> To: slurm-dev
> Subject: [slurm-dev] Re: node switching / selection
>
>
> 2013/3/22 Michael Colonno <[email protected]>
>>
>>
>>         Hi Folks ~
>
> Hi,
>>
>>
>>         A couple of (hopefully) simple questions; I can't find  
>> anything that obviously / easily solves these in the man pages. I  
>> have a fairly ordinary deployment in which scheduling is done by  
>> core so some high-memory systems can be shared.
>>
>>         - Users have observed that sometimes jobs are being moved  
>> from one node to another while running. This makes the particular  
>> tool being used unhappy. Is there a way to prevent this either  
>> with a flag or config file entry?
>
> By node, do you mean CPU?
>
> If so, using ProctrackType=proctrack/cgroup (check the man page for  
> cgroup.conf) should solve your problem. If you are using a kernel that  
> is not cgroup-aware (for instance RHEL 5), you can use the cpuset SPANK  
> plugin.
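>
> As a minimal sketch (adjust for your setup; TaskPlugin=task/cgroup with
> ConstrainCores is what actually pins tasks to their allocated cores):
>
>     # slurm.conf
>     ProctrackType=proctrack/cgroup
>     TaskPlugin=task/cgroup
>
>     # cgroup.conf
>     ConstrainCores=yes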
>
>
>>         - When scheduling by core, the default behavior seems to be  
>> to fill up the first node with tasks, then move to the second, etc.  
>> Since memory is being shared between tasks, it would be preferable to  
>> select a node on which no other jobs (or the minimum number of other  
>> jobs) are running before piling onto a node already running a job(s).  
>> How can I tell SLURM the equivalent of "pick an unused node first if  
>> available"?
>
>
> I'm not sure if it's possible. Is there a way to change the node  
> allocation algorithm in Slurm (as in Moab/Maui)?
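>
> One thing worth checking: if the select/cons_res plugin in your release
> supports the least-loaded-node option, something like this in slurm.conf
> may be close to what you want (verify your version has it first):
>
>     SelectType=select/cons_res
>     SelectTypeParameters=CR_Core_Memory,CR_LLN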
>
>
> cheers,
> marcin
>
>
>
>
>

