On 2016-01-29 17:04, David Roman wrote:
> My problem is simple. I have 2 nodes, each with 8 CPUs, so I can use at
> most 16 CPUs at the same time. In the first case Job_A uses 8 CPUs and
> Job_B waits until 16 CPUs are free. But in the other case, Job_B uses 16
> CPUs and Job_A uses 8 CPUs at the same time. But 16+8 = 24, which is
> greater than 16!

Can you cat /proc/cpuinfo on both nodes? I still think one of them might
not match your configuration.
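
A quick way to check (a sketch; run it on each node):

```shell
# Count the logical CPUs the kernel reports. With FastSchedule=0 this
# detected hardware -- not the NodeName line in slurm.conf -- is what
# the scheduler goes by.
grep -c '^processor' /proc/cpuinfo
```

Comparing that number (and the output of `slurmd -C`, which prints the
node configuration Slurm itself detects) against your
NodeName=slurm_node[1-2] CPUs=8 line should quickly show whether one
node deviates.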

As I tried to explain: with FastSchedule=0 Slurm goes by the hardware it
actually detects, not by your configuration -- so depending on the real
hardware, the order of job submission suddenly becomes relevant.

Anyway, can you look into the details of both running jobs via
scontrol show -d job <jobidA> (and the same for Job_B)? Just for a quick
glimpse.
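
For example (with the <jobid> placeholders filled in with your two job IDs):

```shell
# Detailed view: the Nodes=/CPU_IDs= lines show exactly which cores
# each job was allocated, so any double-booked core is visible at once.
scontrol show -d job <jobidA>
scontrol show -d job <jobidB>

# Per-node counters: CPUAlloc should never exceed CPUTot.
scontrol show node slurm_node[1-2]
```

If both jobs list disjoint CPU_IDs, Slurm believes it has more cores
than your configuration claims; if they overlap, gang scheduling /
preemption is in play.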

After that you can try to raise SlurmdDebug on the compute nodes and
SlurmctldDebug on the master up to 9, and inspect SlurmdLogFile on the
compute nodes and SlurmctldLogFile on the master, to really get _all_ the
details of the job allocation.
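
A sketch of what that could look like -- note that your posted slurm.conf
sets neither SlurmdLogFile nor SlurmctldLogFile, so logging is probably
going to syslog until you add them (the paths below are just examples):

```shell
# In slurm.conf on the master and on both compute nodes:
#   SlurmctldDebug=9
#   SlurmdDebug=9
#   SlurmctldLogFile=/var/log/slurmctld.log
#   SlurmdLogFile=/var/log/slurmd.log

# Push the new settings out (or restart the daemons), then follow the
# controller log while resubmitting the two jobs:
scontrol reconfigure
tail -f /var/log/slurmctld.log
```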

Benjamin


> David
> 
> 
> From: Dennis Mungai [mailto:[email protected]]
> Sent: Friday, 29 January 2016 16:18
> To: slurm-dev <[email protected]>
> Subject: [slurm-dev] Re: Resources allocation problem
> 
> 
> Can you change your consumable resources from CR_Core_Memory to CR_CPU_Memory?
> On Jan 29, 2016 at 5:42 PM, Benjamin Redling
> <[email protected]> wrote:
> 
> On 29.01.2016 at 15:31, Dennis Mungai wrote:
>> Add SHARE=FORCE to your partition settings for each partition entry in
>> the configuration file.
> 
> https://computing.llnl.gov/linux/slurm/cons_res_share.html
> 
> selection setting was:
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
> 
> Shared=FORCE as you recommend leads to:
> "
> Cores are allocated to jobs. A core may run more than one job.
> "
> 
> What does that have to do with the problem?
> Can you elaborate on that?
> 
> /Benjamin
> 
> 
>> On Jan 29, 2016 at 5:08 PM, David Roman
>> <[email protected]> wrote:
>> Hello,
>>
>> I'm a newbie with Slurm. Perhaps you could help me understand my
>> mistake.
>>
>> I have 2 nodes (2 sockets with 4 cores per socket = 8 CPUs per node). I
>> created 3 partitions:
>>
>> DEV with node2
>> OP    with node1
>> LOW with node1 and node2
>>
>> I created 2 jobs:
>> Job_A uses 8 CPUs in partition DEV
>> Job_B uses 16 CPUs in partition LOW
>>
>> If I start Job_A before Job_B, all is OK: Job_A is in RUNNING state and
>> Job_B is in PENDING state.
>>
>> BUT, if I start Job_B before Job_A, both jobs are in RUNNING state.
>>
>> Thanks for your help,
>>
>> David.
>>
>>
>> Here is my slurm.conf, without comments:
>>
>> ClusterName=Noveltits
>> ControlMachine=slurm
>> SlurmUser=slurm
>> SlurmctldPort=6817
>> SlurmdPort=6818
>> AuthType=auth/munge
>> StateSaveLocation=/tmp
>> SlurmdSpoolDir=/tmp/slurmd
>> SwitchType=switch/none
>> MpiDefault=none
>> SlurmctldPidFile=/var/run/slurmctld.pid
>> SlurmdPidFile=/var/run/slurmd.pid
>> ProctrackType=proctrack/pgid
>> CacheGroups=0
>> ReturnToService=0
>> SlurmctldTimeout=300
>> SlurmdTimeout=300
>> InactiveLimit=0
>> MinJobAge=300
>> KillWait=30
>> Waittime=0
>> SchedulerType=sched/backfill
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CORE_Memory
>> FastSchedule=0
>> SlurmctldDebug=3
>> SlurmdDebug=3
>> JobCompType=jobcomp/none
>>
>> PreemptMode=SUSPEND,GANG
>> PreemptType=preempt/partition_prio
>>
>>
>> NodeName=slurm_node[1-2] CPUs=8 SocketsPerBoard=2 CoresPerSocket=4
>> ThreadsPerCore=1
>> PartitionName=op  Nodes=slurm_node1     Priority=100 Default=No
>> MaxTime=INFINITE State=UP
>> PartitionName=dev Nodes=slurm_node2     Priority=1   Default=yes
>> MaxTime=INFINITE State=UP PreemptMode=OFF
>> PartitionName=low Nodes=slurm_node[1-2] Priority=1   Default=No
>> MaxTime=INFINITE State=UP
>>
>>
>> ______________________________________________________________________
>>
>> This e-mail contains information which is confidential. It is intended
>> only for the use of the named recipient. If you have received this
>> e-mail in error, please let us know by replying to the sender, and
>> immediately delete it from your system. Please note, that in these
>> circumstances, the use, disclosure, distribution or copying of this
>> information is strictly prohibited. KEMRI-Wellcome Trust Programme
>> cannot accept any responsibility for the accuracy or completeness of
>> this message as it has been transmitted over a public network. Although
>> the Programme has taken reasonable precautions to ensure no viruses are
>> present in emails, it cannot accept responsibility for any loss or
>> damage arising from the use of the email or attachments. Any views
>> expressed in this message are those of the individual sender, except
>> where the sender specifically states them to be the views of
>> KEMRI-Wellcome Trust Programme.
>> ______________________________________________________________________
> 
> --
> FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
> vox: +49 3641 9 44323 | fax: +49 3641 9 44321
> 


-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
