On 2016-01-29 17:04, David Roman wrote:
> My problem is simple. I have 2 nodes, each with 8 CPUs, so I can use at
> most 16 CPUs at the same time. In the first case, Job_A uses 8 CPUs and
> Job_B waits for its 16 CPUs. But in the other case, Job_B uses 16 CPUs
> and Job_A uses 8 CPUs at the same time. But 16 + 8 = 24, and that is
> greater than 16!
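For context, the two jobs David describes could be reproduced with batch scripts along these lines (a sketch only; the script names, `srun sleep` payload and duration are assumptions, while the partition names and CPU counts come from his report):

```shell
# Hypothetical batch scripts for the two jobs (names and contents assumed).

# Job_A: 8 CPUs in the "dev" partition (node2 only)
cat > job_a.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=dev
#SBATCH --ntasks=8
srun sleep 300
EOF

# Job_B: 16 CPUs in the "low" partition (spans both nodes)
cat > job_b.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=low
#SBATCH --ntasks=16
srun sleep 300
EOF

# Submission order is the crux of the report:
#   sbatch job_a.sh; sbatch job_b.sh   -> Job_B pends, as expected
#   sbatch job_b.sh; sbatch job_a.sh   -> both run: 16 + 8 = 24 CPUs on
#                                         16 cores, which should not happen
```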
Can you cat /proc/cpuinfo? I still think one of the nodes might not match
your configuration. As I tried to explain: with FastSchedule=0, Slurm will,
depending on your real hardware, consider the detected hardware rather than
your configuration, and suddenly the order of job submission becomes
relevant.

Anyway, can you have a look at the details of both running jobs via
"scontrol show -d job <jobidA>" and "scontrol show job <jobidB>", for a
quick glimpse. After that you can try to raise SlurmdDebug on the compute
node and SlurmctldDebug on the master up to 9, and inspect SlurmdLogFile on
the compute node and SlurmctldLogFile on the master, to really get _all_
the details of the job allocation.

Benjamin

> David
>
>
> From: Dennis Mungai [mailto:[email protected]]
> Sent: Friday, 29 January 2016 16:18
> To: slurm-dev <[email protected]>
> Subject: [slurm-dev] Re: Ressouces allocation problem
>
>
> Can you change your consumable resources from CR_Core_Memory to
> CR_CPU_Memory?
> On Jan 29, 2016 5:42 PM, Benjamin Redling
> <[email protected]> wrote:
>
> On 2016-01-29 15:31, Dennis Mungai wrote:
>> Add Shared=FORCE to your partition settings for each partition entry in
>> the configuration file.
>
> https://computing.llnl.gov/linux/slurm/cons_res_share.html
>
> The selection settings were:
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
>
> Shared=FORCE, as you recommend, leads to:
> "
> Cores are allocated to jobs. A core may run more than one job.
> "
>
> What does that have to do with the problem?
> Can you elaborate on that?
>
> /Benjamin
>
>
>> On Jan 29, 2016 5:08 PM, David Roman
>> <[email protected]> wrote:
>> Hello,
>>
>> I'm a newbie with SLURM. Perhaps you could help me to understand my
>> mistake.
>>
>> I have 2 nodes (2 sockets with 4 cores per socket = 8 CPUs per node)
>> and I created 3 partitions:
>>
>> DEV with node2
>> OP with node1
>> LOW with node1 and node2
>>
>> I created 2 jobs:
>> Job_A uses 8 CPUs in partition DEV
>> Job_B uses 16 CPUs in partition LOW
>>
>> If I start Job_A before Job_B, all is OK: Job_A is in the RUNNING state
>> and Job_B is in the PENDING state.
>>
>> BUT, if I start Job_B before Job_A, both jobs are in the RUNNING state.
>>
>> Thanks for your help,
>>
>> David.
>>
>>
>> Here is my slurm.conf without comments:
>>
>> ClusterName=Noveltits
>> ControlMachine=slurm
>> SlurmUser=slurm
>> SlurmctldPort=6817
>> SlurmdPort=6818
>> AuthType=auth/munge
>> StateSaveLocation=/tmp
>> SlurmdSpoolDir=/tmp/slurmd
>> SwitchType=switch/none
>> MpiDefault=none
>> SlurmctldPidFile=/var/run/slurmctld.pid
>> SlurmdPidFile=/var/run/slurmd.pid
>> ProctrackType=proctrack/pgid
>> CacheGroups=0
>> ReturnToService=0
>> SlurmctldTimeout=300
>> SlurmdTimeout=300
>> InactiveLimit=0
>> MinJobAge=300
>> KillWait=30
>> Waittime=0
>> SchedulerType=sched/backfill
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_Core_Memory
>> FastSchedule=0
>> SlurmctldDebug=3
>> SlurmdDebug=3
>> JobCompType=jobcomp/none
>>
>> PreemptMode=SUSPEND,GANG
>> PreemptType=preempt/partition_prio
>>
>>
>> NodeName=slurm_node[1-2] CPUs=8 SocketsPerBoard=2 CoresPerSocket=4 ThreadsPerCore=1
>> PartitionName=op Nodes=slurm_node1 Priority=100 Default=No MaxTime=INFINITE State=UP
>> PartitionName=dev Nodes=slurm_node2 Priority=1 Default=Yes MaxTime=INFINITE State=UP PreemptMode=OFF
>> PartitionName=low Nodes=slurm_node[1-2] Priority=1 Default=No MaxTime=INFINITE State=UP
>>
>>
>> ______________________________________________________________________
>>
>> This e-mail contains information which is confidential. It is intended
>> only for the use of the named recipient.
>> If you have received this e-mail in error, please let us know by
>> replying to the sender, and immediately delete it from your system.
>> Please note that in these circumstances, the use, disclosure,
>> distribution or copying of this information is strictly prohibited.
>> KEMRI-Wellcome Trust Programme cannot accept any responsibility for the
>> accuracy or completeness of this message as it has been transmitted
>> over a public network. Although the Programme has taken reasonable
>> precautions to ensure no viruses are present in emails, it cannot
>> accept responsibility for any loss or damage arising from the use of
>> the email or attachments. Any views expressed in this message are those
>> of the individual sender, except where the sender specifically states
>> them to be the views of KEMRI-Wellcome Trust Programme.
>> ______________________________________________________________________
>
> --
> FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
> vox: +49 3641 9 44323 | fax: +49 3641 9 44321
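Benjamin's inspection steps can be sketched as a short session on the cluster (the job IDs are placeholders; `scontrol setdebug` changes the controller's log level at runtime, while SlurmdDebug has to be raised in slurm.conf and picked up by the node):

```shell
# Detailed allocation of both running jobs, including per-node CPU IDs
# (job IDs 101/102 are placeholders for Job_A and Job_B):
scontrol show -d job 101
scontrol show -d job 102

# Raise controller logging to the maximum level at runtime:
scontrol setdebug 9

# For the compute nodes, set SlurmdDebug=9 in slurm.conf, then:
scontrol reconfigure          # or restart slurmd on the node

# Afterwards inspect SlurmctldLogFile on the master and SlurmdLogFile on
# the compute node, and remember to lower the debug levels again.
```

These commands only make sense against a running cluster, so they are shown here as a fragment rather than a runnable script.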
