As far as I understand Slurm, setting Shared=FORCE risks over-committing resources.
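To make the over-commit risk concrete, here is a rough accounting sketch in Python. It is only arithmetic, not Slurm's actual scheduling logic. The node and job sizes come from the quoted message below, and the function name `overcommit` and its `share_count` argument are made up for illustration, standing in for the number after FORCE.

```python
# Hypothetical accounting sketch; this is NOT how slurmctld decides placement.
NODES = 2
CPUS_PER_NODE = 8   # 2 sockets x 4 cores per node, as in the quoted slurm.conf


def overcommit(requests, share_count=1):
    """Return (fits, cpus_requested, physical_cpus) for a dict of job requests.

    share_count models the number after FORCE: how many jobs may be
    allocated the same core.
    """
    physical = NODES * CPUS_PER_NODE
    slots = physical * share_count      # allocatable slots once cores are shared
    requested = sum(requests.values())
    return requested <= slots, requested, physical


jobs = {"Job_A": 8, "Job_B": 16}

print(overcommit(jobs, share_count=1))  # (False, 24, 16)
print(overcommit(jobs, share_count=8))  # (True, 24, 16)
```

With no sharing, the 24 requested CPUs do not fit on 16 physical ones, so one job has to wait. With a share count of 8 both requests are accepted, although the hardware is the same, and that is the over-commit.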
/Benjamin

On 2016-01-29 16:10, Dennis Mungai wrote:
> And with the Shared=FORCE:8 parameter, each consumable processor, socket or
> core can be shared by 8 jobs, as an example.
>
> On Jan 29, 2016 5:08 PM, David Roman wrote:
> Hello,
>
> I'm a newbie with SLURM. Perhaps you could help me understand my mistake.
>
> I have 2 nodes (2 sockets with 4 cores per socket = 8 CPUs per node). I
> created 3 partitions:
>
> DEV with node2
> OP with node1
> LOW with node1 and node2
>
> I created 2 jobs:
> Job_A uses 8 CPUs in partition DEV
> Job_B uses 16 CPUs in partition LOW
>
> If I start Job_A before Job_B, all is OK: Job_A is in RUNNING state and
> Job_B is in PENDING state.
>
> BUT, if I start Job_B before Job_A, both jobs are in RUNNING state.
>
> Thanks for your help,
>
> David.
>
> Here is my slurm.conf without comments:
>
> ClusterName=Noveltits
> ControlMachine=slurm
> SlurmUser=slurm
> SlurmctldPort=6817
> SlurmdPort=6818
> AuthType=auth/munge
> StateSaveLocation=/tmp
> SlurmdSpoolDir=/tmp/slurmd
> SwitchType=switch/none
> MpiDefault=none
> SlurmctldPidFile=/var/run/slurmctld.pid
> SlurmdPidFile=/var/run/slurmd.pid
> ProctrackType=proctrack/pgid
> CacheGroups=0
> ReturnToService=0
> SlurmctldTimeout=300
> SlurmdTimeout=300
> InactiveLimit=0
> MinJobAge=300
> KillWait=30
> Waittime=0
> SchedulerType=sched/backfill
> SelectType=select/cons_res
> SelectTypeParameters=CR_CORE_Memory
> FastSchedule=0
> SlurmctldDebug=3
> SlurmdDebug=3
> JobCompType=jobcomp/none
>
> PreemptMode=SUSPEND,GANG
> PreemptType=preempt/partition_prio
>
> NodeName=slurm_node[1-2] CPUs=8 SocketsPerBoard=2 CoresPerSocket=4 ThreadsPerCore=1
> PartitionName=op Nodes=slurm_node1 Priority=100 Default=No MaxTime=INFINITE State=UP
> PartitionName=dev Nodes=slurm_node2 Priority=1 Default=yes MaxTime=INFINITE State=UP PreemptMode=OFF
> PartitionName=low Nodes=slurm_node[1-2] Priority=1 Default=No MaxTime=INFINITE State=UP
-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
