Thanks. The field size was just changed for the next major release.

HUMMEL Michel <[email protected]> wrote:
>I think I've found the problem. It comes from a comparison between
>integers of different types. Here is a patch which solves the problem:
>
>--- slurm-slurm-13-12-0-0pre4/src/plugins/sched/backfill/backfill.c.org	2013-11-18 17:56:09.741413223 +0100
>+++ slurm-slurm-13-12-0-0pre4/src/plugins/sched/backfill/backfill.c	2013-11-18 17:57:42.903468026 +0100
>@@ -712,7 +712,7 @@
> 			continue;	/* started in other partition */
> 		if (!avail_front_end(job_ptr))
> 			continue;	/* No available frontend for this job */
>-		if (job_ptr->array_task_id != (uint16_t) NO_VAL) {
>+		if (job_ptr->array_task_id != (uint32_t) NO_VAL) {
> 			if (reject_array_job_id == job_ptr->array_job_id)
> 				continue;	/* already rejected array element */
> 			/* assume reject whole array for now, clear if OK */
>
>Regards,
>
>[@@ THALES GROUP INTERNAL @@]
>
>From: HUMMEL Michel [mailto:[email protected]]
>Sent: Monday, 18 November 2013 16:32
>To: slurm-dev
>Subject: [slurm-dev] unable to configure backfill
>
>I am trying the backfill scheduler without success.
>I just want to test it with the simplest configuration possible*
>(see slurm.conf at the end).
>7 homogeneous nodes, 12 CPUs per node.
>
>I submit three jobs; the last one should be backfilled, but:
>$ sbatch --nice=0 -N 5 -c 12 --time-min="09:00" --time="10:00" ~/slurm/job.sh
>Submitted batch job 65574
>$ sbatch --nice=0 -N 5 -c 12 --time-min="09:00" --time="10:00" ~/slurm/job.sh
>Submitted batch job 65575
>$ sbatch --nice=0 -N 1 -c 12 --time-min="00:40" --time="01:00" ~/slurm/job.sh
>Submitted batch job 65576
>$ squeue
>  JOBID PARTITION   NAME     USER ST  TIME NODES NODELIST(REASON)
>  65575    prod.q job.sh  hummelm PD  0:00     5 (Resources)
>  65576    prod.q job.sh  hummelm PD  0:00     1 (Priority)
>  65574    prod.q job.sh  hummelm  R  7:54     5 OGSE[1-5]
>
>I hope someone here can show me the error I've made, thanks.
>
>(* slurm.conf )
>ControlMachine=OGSE1
>#
>AuthType=auth/munge
>CryptoType=crypto/munge
>MailProg=/bin/mail
>MpiDefault=none
>ProctrackType=proctrack/pgid
>ReturnToService=1
>SlurmctldPidFile=/var/run/slurmctld.pid
>SlurmctldPort=6817
>SlurmdPidFile=/var/run/slurmd.pid
>SlurmdPort=6818
>SlurmdSpoolDir=/var/spool/slurmd
>SlurmUser=root
>StateSaveLocation=/var/spool
>SwitchType=switch/none
>TaskPlugin=task/none
>
>InactiveLimit=0
>KillWait=30
>MinJobAge=300
>SlurmctldTimeout=120
>SlurmdTimeout=300
>Waittime=0
>#
>#
># SCHEDULING
>FastSchedule=1
>SchedulerType=sched/backfill
>SchedulerParameters=bf_interval=20,bf_resolution=10
>SchedulerPort=7321
>##### Round robin select for nodes
>#SelectType=select/cons_res
>#SelectTypeParameters=CR_LLN
>#
>#
># JOB PRIORITY
>PriorityType=priority/multifactor
>PriorityWeightPartition=1000
>############
>#
>#Preemption
>#PreemptMode=REQUEUE
>#PreemptType=preempt/partition_prio
>#
># LOGGING AND ACCOUNTING
>AccountingStorageType=accounting_storage/none
>
>ClusterName=cluster
>DebugFlags=Backfill
>JobCompType=jobcomp/none
>JobAcctGatherType=jobacct_gather/none
>SlurmctldDebug=6
>SlurmdDebug=1
>#
># COMPUTE NODES
>NodeName=OGSE[1-7] CPUs=12 State=UNKNOWN
>PartitionName=prod.q Nodes=OGSE[1-7] Default=YES MaxTime="01:00:00" State=UP Priority=10
>PartitionName=urgent.q Nodes=OGSE[1-7] Default=NO MaxTime="01:00:00" State=UP Priority=20
-- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
