Dear Slurm Developers and Users,

I would like to constrain an 8-CPU job to run within one socket of 16 CPUs, with one task per core. Unfortunately, when using this script:

---
sbatch -J $JOB -N 1 -B '1:8:1' --ntasks-per-socket=8 --ntasks-per-core=1 << eof
...
mpirun -np 8 nwchem_64to32 $JOB.nwc >& $JOB.out
...
eof
---

the top command on the compute node shows two tasks running on the same core:

---
$ top
11838 11846 51 edrisse 20 0 12.3g 9452 95m R 46.7 0.0 0:01.43 nwchem_64to32
11838 11845 59 edrisse 20 0 12.3g 9600 96m R 46.4 0.0 0:01.42 nwchem_64to32
11838 11844 47 edrisse 20 0 12.3g 9592 95m R 46.4 0.0 0:01.42 nwchem_64to32
11838 11843 43 edrisse 20 0 12.3g 9844 96m R 46.4 0.0 0:01.42 nwchem_64to32
11838 11842 3 edrisse 20 0 12.3g 9.8m 96m R 46.4 0.0 0:01.43 nwchem_64to32
11838 11841 35 edrisse 20 0 12.3g 9.8m 92m R 45.7 0.0 0:01.41 nwchem_64to32
11838 11840 39 edrisse 20 0 12.3g 10m 96m R 46.1 0.0 0:01.42 nwchem_64to32
11838 11839 55 edrisse 20 0 12.3g 10m 109m R 46.4 0.0 0:01.42 nwchem_64to32
---

Unfortunately, CPU 55 and CPU 51 belong to the same core in our node's architecture (see NUMA node7):

---
$ lscpu
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 4
NUMA node(s): 8
...
NUMA node0 CPU(s): 0,4,8,12,16,20,24,28
...
NUMA node7 CPU(s): 3,7,11,15,19,23,27,31
---

I have perhaps missed something; if you could guide me to the right option, it would be great. I have also attached my slurm.conf file.
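If it is useful, the hyperthread pairing can be cross-checked on the compute node through the standard Linux sysfs topology files (a sketch; cpu51 and cpu55 are simply the logical CPU numbers reported by top above):

---
# each file lists the hyperthread siblings of the given logical CPU
$ cat /sys/devices/system/cpu/cpu51/topology/thread_siblings_list
$ cat /sys/devices/system/cpu/cpu55/topology/thread_siblings_list
# core and socket membership can be confirmed the same way
$ cat /sys/devices/system/cpu/cpu51/topology/core_id
$ cat /sys/devices/system/cpu/cpu51/topology/physical_package_id
---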
Best Regards,
Edrisse

--
Edrisse Chermak
Post-Doctoral Fellow
Catalysis center - KAUST, Thuwal, Saudi Arabia
kcc.kaust.edu.sa
#
# Example slurm.conf file. Please run configurator.html
# (in doc/html) to build a configuration file customized
# for your environment.
#
#
# slurm.conf file generated by configurator.html.
#
# See the slurm.conf man page for more information.
#
ClusterName=kcc
ControlMachine=node3
ControlAddr=10.62.222.3
#BackupController=
#BackupAddr=
#
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
CryptoType=crypto/munge
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
StateSaveLocation=/var/spool
SlurmdSpoolDir=/var/spool/slurmd
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/pgid
#PluginDir=
CacheGroups=0
#FirstJobId=
ReturnToService=0
#MaxJobCount=
#PlugStackConfig=
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#Prolog=
#Epilog=
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
TaskPlugin=task/affinity
#TrackWCKey=no
#TreeWidth=50
#TmpFS=
#UsePAM=
#
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
SchedulerPort=7321
#SchedulerAuth=
#SchedulerPort=
#SchedulerRootFilter=
SelectType=select/cons_res
SelectTypeParameters=CR_Socket
FastSchedule=1
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
#PriorityWeightFairshare=100000
#PriorityWeightAge=1000
#PriorityWeightPartition=10000
#PriorityWeightJobSize=1000
#PriorityMaxAge=1-0
#
# LOGGING
SlurmctldDebug=3
#SlurmctldLogFile=
SlurmdDebug=3
#SlurmdLogFile=
JobCompType=jobcomp/filetxt
#JobCompLoc=
#
# ACCOUNTING
JobAcctGatherType=jobacct_gather/none
JobAcctGatherFrequency=30
#
AccountingStorageType=accounting_storage/filetxt
AccountingStoreJobComment=YES
#AccountingStorageHost=
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStorageUser=
#
# COMPUTE NODES
NodeName=c2bay2 NodeAddr=192.168.10.142 Sockets=4 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
PartitionName=debug Nodes=c2bay2 Default=YES MaxTime=INFINITE State=UP
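(For reference, a sketch of standard Slurm commands that can be used to cross-check the node layout seen by the daemon and the controller, independent of the slurm.conf above; output omitted here:)

---
# on the compute node: print the hardware configuration as slurmd detects it
$ slurmd -C
# on any node: show the topology the controller has recorded for c2bay2
$ scontrol show node c2bay2
---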
