Hi all,

I have an Intel Knights Landing (KNL) server and have set up Slurm on it, but I have not been able to submit test jobs that use the high-bandwidth memory (MCDRAM). I need your help.

This is the command I run:

# srun --gres=hbm:1 numactl --membind=1 mpirun -np 24 osu_latency

and this is my slurmctld.log:

==================== slurmctld.log ====================
[2017-03-02T17:19:17.426] slurmctld version 17.02.0-0pre4 started on cluster cluster
[2017-03-02T17:19:17.430] AllowMCDRAM=cache,hybrid,flat,equal,auto AllowNUMA=a2a,snc2,hemi,quad
[2017-03-02T17:19:17.430] AllowUserBoot=ALL
[2017-03-02T17:19:17.430] DefaultMCDRAM=flat DefaultNUMA=quad
[2017-03-02T17:19:17.431] McPath=/sys/devices/system/edac/mc
[2017-03-02T17:19:17.431] SyscfgPath=/usr/bin/syscfg/syscfg
[2017-03-02T17:19:17.431] UmeCheckInterval=0
[2017-03-02T17:19:17.434] layouts: no layout to initialize
[2017-03-02T17:19:17.451] layouts: loading entities/relations information
[2017-03-02T17:19:17.451] Recovered state of 1 nodes
[2017-03-02T17:19:17.452] Recovered information about 0 jobs
[2017-03-02T17:19:17.452] gres/hbm: state for knl02
[2017-03-02T17:19:17.452] gres_cnt found:TBD configured:0 avail:0 alloc:0
[2017-03-02T17:19:17.452] gres_bit_alloc:
[2017-03-02T17:19:17.452] gres_used:(null)
[2017-03-02T17:19:17.453] Recovered state of 0 reservations
[2017-03-02T17:19:17.453] _preserve_plugins: backup_controller not specified
[2017-03-02T17:19:17.453] Running as primary controller
[2017-03-02T17:19:17.454] No parameter for mcs plugin, default values set
[2017-03-02T17:19:17.454] mcs: MCSParameters = (null). ondemand set.
[2017-03-02T17:19:20.013] _update_node_avail_features: nodes knl02 available features set to: a2a,hemi,quad,snc2,snc4,cache,flat,hybrid,auto,knl
[2017-03-02T17:19:20.017] _update_node_active_features: nodes knl02 active features set to: quad,flat
[2017-03-02T17:19:20.017] gres/hbm: state for knl02
[2017-03-02T17:19:20.018] gres_cnt found:17179869184 configured:17179869184 avail:17179869184 alloc:0
[2017-03-02T17:19:20.018] gres_bit_alloc:
[2017-03-02T17:19:20.018] gres_used:(null)
[2017-03-02T17:19:20.462] SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=0
[2017-03-02T17:24:21.535] gres: hbm state for job 171
[2017-03-02T17:24:21.535] gres_cnt:1 node_cnt:0 type:(null)
[2017-03-02T17:24:21.535] error: gres/hbm: node knl02 gres bitmap size bad (0 < 17179869184)

==================== slurm.conf ====================
# LOGGING AND ACCOUNTING
#AccountingStorageType=accounting_storage/none
AccountingStorageType=accounting_storage/filetxt
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
NodeFeaturesPlugins=knl_generic
DebugFlags=NodeFeatures,Gres
GresTypes=hbm
RebootProgram=/sbin/reboot
#Nodes
Nodename=knl02 Sockets=1 CoresPerSocket=68 ThreadsPerCore=4 RealMemory=95891 Feature=knl
PartitionName=hbm Default=YES MaxTime=INFINITE State=UP Nodes=knl02
#Auth
AuthType=auth/none

==================== knl_generic.conf ====================
# Sample knl_generic.conf
SyscfgPath=/usr/bin/syscfg/syscfg
DefaultNUMA=quad      # NUMA=all2all
AllowNUMA=quad,a2a,snc2,hemi
DefaultMCDRAM=flat    # MCDRAM=cache

==================== gres.conf ====================
# Configure support
Name=hbm File=/dev/shm/mcdram
====================================================

What is the problem? When I use the --gres=hbm option, the job runs normally, but it uses DDR memory rather than MCDRAM.

=====================================
Seungwoo Rho
National Institute of Supercomputing and Networking, KISTI,
52-11, Eoeundong, Yuseonggu, Daejeon, 305-806, Republic of Korea
e-mail : [email protected]
Phone : +82-42-869-1643
Mobile : +82-10-8849-4001
=====================================
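A note on the numbers in the log above, in case it helps anyone reading: the hbm count that slurmctld reports for knl02 is 17179869184. Assuming that count is in bytes (an assumption on my part, not something the log states), it works out to exactly 16 GiB, which would match the MCDRAM capacity of a KNL package, while my srun line asks for only hbm:1. The small Python sketch below just does that arithmetic on the log line; the helper names parse_gres_counts and to_gib are made up for illustration and are not part of Slurm.

```python
import re

# Line copied verbatim from the slurmctld.log excerpt in the post.
LOG_LINE = ("[2017-03-02T17:19:20.018] gres_cnt found:17179869184 "
            "configured:17179869184 avail:17179869184 alloc:0")

GIB = 1024 ** 3  # bytes per GiB


def parse_gres_counts(line):
    """Extract the found/configured/avail/alloc counters from a gres_cnt line.

    Hypothetical helper for illustration only; it matches just these four
    keys so that the colons inside the timestamp are ignored.
    """
    pairs = re.findall(r"\b(found|configured|avail|alloc):(\d+)", line)
    return {key: int(value) for key, value in pairs}


def to_gib(count):
    """Convert a count to GiB, ASSUMING the count is expressed in bytes."""
    return count / GIB


if __name__ == "__main__":
    counts = parse_gres_counts(LOG_LINE)
    print(counts)
    print(to_gib(counts["avail"]), "GiB available, if the unit is bytes")
```

If the byte interpretation is right, --gres=hbm:1 requests a single byte of hbm rather than the whole MCDRAM, so the request size may be worth checking in addition to the bitmap error.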
