Hello everyone.
Last week I sent an email, but I didn't get any replies.
So I would like to share my problem again.
I have one KNL server and I want to use its HBM (high-bandwidth memory).
I set up the configuration files (gres.conf, knl_generic.conf, slurm.conf)
following the Slurm manual (the KNL document).
I have attached these files and my logs.
I think the following log output shows a problem:
[2017-03-03T18:00:27.633] gres_cnt found:17179869184 configured:17179869184 avail:17179869184 alloc:0
[2017-03-03T18:00:27.633] gres_bit_alloc:
[2017-03-03T18:00:27.633] gres_used:(null)
Is alloc:0 correct?
When I submit a test job, it is assigned only DDR memory, not HBM.
I am also confused about how to use the --gres option.
"srun --gres=hbm:0 test.sh" works (the job is assigned to DDR).
"srun --gres=hbm:1g test.sh" does not work and produces the following error log:
gres: hbm state for job 178
gres_cnt:1073741824 node_cnt:0 type:(null)
error: gres/hbm: node knl02 gres bitmap size bad (0 < 17179869184)
Is there a problem with my configuration? I can't find my mistake.
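For reference, here is a quick check of the numbers in the logs above. It is only a sketch: it assumes the gres counts are plain byte values and that the "g" suffix means 1024^3, and the helper name to_gib is just for illustration.

```python
# Convert the raw gres counts from the logs into GiB.
# Assumption: the counts are byte values and "g" means 1024**3.
GIB = 1024 ** 3


def to_gib(count: int) -> float:
    """Return a raw count expressed in GiB."""
    return count / GIB


configured = 17179869184  # "configured:" value in the gres_cnt log line
requested = 1073741824    # gres_cnt logged for "--gres=hbm:1g"

print(to_gib(configured))  # 16.0
print(to_gib(requested))   # 1.0
```

Under that assumption the node reports 16 GiB of hbm as configured, and the hbm:1g request corresponds to 1 GiB, so the request itself is smaller than what the node advertises.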
Yours sincerely,
=====================================
Seungwoo Rho
National Institute of Supercomputing and Networking,
KISTI,
52-11, Eoeundong, Yuseonggu,
Daejeon, 305-806, Republic of Korea
e-mail : [email protected]
Phone : +82-42-869-1643
Mobile : +82-10-8849-4001
=====================================
Attachments: knl_generic.conf, gres.conf, slurm.conf, slurmctld.log, slurmd.log
