Since updating from 20.02.3 or 20.02.6 to 20.11.2, and not before, we have
been seeing the following errors for every job in the slurmd.log files.
Nothing else changed besides the Slurm version; the node configurations are
the same as before.

[2020-12-30T07:56:40.692] [9540590.batch] error: xcpuinfo_abs_to_mac: failed
[2020-12-30T07:56:40.692] [9540590.batch] error: task_cgroup_cpuset_create: unable to build job physical cores

The node configuration is as follows:
NodeName=gcp13040[1-2] CPUs=256 RealMemory=512000 Weight=5120000 Feature="AuthenticAMD,amd"

# slurmd -C
NodeName=gcp130401 CPUs=256 Boards=1 SocketsPerBoard=2 CoresPerSocket=64 ThreadsPerCore=2 RealMemory=515689

This configuration is intentional; we schedule to the core.
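
For reference, the CPUs=256 in the node definition equals the number of
hardware threads that slurmd -C reports (2 sockets x 64 cores x 2 threads),
which corresponds to 128 physical cores. A minimal Python sketch of that
arithmetic follows; parse_node_line is a hypothetical helper written for this
message, not part of Slurm, and the sketch only compares the two lines quoted
above without claiming anything about the cause of the error.

```python
import re

def parse_node_line(line):
    """Split a slurm.conf-style node line into a dict of KEY=VALUE pairs.

    Hypothetical helper for illustration only; not a Slurm API.
    """
    return dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))

# Node definition as written in slurm.conf (copied from this message).
conf = parse_node_line(
    'NodeName=gcp13040[1-2] CPUs=256 RealMemory=512000 Weight=5120000 '
    'Feature="AuthenticAMD,amd"'
)

# Hardware layout as reported by `slurmd -C` on the node.
detected = parse_node_line(
    "NodeName=gcp130401 CPUs=256 Boards=1 SocketsPerBoard=2 "
    "CoresPerSocket=64 ThreadsPerCore=2 RealMemory=515689"
)

# Logical CPUs (hardware threads) implied by the detected topology.
logical_cpus = (
    int(detected["Boards"])
    * int(detected["SocketsPerBoard"])
    * int(detected["CoresPerSocket"])
    * int(detected["ThreadsPerCore"])
)

# Physical cores are the logical CPUs divided by threads per core.
physical_cores = logical_cpus // int(detected["ThreadsPerCore"])

print("configured CPUs:", conf["CPUs"])
print("detected logical CPUs:", logical_cpus)
print("detected physical cores:", physical_cores)
```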

Regards,
Jenny Williams
