The current logic requires job steps to explicitly request the generic
resources (GRES, e.g. GPUs) to be allocated. This decision was based
upon users commonly running many job steps within a job allocation and
using different resources for each job step. If a job step inherits
all of the job's GRES by default, that would require job steps to
explicitly request no GRES if desired
(e.g. "srun --gres=gpu:0 ..."). This may not be the best design for
all users, but it is what exists today.
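A sketch of that pattern (illustrative only; it requires a Slurm installation with GPUs defined in gres.conf, and the application names are placeholders):

```shell
# Inside a job allocation that holds one GPU, each step states its own need.
# A step that should see the GPU requests it explicitly:
srun --gres=gpu:1 ./gpu_app

# Under the current design, a step that needs no GPU must also say so
# explicitly, otherwise it is not assigned any GRES:
srun --gres=gpu:0 hostname
```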
Moe
Quoting Carles Fenoy <[email protected]>:
Hi all,
We are considering using cgroups in a new GPU cluster, and I would like
to know the current status of the devices part of the cgroup plugin.
We have also observed that, within a job requesting GRES, tasks of steps
that do not explicitly request generic resources are not assigned any.
Example:
A job requests one GPU and two tasks with:
sbatch --gres=gpu:1 --ntasks=2 --cpus-per-task=2 --wrap="env; srun env | grep CUDA"
The first env shows:
CUDA_VISIBLE_DEVICES=0
although "srun env" shows:
CUDA_VISIBLE_DEVICES=NoDevFiles
CUDA_VISIBLE_DEVICES=NoDevFiles
Is this the expected behavior?
Maybe if a job requests GRES and its steps don't, slurmstepd should not
overwrite the job environment in:
gres_gpu.c(211):
	} else {
		/* The gres.conf file must identify specific device files
		 * in order to set the CUDA_VISIBLE_DEVICES env var */
		env_array_overwrite(job_env_ptr, "CUDA_VISIBLE_DEVICES",
				    "NoDevFiles");
	}
--
Carles Fenoy