Hello,
In my cluster, every node has one instance of a GRES called 'io_nic'. Its
purpose is to make it easy for users to ensure that jobs performing heavy
network I/O are not scheduled simultaneously on the same machine.
$ sinfo -N -o '%N %Gres'
NODELIST GRESres
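For context, a GRES like this is typically declared in two places; the node names and counts below are illustrative assumptions, not taken from the thread:

```
# slurm.conf (controller) -- node names and GPU count are assumptions
GresTypes=gpu,io_nic
NodeName=node[01-10] Gres=gpu:4,io_nic:1

# gres.conf (on each node) -- io_nic maps to no device file,
# so a bare Count is enough
Name=io_nic Count=1
```

A job that will hammer the network would then request it with something like `sbatch --gres=io_nic:1 job.sh`, so only one such job lands per node.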
I have a user who submitted an interactive srun job using:
srun --mem-per-gpu 64 --gpus 1 --nodes 1
From sacct for this job we see:
ReqTRES : billing=4,cpu=1,gres/gpu=1,mem=10G,node=1
AllocTRES : billing=4,cpu=1,gres/gpu=1,mem=64M,node=1
(where 10G I assume comes from
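One likely explanation for the 64M allocation: Slurm memory options default to megabytes when no unit suffix is given, so `--mem-per-gpu 64` requests 64 MB. Assuming the user actually wanted 64 GB, the invocation would be:

```
# Append a unit suffix; without one, memory values are read as MB
srun --mem-per-gpu=64G --gpus=1 --nodes=1 --pty bash
```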
Hi, Xand:
How does adding "ReqMem" to the sacct format string change the output?
E.g. on my cluster running Slurm 20.02.7 (on RHEL8), our GPU nodes have
TRESBillingWeights=CPU=0,Mem=0,GRES/gpu=43:
$ sacct --format=JobID%25,State,AllocTRES%50,ReqTRES,ReqMem,ReqCPUS|grep RUNNING
JobID
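As a sanity check on how those weights turn into the billing TRES: by default Slurm computes a plain weighted sum over the allocated TRES (PriorityFlags=MAX_TRES changes this). A small sketch, using the weights from the line above:

```python
# Hedged sketch of Slurm's default billing computation: a weighted sum
# of allocated TRES counts. Weight/allocation values mirror the thread's
# example; they are illustrative, not a definitive implementation.

def billing_tres(weights, alloc):
    """Sum of weight * allocated count over all TRES names."""
    return sum(weights.get(name, 0.0) * count for name, count in alloc.items())

# TRESBillingWeights=CPU=0,Mem=0,GRES/gpu=43
weights = {"CPU": 0.0, "Mem": 0.0, "GRES/gpu": 43.0}
alloc = {"CPU": 1, "Mem": 64, "GRES/gpu": 1}   # a one-GPU job

print(billing_tres(weights, alloc))  # 43.0
```

With CPU and memory weighted at zero, only the GPU count contributes, so each GPU billed is worth 43.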
Hi Sushil,
Try changing NodeName specification to:
NodeName=localhost CPUs=96 State=UNKNOWN Gres=gpu:8
Also:
TaskPlugin=task/cgroup
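A minimal gres.conf to go with that NodeName line might look like the following; the device paths and the cgroup setting are assumptions to be checked against the actual node:

```
# gres.conf -- device paths assume 8 NVIDIA GPUs on this host
Name=gpu File=/dev/nvidia[0-7]

# cgroup.conf -- needed for task/cgroup to actually fence jobs
# to their assigned GPU devices
ConstrainDevices=yes
```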
Best,
Steve
On Wed, Apr 6, 2022 at 9:56 AM Sushil Mishra
wrote:
> Dear SLURM users,
>
> I am very new to Slurm and need some help in configuring slurm in
Hello,
try to comment out the line:
AutoDetect=nvml
And then restart "slurmd" and "slurmctld".
Job allocations landing on the same GPU might be an effect of automatic MPS
configuration, though I'm not 100% sure:
https://slurm.schedmd.com/gres.html#MPS_Management
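On a systemd-managed cluster, that restart would typically be (assuming the standard unit names):

```
sudo systemctl restart slurmd       # on each compute node
sudo systemctl restart slurmctld    # on the controller
```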
Kind Regards
--
Kamil
Thanks, Greg! This looks like the right way to do this. I will have to stop
putting off learning to use spank plugins :)
griznog
On Wed, Apr 6, 2022 at 1:40 AM Greg Wickham
wrote:
Hi John, Mark,
We use a spank plugin
https://gitlab.com/greg.wickham/slurm-spank-private-tmpdir (this was derived
from other authors but modified for functionality required on site).
It can bind tmpfs mount points into the user's cgroup allocation; additionally,
bind options can be provided (i.e.:
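For anyone following along, a SPANK plugin of this kind is enabled through plugstack.conf; the library path and option names below are illustrative assumptions, so check the plugin's README for the real ones:

```
# /etc/slurm/plugstack.conf -- path and options are hypothetical
required /usr/lib64/slurm/private-tmpdir.so base=/tmp mount=/tmp mount=/dev/shm
```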