Try setting RawShares to something greater than 1. I've seen it be the
case that when you set it to 1 it creates weirdness like this.
-Paul Edmon-
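For example, something along these lines with sacctmgr (the account name is taken from your output below; the value 100 is just an illustration, pick whatever ratio you want between accounts):

```shell
# Raise RawShares (the "fairshare" field) for one account, then re-check.
sacctmgr modify account name=covid set fairshare=100
sshare -l -A covid
```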
On 7/9/2020 1:12 PM, Dumont, Joey wrote:
Hi,
We recently set up fair tree scheduling (we have 19.05 running), and
are trying to use sshare to see usage information. Unfortunately,
sshare reports all zeros, even though there seems to be data in the
backend DB. Here's an example output:
$ sshare -l
Account              User  RawShares  NormShares  RawUsage  NormUsage  EffectvUsage  FairShare  LevelFS  GrpTRESMins  TRESRunMins
-------------------  ----  ---------  ----------  --------  ---------  ------------  ---------  -------  -----------  ------------------------------
root                                                     0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
covid                            1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 covid-01                        1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 covid-02                        1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
group1                           1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 subgroup1                       1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 othersubgroups                  1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 subgroups                       1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 subgroups                       4                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 subgroups                       1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 SUBGROUP                        1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
 SUBGROUP                        1                       0       0.00          0.00                                   cpu=0,mem=0,energy=0,node=0,b+
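For what it's worth, to confirm the backend DB really does hold usage, we queried it directly with sreport; a sketch (cluster name from our slurm.conf, date range adjusted as needed):

```shell
# Ask the accounting database directly for per-account/per-user usage.
sreport cluster AccountUtilizationByUser cluster=trixie \
    start=2020-07-01 end=2020-07-09 -t Hours
```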
And the slurm.conf config:
ClusterName=trixie
SlurmctldHost=trixie(10.10.0.11)
SlurmctldHost=hn2(10.10.0.12)
GresTypes=gpu
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
StateSaveLocation=/gpfs/share/slurm/
SlurmdSpoolDir=/var/spool/slurm/d
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/cgroup
ReturnToService=2
PrologFlags=x11
TaskPlugin=task/cgroup
# TIMERS
SlurmctldTimeout=60
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
FastSchedule=1
SchedulerParameters=bf_interval=60,bf_continue,bf_resolution=600,bf_window=2880,bf_max_job_test=5000,bf_max_job_part=1000,bf_max_job_user=10,bf_max_job_start=100
PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0
PriorityWeightFairshare=10
PriorityWeightAge=1000
PriorityWeightPartition=1
PriorityWeightJobSize=1000
PriorityMaxAge=1-0
# LOGGING
SlurmctldDebug=3