Have you looked at sshare?
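
sshare -a -l lists every association with its raw and normalized shares, its raw and effective usage, and the resulting fairshare factor, which may be close to what maui's diagnose -f gives you.

As a rough cross-check of the numbers in your message, here is a minimal Python sketch. It rests on assumptions that are not confirmed in this thread: that the fairshare factor has the form F = 2^(-U/S), with U the normalized usage and S the normalized shares; that grid, grid-ops, hepusers and kbfiusers are siblings directly under root; that grid-ops and kbfiusers have no usage (root's total equals that of hepusers); and that users with Fairshare=parent simply inherit the factor of their account. All variable and function names are made up for illustration.

```python
# Sketch only: assumes the 2**(-U/S) fairshare form and that users with
# Fairshare=parent inherit the factor of their account.

# Raw shares of the accounts, assumed to be siblings directly under root.
shares = {"grid": 1, "grid-ops": 1, "hepusers": 100, "kbfiusers": 1}

# CPU minutes taken from the sreport output in the quoted message.
# grid-ops and kbfiusers are assumed to be 0 (root total == hepusers total).
account_usage = {"grid": 0, "grid-ops": 0, "hepusers": 7801048, "kbfiusers": 0}
user_usage = {"andres": 85048, "mario": 7716000}


def fairshare_factor(norm_usage, norm_shares):
    """Assumed form of the factor: F = 2 ** (-U / S)."""
    return 2.0 ** (-norm_usage / norm_shares)


total_shares = sum(shares.values())
total_usage = sum(account_usage.values())

norm_shares = shares["hepusers"] / total_shares        # 100 / 103
norm_usage = account_usage["hepusers"] / total_usage   # 1.0, all usage is hepusers

account_factor = fairshare_factor(norm_usage, norm_shares)

# With Fairshare=parent (as assumed above) every user in hepusers gets the
# account's factor, however the usage is split between them.
user_factors = {user: account_factor for user in user_usage}
user_fraction = {
    user: minutes / account_usage["hepusers"]
    for user, minutes in user_usage.items()
}

for user in sorted(user_usage):
    print(f"{user:8s} usage share {user_fraction[user]:.4f}  "
          f"factor {user_factors[user]:.7f}")
```

Under these assumptions the account-level factor comes out at about 0.4897, in line with the FAIRSHARE column sprio shows for both users, and the split of usage between mario and andres never enters the calculation. Note that the real plugin works on decayed usage rather than sreport totals; the result is the same here only because all usage belongs to hepusers.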

Phil Eckert
LLNL

From: Mario Kadastik <[email protected]>
Reply-To: slurm-dev <[email protected]>
Date: Tuesday, January 22, 2013 11:17 AM
To: slurm-dev <[email protected]>
Subject: [slurm-dev] fairshare usage

Hi,

Is there a decent way to get the current state of the multifactor fairshare? 
Something akin to maui's diagnose -f output, which shows groups (accounts in 
slurm) and users with their fairshare target as well as their historic usage 
over the past N days. This would seriously help me understand how the fairshare 
is computed from the actual usage statistics and current cluster state.

For example, we have all user fairshares set to parent, and the accounts have 
these shares:

   Account     Share
---------- ---------
      root         1
      grid         1
  grid-ops         1
  hepusers       100
 kbfiusers         1

Now let's assume one of the users in hepusers spends the past N days computing 
with the full cluster and then another user submits a number of jobs. As there 
is no distinction between the users in an account, it would be logical to 
assume that the newcomer's priority would be higher, as (s)he hasn't had any 
allocated time.

[root@slurm-1 slurm]# sreport cluster accountutilizationbyuser start=2013-01-08
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2013-01-08T00:00:00 - 2013-01-21T23:59:59 (1209600 secs)
Time reported in CPU Minutes
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name       Used
--------- --------------- --------- --------------- ----------
t2estonia            root                              7801048
t2estonia            grid                                    0
t2estonia            grid    cms134 mapped user fo+          0
t2estonia            grid sgmcms000 mapped user fo+          0
t2estonia        hepusers                              7801048
t2estonia        hepusers    andres     Andres Tiko      85048
t2estonia        hepusers     mario  Mario Kadastik    7716000

So according to this, Mario (me) has used a huge amount of CPU time in 
comparison to andres. However, if I look at the priorities from sprio -nl I see 
this:

[root@slurm-1 slurm]# sprio -nl|head -3
  JOBID     USER PRIORITY   AGE        FAIRSHARE  JOBSIZE    PARTITION  QOS
  53498    mario 0.00003497 0.2404977  0.4897101  0.9919238  1.0000000  0.0000000
  53499    mario 0.00003497 0.2404977  0.4897101  0.9919238  1.0000000  0.0000000
[root@slurm-1 slurm]# sprio -nl|grep andres|head -1
  53835   andres 0.00003497 0.2396412  0.4897101  0.9919238  1.0000000  0.0000000

So in fact the fairshare factor is identical for both users, even though one 
has been getting a lot of the resource while the other has not.

Or do I misunderstand the =parent part? I also tried setting all user shares 
to 1, but I have no clue how long it will take for sprio to recompute this; 
right now it's showing the same priorities.

That's one of the reasons why I'd like to be able to see how the actual usage 
and decay over time affect the factor so that I can better understand the 
algorithm and tune the weights.

Thanks,

Mario Kadastik, PhD
Researcher

---
  "Physics is like sex, sure it may have practical reasons, but that's not why 
we do it"
     -- Richard P. Feynman

