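Look at the sshare and sprio tools: sshare reports the shares and usage
of each account and user, and sprio breaks a pending job's priority
down into its individual factors.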
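
As a rough check on the numbers quoted below, here is a minimal Python
sketch. It assumes the classic multifactor formula F = 2^(-U/S) from the
Slurm documentation, that the four non-root accounts are siblings
sharing 103 shares between them, and that the effective usage of
hepusers is 1.0 because it accounts for all of the recorded usage. The
function and variable names are made up for illustration.

```python
# Sketch of the classic multifactor fair-share formula, F = 2 ** (-U / S).
# Assumptions (not confirmed by the thread): root is not counted as a
# sibling account, and usage decay is ignored.

# Raw shares per account, taken from the table in the quoted message.
shares = {"grid": 1, "grid-ops": 1, "hepusers": 100, "kbfiusers": 1}


def normalized_share(account, shares):
    """Fraction of the sibling accounts' total shares held by `account`."""
    return shares[account] / sum(shares.values())


def fairshare_factor(effective_usage, norm_share):
    """Classic fair-share factor: 1.0 with no usage, 0.5 when usage equals share."""
    return 2.0 ** (-effective_usage / norm_share)


# hepusers accounts for all recorded CPU minutes (7801048 of 7801048).
hep_usage = 7801048 / 7801048
hep_share = normalized_share("hepusers", shares)

factor = fairshare_factor(hep_usage, hep_share)
print(f"hepusers fair-share factor: {factor:.7f}")
```

Under those assumptions the factor for hepusers comes out at about
0.4897, which is the value sprio shows for both mario and andres. That
would be consistent with users set to Fairshare=parent inheriting the
account's factor instead of getting one of their own.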

Quoting Mario Kadastik <[email protected]>:

> Hi,
>
> is there some decent way to get the current state of multifactor fairshare?
> Something akin to maui's diagnose -f output that shows groups  
> (accounts for slurm) and users with their fairshare target as well  
> as their historic usage over the past N days. This would seriously  
> help understand how the fairshare is computed based on the actual  
> usage statistics and current cluster state.
>
> For example, we have all user fairshares set to parent, and for the accounts:
>
>    Account     Share
> ---------- ---------
>       root         1
>       grid         1
>   grid-ops         1
>   hepusers       100
>  kbfiusers         1
>
> Now let's assume one of the users in hepusers spends the past N days
> computing with the full cluster, and then another user submits a
> number of jobs. It would be logical to assume that, since there is no
> distinction between the users in an account, the newcomer's priority
> would be higher, as (s)he hasn't had any allocated time.
>
> [root@slurm-1 slurm]# sreport cluster accountutilizationbyuser start=2013-01-08
> --------------------------------------------------------------------------------
> Cluster/Account/User Utilization 2013-01-08T00:00:00 - 2013-01-21T23:59:59 (1209600 secs)
> Time reported in CPU Minutes
> --------------------------------------------------------------------------------
>   Cluster         Account     Login     Proper Name       Used
> --------- --------------- --------- --------------- ----------
> t2estonia            root                              7801048
> t2estonia            grid                                    0
> t2estonia            grid    cms134 mapped user fo+          0
> t2estonia            grid sgmcms000 mapped user fo+          0
> t2estonia        hepusers                              7801048
> t2estonia        hepusers    andres     Andres Tiko      85048
> t2estonia        hepusers     mario  Mario Kadastik    7716000
>
> So according to this, Mario (me) has computed a huge amount of time
> in comparison to andres. However, if I look at the priorities from
> sprio -nl I see this:
>
> [root@slurm-1 slurm]# sprio -nl|head -3
>   JOBID     USER PRIORITY   AGE        FAIRSHARE  JOBSIZE    PARTITION  QOS
>   53498    mario 0.00003497 0.2404977  0.4897101  0.9919238  1.0000000  0.0000000
>   53499    mario 0.00003497 0.2404977  0.4897101  0.9919238  1.0000000  0.0000000
> [root@slurm-1 slurm]# sprio -nl|grep andres|head -1
>   53835   andres 0.00003497 0.2396412  0.4897101  0.9919238  1.0000000  0.0000000
>
> So in fact the fairshare factor is identical for both users, even
> though one has been getting a lot of the resource while the other
> has not.
>
> or do I misunderstand the =parent part?  I tried also setting all  
> users shares to 1 and have no clue how long it will take for sprio  
> to recompute this, but right now it's showing the same priorities.
>
> That's one of the reasons why I'd like to be able to see how the  
> actual usage and decay over time affect the factor so that I can  
> better understand the algorithm and tune the weights.
>
> Thanks,
>
> Mario Kadastik, PhD
> Researcher
>
> ---
>   "Physics is like sex, sure it may have practical reasons, but  
> that's not why we do it"
>      -- Richard P. Feynman
>
>
