Greetings --

The calculation of Fairshare has gone wrong for some users, and I've not
been able to find a handle in the documentation, or in the archive, on how
to audit that calculation. By audit, I mean review the conditional inputs
to the Fairshare calculation, and repeat that calculation.

Our cluster is running the multifactor priority plugin, but we have not
implemented Fair Tree yet.

My usual method for auditing is by hand:
1. Find fairshare for the user, in the context of their account:
   sshare -A <account> -a
2. Determine job priority details:
   sprio -j <jobid>

Today we had a situation where a user with fairshare near 1.0 had a very
small fairshare contribution to priority. Details are that:
PriorityWeightFairshare=1000
User fairshare factor (from sshare) ~ 0.9
User fairshare contribution to job priority (from sprio) = 3
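
The check described here comes down to simple arithmetic. The following is a minimal sketch, assuming that the fairshare term of the multifactor priority is the fairshare weight multiplied by the user's fairshare factor (a value between 0.0 and 1.0), and that the value 3 above is the weighted fairshare component reported by sprio. The function names, the rounding, and the 5% tolerance are illustrative choices, not anything taken from Slurm itself.

```python
def expected_fairshare_component(weight: int, factor: float) -> int:
    """Weighted fairshare term, assumed to be weight * factor."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("fairshare factor must lie in [0.0, 1.0]")
    # Rounding to an integer is an assumption made for comparison only.
    return round(weight * factor)


def implied_factor(weight: int, observed: int) -> float:
    """Fairshare factor that would produce the observed weighted term."""
    return observed / weight


def audit(weight: int, factor: float, observed: int, tolerance: float = 0.05):
    """Return (expected, consistent), comparing expected to observed.

    tolerance is a fraction of the weight; 0.05 is an arbitrary choice.
    """
    expected = expected_fairshare_component(weight, factor)
    consistent = abs(expected - observed) <= tolerance * weight
    return expected, consistent


# Values quoted in this message.
expected, consistent = audit(weight=1000, factor=0.9, observed=3)
print(f"expected={expected} observed=3 consistent={consistent}")
print(f"factor implied by observed value: {implied_factor(1000, 3)}")
```

Under these assumptions the expected contribution is 900, while the observed value of 3 corresponds to a factor of 0.003, which is the discrepancy in question.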

The user brought a lengthy wait in the queue to our attention, knowing
that their group's usage was not very high. My audit indicates that the
fairshare calculation failed. I performed the audit about an hour after
the user submitted the job.

Many thanks,
~ Em

----------------------------------
E.M. Dragowsky, Ph.D.
ITS -- Research Computing
Case Western Reserve University
(216) 368-0082
