-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/01/14 11:03, Christopher Samuel wrote:

> Here's the data we have with all numbers normalised to hours:
> 
> GrpCPUMins 71200
> CPURunMins 58348
> Raw Usage 12967
> 
> So that means GrpCPUMins-CPURunMins-RawUsage = -115 hours.

After a bit more digging, it appears our issue is that the
description of CPURunMins in the sshare manual page is misleading.

It says:

# The number of CPU-minutes accumulated by jobs currently running
# against the account.

To me that says it should increase as time goes on and jobs accumulate
more run time.

However, the comment in the source code says:

uint64_t cpu_run_mins; /* how many cpu mins are allocated
* currently */

which is a very different thing: it includes the time reserved for the
entire length of each running job (which makes good sense).

So we were bundling up the reserved time and the used time as if they
were the same thing and then subtracting them from the user's quota.

Now that we know this, we can revise our quota tool, which presents
sshare information in a usable form for users, to display the quota,
reserved and used amounts separately.
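
As a rough sketch of what such a report could compute (this is not our
actual tool; the function names are made up for illustration, nothing
here parses real sshare output, and the inputs are simply the three
figures quoted above, in hours):

```python
def quota_breakdown(grp_cpu_mins, cpu_run_mins, raw_usage):
    """Split a project's quota into its separate parts.

    grp_cpu_mins -- the quota (GrpCPUMins)
    cpu_run_mins -- time reserved by currently running jobs (CPURunMins)
    raw_usage    -- time already used (Raw Usage)
    All three must be in the same unit.
    """
    headroom = grp_cpu_mins - cpu_run_mins - raw_usage
    return {
        "quota": grp_cpu_mins,
        "reserved": cpu_run_mins,
        "used": raw_usage,
        "headroom": headroom,
    }


def format_breakdown(breakdown, unit="hours"):
    """Render the breakdown as one line per field."""
    lines = []
    for key in ("quota", "reserved", "used", "headroom"):
        lines.append("%-9s %8d %s" % (key, breakdown[key], unit))
    return "\n".join(lines)


if __name__ == "__main__":
    # Figures from the quoted message, already normalised to hours.
    print(format_breakdown(quota_breakdown(71200, 58348, 12967)))
```

With those figures the headroom comes out negative, which matches
queued jobs being held even though the used amount alone is well
under the quota.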

The end effect is the same (queued jobs can't run) but the users at
least can understand why and then go and nag their project leader
about why one user has grabbed all their quota. :-)

I'll open a bug regarding the description of CPURunMins.

All the best!
Chris
- -- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlLMxRgACgkQO2KABBYQAh+S6QCfX+dN5jdgY9ofmc57srJa9xD6
KAcAn0cm7N8uUwEtIIVX4dFQ/BjF7deM
=PG1q
-----END PGP SIGNATURE-----
