We too would love to have a way of accounting for GPU usage. The hope
was that the requested GRES resources would be recorded in the
accounting database, so that we could determine GPU usage externally.
While we keep our GPU nodes in a separate partition, the best we can
determine is that a job was a GPU job, not how many of the 1 to 4 GPUs
it actually used.
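For example, the best query available to us today is something like the
following (assuming the GPU partition is named "gpu"):

    sacct -a -X -r gpu --format=JobID,User,AllocCPUS,Elapsed

which tells us that a job ran in the GPU partition, but says nothing
about the number of GPUs it was allocated.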
Bill
On 06/27/2014 10:27 AM, Paul Edmon wrote:
Actually, a broader question would be how GPU usage is charged back to
fairshare. Do GPUs actually count? How much? This is an interesting
question.
-Paul Edmon-
On 06/27/2014 10:17 AM, Jens Svalgaard Kohrt wrote:
Hi,
We are trying to set up GPU accounting in a mixed environment with 12
CPU-only nodes and 12 nodes that each have 2 GPUs. All nodes have 20
CPU cores.
Jobs are submitted to a single partition containing all nodes and are
allocated as follows:
* if a GPU is needed: on the GPU nodes
* if no GPU is needed: on any node (but only on a GPU node if all CPU
nodes are in use)
Everything seems to work, apart from the fact that the GPUs are "free
to use" with respect to Slurm's fair-share accounting.
Is it somehow possible to set this up so that, accounting-wise, getting
a GPU corresponds to getting, e.g., 10 extra CPU cores?
Searching the web, I have only been able to find GPU accounting
mentioned as future work.
In an ideal world, it would be nice to be able to write a job
submit/completion script that, given information about the
requested/allocated
* # of CPU cores
* # of GPUs
* amount of memory
* QOS
* maximum/actual running time
calculates the cost of running the job and updates the SlurmDBD
database.
In my particular context, only something like this is needed:
cost_of_job = time_used * (total_cpus + 10*total_gpus)
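As a rough sketch of the calculation half (assuming a Slurm whose sacct
can report a ReqGRES field, which may not hold for current versions;
the 10x GPU weight comes from the formula above):

    #!/usr/bin/env python
    # Hypothetical post-completion cost script. ReqGRES is an assumed
    # sacct field; writing the result back into SlurmDBD is exactly
    # the part I do not know how to do.
    import subprocess
    import sys

    def job_cost(jobid, gpu_weight=10):
        # -X: allocation line only, -n: no header, -P: parsable output
        out = subprocess.check_output(
            ["sacct", "-j", str(jobid), "-X", "-n", "-P",
             "--format=ElapsedRaw,AllocCPUS,ReqGRES"]).decode()
        elapsed, cpus, gres = out.strip().split("|")
        # the GRES string looks like e.g. "gpu:2" when GPUs were used
        gpus = int(gres.rsplit(":", 1)[-1]) if gres.startswith("gpu") else 0
        # cost_of_job = time_used * (total_cpus + 10 * total_gpus)
        return int(elapsed) * (int(cpus) + gpu_weight * gpus)

    if __name__ == "__main__":
        print(job_cost(sys.argv[1]))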
Can somebody give a hint on how to do this (if possible)? If not,
could you maybe point me to where in the Slurm source code I should
start digging?
Thanks!
Jens