We use this script that we cobbled together:
https://github.com/BYUHPC/slurm-random/blob/master/rjobstat. It assumes
that you're using cgroups. It uses ssh to connect to each node, so it's
not very scalable, but it works well enough for us.
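For anyone scripting something similar themselves: when Slurm's cgroup plugins are in use, each job's usage counters live under a per-job cgroup on every node, so reading them is just reading files. A minimal sketch, assuming the cgroup v1 layout `/sys/fs/cgroup/<controller>/slurm/uid_<uid>/job_<jobid>/` that Slurm's cgroup plugins commonly create; the exact paths depend on your cgroup setup and Slurm configuration, so check on your own nodes first:

```python
import os

# ASSUMPTION: cgroup v1 hierarchy as laid out by Slurm's cgroup plugins;
# adjust CGROUP_ROOT and the path components to match your site.
CGROUP_ROOT = "/sys/fs/cgroup"

def job_cgroup_path(controller, uid, jobid):
    """Build the per-job cgroup directory for one controller
    (e.g. 'memory' or 'cpuacct') under the assumed layout."""
    return os.path.join(CGROUP_ROOT, controller,
                        "slurm", "uid_%d" % uid, "job_%d" % jobid)

def read_counter(path):
    """Read a single-value cgroup counter file and return it as an int."""
    with open(path) as f:
        return int(f.read().strip())

def job_memory_bytes(uid, jobid):
    """Current memory usage of a job on this node, in bytes
    (memory.usage_in_bytes is the cgroup v1 counter name)."""
    return read_counter(os.path.join(
        job_cgroup_path("memory", uid, jobid), "memory.usage_in_bytes"))
```

Run per node (over ssh, as rjobstat does) and you get the live per-node picture; summing across nodes gives the whole job.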
On 09/18/2016 06:42 PM, Igor Yakushin wrote:
How do I monitor CPU/RAM usage on each node of a Slurm job? Is there a Python API?
I'd like to be able to see, for a given jobid, how many resources the
job is currently using on each node it is running on. Is there a way
to do that?
So far it looks like I have to script it: get the list of the involved
nodes (using, for example, squeue or qstat), ssh to each node, and find
all the user's processes there. Even then it's not 100% guaranteed that
those processes belong to the job I'm interested in: is there a way to
find the UNIX PIDs corresponding to a Slurm jobid?
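On the PID question: Slurm's `scontrol listpids <jobid>`, run on a compute node, lists the PIDs belonging to that job on the local node (it needs a proctrack plugin that tracks PIDs, such as the cgroup one). Combined with `squeue -o %N` and `scontrol show hostnames` to expand the node list, the scripted approach looks roughly like this sketch; the `parse_listpids` helper assumes `listpids` prints a whitespace-separated table with a header row and the PID in the first column, which you should verify on your Slurm version:

```python
import subprocess

def job_nodes(jobid):
    """Expand a job's node list into individual hostnames.
    squeue -o %N may print a compressed list (e.g. node[01-03]);
    scontrol show hostnames expands it, one hostname per line."""
    nodelist = subprocess.check_output(
        ["squeue", "-h", "-j", str(jobid), "-o", "%N"]).decode().strip()
    out = subprocess.check_output(["scontrol", "show", "hostnames", nodelist])
    return out.decode().split()

def parse_listpids(text):
    """Parse `scontrol listpids <jobid>` output into a list of PIDs.
    ASSUMPTION: header row first, PID in the first column."""
    pids = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields:
            pids.append(int(fields[0]))
    return pids

def job_pids_on_node(host, jobid):
    """Run scontrol listpids on one node over ssh and return its PIDs."""
    out = subprocess.check_output(
        ["ssh", host, "scontrol", "listpids", str(jobid)])
    return parse_listpids(out.decode())
```

If accounting is configured with a jobacct_gather plugin, `sstat -j <jobid> --format=AveCPU,AveRSS,MaxRSS` also reports live usage for running jobs without any ssh at all, though only aggregated per step rather than per node.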
Another question: is there a Python API for Slurm? I found pyslurm,
but so far it won't build against my version of Slurm.