Gah, yes. sstat, not sinfo.

The most dangerous phrase in the language is, "We've always done it this way."

- Grace Hopper

On 19 September 2016 at 13:00, Peter A Ruprecht <
> wrote:

> Igor,
> Would sstat give you what you need?
> It doesn't update instantaneously, but it does update at least a few times a minute.
> If you want to get fancy, I believe that xdmod can integrate with
> TACC-stats to provide graphs of what is happening inside a job, but I'm
> not sure whether that updates in "real" time.
> One of our summer interns created a custom ganglia interface that checked
> what nodes a job was running on and graphed several relevant variables
> selected from the ganglia RRD files for those nodes.  If you're interested
> in seeing that work, I can look into whether we can share it.
> So there are some existing ways of going about this.
> Pete
> From: Igor Yakushin <>
> Reply-To: slurm-dev <>
> Date: Sunday, September 18, 2016 at 6:42 PM
> To: slurm-dev <>
> Subject: [slurm-dev] how to monitor CPU/RAM usage on each node of a slurm
> job? python API?
> Hi All,
> I'd like to be able to see, for a given jobid, how many resources the
> job is currently using on each node it runs on. Is there a way to do
> that?
> So far it looks like I have to script it: get the list of involved
> nodes (using, for example, squeue or qstat), ssh to each node, and find
> all the user's processes. Even then it isn't guaranteed that those
> processes belong to the job I'm interested in: is there a way to find
> the UNIX pids corresponding to a Slurm jobid?
> Another question: is there a Python API for Slurm? I found pyslurm,
> but so far it will not build against my version of Slurm.
> Thank you,
> Igor
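
Igor's PID question has an answer that avoids guessing from `ps` output: `scontrol listpids <jobid>`, run on a node the job occupies, reports the local UNIX PIDs belonging to that job's steps. A minimal sketch of using it from Python follows; the exact column layout (PID JOBID STEPID LOCALID GLOBALID) is an assumption here, so check it against your Slurm version:

```python
# Sketch: map a Slurm jobid to local UNIX PIDs via `scontrol listpids`.
# The assumed column order is PID JOBID STEPID LOCALID GLOBALID; verify
# against the output of your Slurm installation.
import subprocess


def pids_for_job(jobid, listpids_output=None):
    """Return the UNIX PIDs that `scontrol listpids` reports for jobid.

    If listpids_output is None, run `scontrol listpids <jobid>` on this
    node (requires Slurm); otherwise parse the supplied text, which makes
    the function testable off-cluster.
    """
    if listpids_output is None:
        listpids_output = subprocess.check_output(
            ["scontrol", "listpids", str(jobid)], text=True)
    pids = []
    for line in listpids_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Keep rows whose JOBID column matches the job we asked about.
        if len(fields) >= 2 and fields[1] == str(jobid):
            pids.append(int(fields[0]))
    return pids
```

With the PIDs in hand, per-node CPU and RSS figures can then be read from `/proc/<pid>/stat` and `/proc/<pid>/status` without any ambiguity about which user processes belong to the job.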
