[slurm-dev] Re: how to monitor CPU/RAM usage on each node of a slurm job? python API?

Lachlan Musicman Sun, 18 Sep 2016 19:10:01 -0700

I think you need a couple of things going on:

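1. you need some sort of accounting set up, so that usage data is actually
   collected (e.g. JobAcctGatherType in slurm.conf)
2. your sbatch scripts need to launch the work with "srun <command>", not
   just "<command>", so that it runs as a job step
3. sstat should then work on the job number, e.g. "sstat -j <jobid>".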


When I asked, that was the response iirc.
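
If pyslurm won't build, the three points above can be sketched in plain
Python by shelling out to sstat. This is only a rough sketch, not a tested
recipe: the function names and the field list are my own choices for
illustration, and it assumes accounting is configured and the job's work was
started with srun. Note that sstat reports per job step; MaxRSSNode only
names the node where the peak RSS was seen, so it is not a full per-node
breakdown.

```python
import subprocess

# Fields requested from sstat; this particular selection is an assumption
# made for illustration, adjust it to what your site needs.
FIELDS = ["JobID", "MaxRSS", "MaxRSSNode", "AveCPU", "NTasks"]


def build_sstat_command(job_id, fields=FIELDS):
    """Return the sstat command line for all steps of a running job."""
    return [
        "sstat",
        "--allsteps",    # report every step of the job
        "--parsable2",   # '|'-delimited output without a trailing '|'
        "--noheader",    # skip the header line
        "-j", str(job_id),
        "--format=" + ",".join(fields),
    ]


def parse_sstat_output(text, fields=FIELDS):
    """Turn '|'-delimited sstat output into one dict per job step."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        rows.append(dict(zip(fields, line.split("|"))))
    return rows


def job_step_usage(job_id):
    """Run sstat for job_id and return the parsed rows (needs Slurm)."""
    result = subprocess.run(
        build_sstat_command(job_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sstat_output(result.stdout)
```

On a machine with Slurm, job_step_usage(<jobid>) returns a list with one
dict per step; parse_sstat_output can be tried on its own without a cluster.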


cheers
L.

------
The most dangerous phrase in the language is, "We've always done it this
way."

- Grace Hopper

On 19 September 2016 at 10:41, Igor Yakushin <igor.2...@gmail.com> wrote:

> Hi All,
>
> I'd like to be able to see for a given jobid how much resources are used
> by a job on each node it is running on at this moment. Is there a way to do
> it?
>
> So far it looks like I have to script it: get the list of the involved
> nodes using, for example, squeue or qstat, ssh to each node and find all
> the user processes (not 100% guaranteed that they would be from the job I
> am interested in: is there a way to find UNIX pids corresponding to Slurm
> jobid?).
>
> Another question: is there python API to slurm? I found pyslurm but so far
> it would not build with my version of Slurm.
>
> Thank you,
> Igor
>
>
