Hi Peter,
A Ganglia plugin would be interesting. How do Ganglia clients on different
nodes communicate? Typically they talk only to the central node, not to
each other. But to determine that they are part of the same job, wouldn't
they somehow need to talk to each other?
Thank you,
Igor
On Sun, Sep
Gah, yes. sstat, not sinfo.
--
The most dangerous phrase in the language is, "We've always done it this
way."
- Grace Hopper
On 19 September 2016 at 13:00, Peter A Ruprecht wrote:
> Igor,
>
> Would sstat give you what you need?
Igor,
Would sstat give you what you need? (http://slurm.schedmd.com/sstat.html) It
doesn't update instantaneously but at least a few times a minute.
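For example (a sketch, untested against a live cluster; the field names come from sstat's format options, and the sample output line below is invented for illustration): you can pull per-step usage with something like `sstat -a -j <jobid> --parsable2 --format=JobID,AveCPU,MaxRSS,MaxRSSNode` and parse the pipe-delimited result:

```python
def parse_sstat(output):
    """Turn sstat --parsable2 output (pipe-delimited, header first)
    into a list of dicts, one per job step."""
    lines = output.strip().splitlines()
    header = lines[0].split("|")
    return [dict(zip(header, line.split("|"))) for line in lines[1:]]

# Invented sample of what sstat might print for one step:
sample = "JobID|AveCPU|MaxRSS|MaxRSSNode\n12345.0|00:01:23|1024K|node01"
rows = parse_sstat(sample)
print(rows[0]["MaxRSSNode"])  # node01
```

In a real script you would feed it the stdout of a `subprocess.run` call to sstat instead of the hard-coded sample.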
If you want to get fancy, I believe that xdmod can integrate with TACC-stats to
provide graphs about what is happening inside a job, but I'm not familiar
with the details.
Also, if you have slurm installed on a deb based distro, you can try this
https://github.com/edf-hpc/slurm-web
I tried to get it running on an RPM-based distro (CentOS), but it is too
tightly coupled to deb packaging for me to port.
cheers
L.
I think you need a couple of things going on:
1. you have to have some sort of accounting organised and set up
2. your sbatch scripts need to launch the work with srun, not just run the program directly
3. sinfo should then work on the job number.
When I asked, that was the response iirc.
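To illustrate point 2, a job-script fragment might look like this (script contents are placeholders):

```
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --nodes=2

# Launch the work as a job step via srun so accounting can track it
# per step; a bare "./my_program" here would not show up that way.
srun ./my_program
```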
cheers
L.
Hi All,
I'd like to be able to see, for a given jobid, how many resources a job is
currently using on each node it is running on. Is there a way to do that?
So far it looks like I have to script it: get the list of the involved
nodes using, for example, squeue or qstat, ssh to each node, and check the
processes there.
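For what it's worth, `squeue -j <jobid> -h -o %N` prints the node list in Slurm's compressed form (e.g. `node[01-03]`), and `scontrol show hostnames` will expand it for you before the ssh loop. If you'd rather expand it in a script, here is a minimal sketch (the helper name is mine, and it handles only a single bracket group; real hostlists can be more complex):

```python
import re

def expand_hostlist(hostlist):
    """Expand a simple Slurm hostlist like 'node[01-03,07]' into
    ['node01', 'node02', 'node03', 'node07'].  Only one bracket
    group is supported."""
    m = re.fullmatch(r"([^\[]+)\[([^\]]+)\]", hostlist)
    if not m:
        return [hostlist]  # plain single host, nothing to expand
    prefix, body = m.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            width = len(lo)  # preserve zero-padding, e.g. '01'
            hosts += [f"{prefix}{i:0{width}d}"
                      for i in range(int(lo), int(hi) + 1)]
        else:
            hosts.append(prefix + part)
    return hosts

print(expand_hostlist("node[01-03,07]"))
# ['node01', 'node02', 'node03', 'node07']
```

You could then ssh to each returned host and run ps/top there.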
On 18/09/16 03:45, John DeSantis wrote:
> Try adding a "DefMemPerCPU" statement in your partition definitions, e.g.
You can also set that globally.
# Global default for jobs - request 2GB per core wanted.
DefMemPerCPU=2048
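For illustration, the per-partition form John mentioned would look something like this in slurm.conf (partition name and node range invented):

```
# Per-partition default; overrides the global DefMemPerCPU for this partition
PartitionName=compute Nodes=node[01-10] DefMemPerCPU=4096 State=UP
```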
All the best,
Chris
--
Christopher Samuel    Senior Systems