Hi Moe,
That was exactly what we were after. (hmmm, I really should read through
the man-pages more carefully...)
I'll pass on the idea of adding an XML output option.
Many thanks!
Mark.
On 25/01/12 03:52, Moe Jette wrote:
Use the command "scontrol show job --detail". The output will contain a
line like this for each node allocated to each job:
Nodes=tux123 CPU_IDs=2-5 Mem=2048
While the data does exist, that's not going to be particularly simple to
parse and work with. There has been talk about adding an "--xml" option
for XML output from scontrol, but that has never been done. Since SLURM
is open source, you could modify scontrol to add an "--xml" option or
build a new tool for your particular application.
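As a rough illustration of the "build a new tool" route, here is a minimal Python sketch that pulls the per-node lines out of text in the format shown above. The helper names (expand_ids, parse_detail) are made up for this example, and it assumes CPU_IDs may be a comma-separated list of single IDs and ranges; the Nodes value is kept as a raw string and is not expanded if it is a host list expression.

```python
import re

# Matches lines of the form: Nodes=tux123 CPU_IDs=2-5 Mem=2048
DETAIL_LINE = re.compile(r"Nodes=(\S+)\s+CPU_IDs=(\S+)\s+Mem=(\d+)")


def expand_ids(spec):
    """Expand a value such as "2-5" or "0,2-3" into a list of ints.

    Comma-separated lists are an assumption; the example in this
    thread only shows a single range.
    """
    ids = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            ids.extend(range(int(low), int(high) + 1))
        else:
            ids.append(int(part))
    return ids


def parse_detail(text):
    """Return one dict per Nodes=/CPU_IDs=/Mem= line found in text."""
    allocations = []
    for match in DETAIL_LINE.finditer(text):
        allocations.append({
            "nodes": match.group(1),              # raw string, not expanded
            "cpu_ids": expand_ids(match.group(2)),
            "mem": int(match.group(3)),
        })
    return allocations


if __name__ == "__main__":
    sample = "   Nodes=tux123 CPU_IDs=2-5 Mem=2048\n"
    print(parse_detail(sample))
```

In practice the text would come from running "scontrol show job --detail" and capturing its output; the exact set of fields on that line may differ between SLURM versions, so the pattern may need adjusting.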
Moe Jette
SchedMD
Quoting Mark Nelson <mdnels...@gmail.com>:
Hi there,
My colleague came up with the question below about running jobs on a
normal x86 based cluster. Hopefully someone here can shed some light
on this.
When running SLURM on a multi-core/multi-socket cluster system, is
there any way of finding out which cores are allocated to a particular job?
Using "scontrol show job" I can find out which nodes are allocated and
a total number of cores, but have no way of knowing how these cores
might be distributed across the nodes. While the system seems to
allocate cores consecutively, across multiple jobs there is no way of
knowing which cores are assigned to which job. For example, in an
8-core multi-node system, if I ask for 3 cores across 2 nodes (salloc
-n 3 -N 2), how do I know whether 2 cores are allocated from the first
node and 1 core from the second, or vice versa? Also, as nodes are filled up
with other jobs, and jobs finish at different times, there is no way
of mapping jobs to particular cores. I've seen from other postings
that SLURM core numbering might not match the physical hardware core
numbering, but for my purposes this is not a problem, as long as the
numbering is consistent.
The reason I'm asking is that I'm trying to integrate SLURM
with PTP (Eclipse Parallel Tools Platform) system monitoring that
expects to map jobs to nodes and cores in a graphical interface.
Therefore for jobs on a multi-core cluster, I need to report on which
cores and nodes a particular job is running, in a specified XML format.
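To make the target concrete, here is a small Python sketch of turning a job-to-node-to-core mapping into XML. The element and attribute names (job, node, core, id, name) and the job ID 1234 are purely hypothetical placeholders; the real PTP schema is not given here and would need to be substituted.

```python
import xml.etree.ElementTree as ET


def allocation_to_xml(job_id, allocations):
    """Serialize per-node core allocations for one job as an XML string.

    allocations is a list of dicts with "nodes" (node name) and
    "cpu_ids" (list of core numbers). The XML layout is a made-up
    placeholder, not the format PTP actually expects.
    """
    job = ET.Element("job", id=str(job_id))
    for alloc in allocations:
        node = ET.SubElement(job, "node", name=alloc["nodes"])
        for cpu in alloc["cpu_ids"]:
            ET.SubElement(node, "core", id=str(cpu))
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    example = [{"nodes": "tux123", "cpu_ids": [2, 3, 4, 5]}]
    print(allocation_to_xml(1234, example))
```

The core numbers here are SLURM's own numbering, which is sufficient as long as it is consistent, as noted above.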
Many thanks!
Mark.