Thank you Jette, I had forgotten that I tried sstat some time ago, but it didn't work.

If I run sstat on a jobid that has job steps, everything works and I can see the job's statistics, but if I run sstat on a jobid without job steps I cannot get the stats. I use ProctrackType=proctrack/linuxproc, and I get the error: "sstat: error: no steps running for job 11352". I tried it as a normal user and as root.

On the other hand, with sacct I can see the stats of all my finished jobs, but not the stats of those that are running or pending: I see the job line, but the statistics fields are empty.
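In case it is useful, this is roughly what I am running (11352 is the job id from the error above; the .batch step suffix and the field names are guesses from the man pages and may depend on the SLURM version):

    # per-step usage of a running job; a step-less batch job may
    # (depending on version) be reachable through its .batch step
    sstat --format=JobID,AveCPU,AveRSS,MaxRSS -j 11352.batch

    # accounting view of the same job; for running/pending jobs the
    # usage columns come back empty here, as described above
    sacct -j 11352 --format=JobID,State,Elapsed,MaxRSS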
Thank you

2012/1/28 je...@schedmd.com <je...@schedmd.com>

> Look at the sstat and sacct commands.
> --
> Sent from my Android phone. Please excuse my brevity and typos.
>
> Felip Moll <lip...@gmail.com> wrote:
>>
>> Related to this, is it possible to know how much CPU and memory a job
>> is currently using?
>>
>> 2012/1/24 Mark Nelson <mdnels...@gmail.com>
>>
>>> Hi Moe,
>>>
>>> That was exactly what we were after. (hmmm, I really should read
>>> through the man pages more carefully...)
>>>
>>> I'll pass on the idea of adding an XML output option.
>>>
>>> Many thanks!
>>> Mark.
>>>
>>> On 25/01/12 03:52, Moe Jette wrote:
>>>
>>>> Use the command "scontrol show job --detail". The output will
>>>> contain a line like this for each node allocated to each job:
>>>>     Nodes=tux123 CPU_IDs=2-5 Mem=2048
>>>> While the data does exist, it is not going to be particularly simple
>>>> to parse and work with. There has been talk about adding an "--xml"
>>>> option for XML output from scontrol, but that has never been done.
>>>> Since SLURM is open source, you could modify scontrol to add an
>>>> "--xml" option or build a new tool for your particular application.
>>>>
>>>> Moe Jette
>>>> SchedMD
>>>>
>>>> Quoting Mark Nelson <mdnels...@gmail.com>:
>>>>
>>>>> Hi there,
>>>>>
>>>>> My colleague came up with the question below about running jobs on
>>>>> a normal x86-based cluster. Hopefully someone here can shed some
>>>>> light on this.
>>>>>
>>>>> When running SLURM on a multi-core/multi-socket cluster, is there
>>>>> any way of finding out which cores are allocated to a particular
>>>>> job? Using "scontrol show job" I can find out which nodes are
>>>>> allocated and the total number of cores, but I have no way of
>>>>> knowing how these cores are distributed across the nodes. While the
>>>>> system seems to allocate cores consecutively, across multiple jobs
>>>>> there is no way of knowing which cores are assigned to which job.
>>>>> For example, on an 8-core multi-node system, if I ask for 3 cores
>>>>> across 2 nodes (salloc -n 3 -N 2), how do I know whether 2 cores
>>>>> are allocated from the first node and 1 from the second, or vice
>>>>> versa? Also, as nodes fill up with other jobs, and jobs finish at
>>>>> different times, there is no way of mapping jobs to particular
>>>>> cores. I've seen from other postings that SLURM core numbering
>>>>> might not match the physical hardware core numbering, but for my
>>>>> purposes this is not a problem, as long as the numbering is
>>>>> consistent.
>>>>>
>>>>> The reason I'm asking is that I'm trying to integrate SLURM with
>>>>> PTP (Eclipse Parallel Tools Platform) system monitoring, which
>>>>> expects to map jobs to nodes and cores in a graphical interface.
>>>>> Therefore, for jobs on a multi-core cluster, I need to report which
>>>>> cores and nodes a particular job is running on, in a specified XML
>>>>> format.
>>>>>
>>>>> Many thanks!
>>>>> Mark.
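As a footnote to Moe's suggestion in the quoted thread: until an "--xml" option exists, the per-node allocation lines can be pulled out with a plain text filter. A minimal sketch, assuming the "Nodes=... CPU_IDs=... Mem=..." line format from his example (11352 reused as the example job id):

    # print one allocation line per node for the given job
    scontrol show job --detail 11352 | grep 'CPU_IDs='
    # example output, per Moe's message:
    #   Nodes=tux123 CPU_IDs=2-5 Mem=2048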