I'm announcing an updated version of the node status tool "pestat" for Slurm.

The job list for each node may now optionally include the (expected) job EndTime using the -E option. This information is very useful when you are waiting for a draining node to be cleared of jobs. For example, it's nice to know when the node may be shut down for repairs. The attached screenshot shows an example node status.

Download the tool (a bash script) and other files from GitHub:
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat

Usage: pestat [-p partition(s)] [-u username] [-g groupname]
        [-q qoslist] [-s statelist] [-n/-w hostlist] [-j joblist]
        [-f | -F | -m free_mem | -M free_mem ] [-1] [-E] [-C/-c] [-V] [-h]
where:
        -p partition: Select only partition <partition>
        -u username: Print only user <username>
        -g groupname: Print only users in UNIX group <groupname>
        -q qoslist: Print only QOS in the qoslist <qoslist>
        -s statelist: Print only nodes with state in <statelist>
        -n/-w hostlist: Print only nodes in hostlist
        -j joblist: Print only nodes in job <joblist>
        -f: Print only nodes that are flagged by * (unexpected load etc.)
        -F: Like -f, but only nodes flagged in RED are printed.
        -m free_mem: Print only nodes with free memory LESS than free_mem MB
        -M free_mem: Print only nodes with free memory GREATER than free_mem MB (under-utilized)
        -1: Only 1 line per node (unique nodes in multiple partitions are printed once only)
        -E: Job EndTime is printed after each jobid/user
        -C: Color output is forced ON
        -c: Color output is forced OFF
        -h: Print this help information
        -V: Version information

Global configuration file for pestat: /etc/pestat.conf
Per-user configuration file for pestat: /root/.pestat.conf
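
As a rough illustration of the draining-node use case, the options above could be combined as in the sketch below. It assumes that "drain" is accepted as a state name by -s on your Slurm installation (adjust it to whatever your site uses), and the variable names are only illustrative. If pestat is not in the PATH, the sketch just prints the command it would have run.

```shell
#!/bin/sh
# Sketch: list draining nodes with the expected EndTime of each job.
# Assumption: "drain" is a valid value for the -s statelist option.
states="drain"

# -s: select nodes by state, -E: print job EndTime, -1: one line per node
opts="-s $states -E -1"

if command -v pestat >/dev/null 2>&1; then
    # $opts is intentionally unquoted so it splits into separate options.
    pestat $opts
else
    echo "pestat not found; would run: pestat $opts"
fi
```

The latest EndTime shown for a node is then an estimate of when the node will be free of jobs, subject to jobs finishing early or being extended.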

/Ole
