I'm announcing an updated version 0.51 of the node status tool "pestat" for Slurm.

New features:

1. Colors can now be forced on with the -C flag even when the output does not go to a terminal (and forced off with -c). Thanks to Fermin Molina <[email protected]> for requesting this! Fermin suggests using it for convenient continuous monitoring of "flagged" nodes:

# watch -n 60 --color 'pestat -f -C'

2. Added -n/-w hostlist to select a subset of nodes. The -n form is for compatibility with sinfo, whereas -w is compatible with pdsh and clush (ClusterShell).
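
As a rough illustration of combining the new hostlist selection with the existing -f flag, here is a minimal sketch. The wrapper name pestat_flagged is hypothetical, the node names are borrowed from the sample output in this message, and a plain comma-separated hostlist is assumed:

```shell
# Hypothetical helper: show only the flagged nodes within a given hostlist.
# Assumes pestat is in PATH; -w could equally be written as -n.
pestat_flagged() {
    local hostlist="$1"    # e.g. a066,a067,a083 (example node names)
    pestat -f -w "$hostlist"
}

# Example call (needs a working Slurm installation):
#   pestat_flagged a066,a067,a083
```

Wrapping the call in a function keeps the hostlist in one place if the same subset of nodes is checked repeatedly.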

Download the tool (a short bash script) from https://ftp.fysik.dtu.dk/Slurm/pestat. If your commands do not live in /usr/bin, please make appropriate changes in the CONFIGURE section at the top of the script.

Usage: pestat [-p partition(s)] [-u username] [-q qoslist] [-s statelist] [-n/-w hostlist]
        [-f | -m free_mem | -M free_mem ] [-C/-c] [-V] [-h]
where:
        -p partition: Select only partition <partition>
        -u username: Print only user <username>
        -q qoslist: Print only QOS in the qoslist <qoslist>
        -s statelist: Print only nodes with state in <statelist>
        -n/-w hostlist: Print only nodes in hostlist
        -f: Print only nodes that are flagged by * (unexpected load etc.)
        -m free_mem: Print only nodes with free memory LESS than free_mem MB
        -M free_mem: Print only nodes with free memory GREATER than free_mem MB (under-utilized)
        -C: Color output is forced ON
        -c: Color output is forced OFF
        -h: Print this help information
        -V: Version information

I use "pestat -f" all the time because it prints and flags (in color) only the nodes which have an unexpected CPU load or node status, for example:

# pestat -f
Print only nodes that are flagged by *
Hostname       Partition     Node Num_CPU  CPUload  Memsize  Freemem  Joblist
                            State Use/Tot              (MB)     (MB)  JobId User ...
    a066          xeon8*    alloc   8   8     8.04    23900      173* 91683 user01
    a067          xeon8*    alloc   8   8     8.07    23900      181* 91683 user01
    a083          xeon8*    alloc   8   8     8.06    23900      172* 91683 user01
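
Since the output is plain whitespace-separated columns, the flagged hostnames can be collected for use with other tools such as pdsh. The following is only a sketch: the function name flagged_hosts is hypothetical, and it assumes the column layout of the sample above (hostname in column 1, the Use/Tot CPU counts in columns 4 and 5), which may differ in other pestat versions:

```shell
# Hypothetical helper: read pestat output on stdin and print the
# hostnames of the node lines as one comma-separated list.
# Node lines are recognized by numeric CPU counts in columns 4 and 5,
# which skips the header lines and the "Print only nodes..." notice.
flagged_hosts() {
    awk '$4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ {
             hosts = hosts (hosts ? "," : "") $1
         }
         END { if (hosts) print hosts }'
}

# Example call (needs a working Slurm installation):
#   pestat -f | flagged_hosts
```

With the three nodes shown in the sample output, this would print a066,a067,a083.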


The -s option is useful for checking on possibly unusual node states, for example:

# pestat -s mixed

--
Ole Holm Nielsen
Department of Physics, Technical University of Denmark
