Thank you. Your patch will be in version 2.3.1. We will probably not release any more new versions of 2.2.

Quoting Ramiro Alba <[email protected]>:

Hi all,

I've found that slurmctld does not take nodes in power save mode into
account when running the health check, so it tries to contact nodes that
cannot respond, which results in them being set to:

node_ptr->not_responding = true

and then they are no longer available for resource allocation.

The problem is solved by replacing, in the 'void run_health_check(void)'
function in file 'ping_nodes.c', the conditional

if (IS_NODE_NO_RESPOND(node_ptr) || IS_NODE_FUTURE(node_ptr))
    continue;

with the conditional

if (IS_NODE_NO_RESPOND(node_ptr) || IS_NODE_FUTURE(node_ptr) ||
    IS_NODE_POWER_SAVE(node_ptr))
    continue;
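
To illustrate the effect of the extra test, here is a minimal standalone
sketch. It does not use the Slurm sources: the demo_* names, the flag
values and the node structure are hypothetical stand-ins for the real
macros and node records, chosen only to show how power-saved nodes drop
out of the set of health check targets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the node state flags; the real definitions
 * live in the Slurm sources and are not reproduced here. */
enum {
    DEMO_NO_RESPOND = 1 << 0,
    DEMO_FUTURE     = 1 << 1,
    DEMO_POWER_SAVE = 1 << 2
};

/* Hypothetical, simplified node record. */
struct demo_node {
    const char *name;
    unsigned flags;
};

/* Same shape as the patched conditional: skip a node that is not
 * responding, is a FUTURE node, or is in power save mode. */
static bool demo_skip_health_check(const struct demo_node *node)
{
    return (node->flags &
            (DEMO_NO_RESPOND | DEMO_FUTURE | DEMO_POWER_SAVE)) != 0;
}

/* Count the nodes a health check pass would actually contact. */
static size_t demo_count_health_check_targets(const struct demo_node *nodes,
                                              size_t count)
{
    size_t targets = 0;

    for (size_t i = 0; i < count; i++) {
        if (demo_skip_health_check(&nodes[i]))
            continue;
        targets++;
    }
    return targets;
}
```

With this sketch, a node whose only flag is DEMO_POWER_SAVE is skipped,
so it is never contacted and never marked as not responding.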

This bug is present in both versions 2.2.7 and 2.3.0, as far as I could
test.

Cheers

--
Ramiro Alba

Centre Tecnològic de Transferència de Calor
http://www.cttc.upc.edu


Escola Tècnica Superior d'Enginyeries
Industrial i Aeronàutica de Terrassa
Colom 11, E-08222, Terrassa, Barcelona, Spain
Tel: (+34) 93 739 86 46


