Dear Ana:

> One thing I still do not have clear is how adding/removing nodes from
> the cluster affects SLURM (the queue, the state information, etc.) and
> jobs already running. Is it equivalent to "scontrol reconfigure" where
> you have to restart slurmctld every time you add or remove a node?

Yes, ElastiCluster writes the new config file and then restarts the SLURM
daemons.

> What if we want to do these changes of nodes frequently, how does this
> affect the users?

They should not notice, except for the occasional glitch while `slurmctld`
is restarted. I'll admit that this has not received much testing, though
(OTOH, I have not received any bug reports on this either).

Note that, when scaling down a cluster, you should (1) set the nodes you
want to remove to the "DRAIN" state, then (2) use `elasticluster remove-node`
to remove them. (`elasticluster resize -r` removes nodes immediately,
starting with the highest-numbered ones, regardless of whether they are
running any jobs.)

Ciao,
R

--
Riccardo Murri, Schwerzenbacherstrasse 2, CH-8606 Nänikon, Switzerland
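
For reference, the two-step scale-down described above (drain first, then remove) could look roughly like the sketch below. It only prints the commands, since the SLURM commands are meant to be run on the cluster frontend and the `elasticluster` one on the machine where ElastiCluster is installed. The function name `print_scale_down_plan`, the cluster name `mycluster`, the node name `compute005`, and the `Reason` text are all made up for illustration; the intermediate `squeue` check is an addition of this sketch, not something ElastiCluster requires.

```shell
#!/bin/sh
# Sketch only: print the commands for a "drain, then remove" scale-down.
# Nothing is executed here; copy each command to the machine named in
# the comment line printed above it.

print_scale_down_plan() {
    cluster="$1"   # ElastiCluster cluster name (hypothetical in the example)
    shift
    for node in "$@"; do
        # (1) Put the node in DRAIN state so SLURM schedules no new jobs on it.
        echo "# on the cluster frontend: stop scheduling new jobs on $node"
        echo "scontrol update NodeName=$node State=DRAIN Reason=scale-down"
        # Optional check (an assumption of this sketch): wait for running jobs.
        echo "# on the cluster frontend: repeat until this prints nothing"
        echo "squeue -h -w $node"
        # (2) Remove the drained node from the cluster.
        echo "# on the machine where ElastiCluster runs"
        echo "elasticluster remove-node $cluster $node"
    done
}
```

Calling `print_scale_down_plan mycluster compute005` prints the plan for a single node; pass several node names to get one group of commands per node.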
