sorry for the double post. i'm not sure why gmail sent the message again; i didn't send it twice.
On Wed, Nov 25, 2015 at 9:33 AM, Michael Di Domenico <mdidomeni...@gmail.com> wrote:
>
> is it possible to add or remove just a single node from a partition
> without having to re-establish the whole list of nodes?
>
> for example
>
> if i have nodes[001-100] and i want to remove only node 049. is there
> some incantation that will allow me to do that without having to say
> nodes[001-048,050-100]
>
> the motivation is that we have a mixed pool of nodes some with gpu's
> and some without. as our cluster ages, the gpus are getting flaky.
> often the gpu flakes out or dies, but the rest of the node is
> perfectly fine.
>
> i'd like to dynamically move a node out of the gpu partition and into
> a non-gpu partition using a node-health script
>
> yes, gres would probably handle this better than split partitions, but
> we haven't rolled to gres allocations on the gpu's yet
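
for what it's worth, here is a rough sketch of the workaround a node-health script could use while the full list still has to be restated: rebuild the node list minus one host and hand the result to scontrol. the helper names (parse, fold, remove_node) and the partition name "gpu" are made up for illustration, and the parser only understands the simple prefix[ranges] form from the example above, not everything a real hostlist can express.

```python
import re


def parse(expr):
    """Split 'nodes[001-100]' into (prefix, zero-pad width, set of ints)."""
    m = re.fullmatch(r"(\D+)\[([\d,\-]+)\]", expr)
    if m is None:
        raise ValueError(f"unsupported hostlist form: {expr!r}")
    prefix, body = m.groups()
    numbers, width = set(), 0
    for part in body.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo  # a bare '007' is a one-element range
        width = max(width, len(lo))
        numbers.update(range(int(lo), int(hi) + 1))
    return prefix, width, numbers


def fold(prefix, width, numbers):
    """Fold a set of ints back into 'prefix[a-b,c]' form."""
    ordered = sorted(numbers)
    if not ordered:
        raise ValueError("no nodes left in the list")
    ranges = []
    start = prev = ordered[0]
    for n in ordered[1:]:
        if n != prev + 1:  # gap found, close the current run
            ranges.append((start, prev))
            start = n
        prev = n
    ranges.append((start, prev))
    parts = [
        f"{a:0{width}d}" if a == b else f"{a:0{width}d}-{b:0{width}d}"
        for a, b in ranges
    ]
    return f"{prefix}[{','.join(parts)}]"


def remove_node(expr, node):
    """Return the hostlist expression with a single node dropped."""
    prefix, width, numbers = parse(expr)
    if not node.startswith(prefix):
        raise ValueError(f"{node!r} does not match prefix {prefix!r}")
    numbers.discard(int(node[len(prefix):]))
    return fold(prefix, width, numbers)


if __name__ == "__main__":
    new_list = remove_node("nodes[001-100]", "nodes049")
    # "gpu" is a hypothetical partition name; the command is only printed here
    print(f"scontrol update PartitionName=gpu Nodes={new_list}")
```

on a real cluster, `scontrol show hostnames` and `scontrol show hostlist` can do the expand and fold steps, which would cover hostlist forms this sketch does not.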