Hi

Does ResumeTimeout apply to nodes that are part of an allocation
where the majority of the nodes are already awake? We quite often see
jobs starting before the few sleeping nodes have woken up. These jobs
stay in the 'configuring' state until they complete, provided the
program being run can cope with some missing nodes. If every node in
the allocation is asleep, it works as expected.

I have ResumeTimeout set to 3 minutes, which is plenty of time for a node
to boot, provided it's not fsck'ing its disks.
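
As a sketch of that setting, assuming it lives in slurm.conf, where
ResumeTimeout is given in seconds (other power-saving parameters are
omitted here):

```
# slurm.conf fragment: 3 minutes expressed in seconds
ResumeTimeout=180
```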

Cheers,
