Hi all, *bump*
I can't believe no one has an explanation for this parameter...

Regards,

Uwe

On 02.09.2014 at 16:30, Uwe Sauter wrote:
> Hi all,
>
> I'm a bit confused by the explanation of the "BatchStartTimeout" option.
> It states:
>
> "Specifies how long to wait after a batch job start request is issued
> before we expect the batch job to be running on the compute node.
> Depending upon how nodes are returned to service, this value may need to
> be increased above its default value of 10 seconds."
>
> It is unclear from which point in time this timeout gets counted. Some
> possibilities:
>
> - when a batch job was submitted
> - when SLURM executes the ResumeProgram command
> - when the node's slurm daemon contacts the controller daemon
>
> Can someone reword the explanation or give details about this option?
>
> Are there recommendations, e.g. linked to ResumeTimeout?
>
> Thanks,
>
> Uwe