[slurm-dev] sacct -a --nnodes=2-5

2015-07-08 Thread Danny Rotscher
Hello, I have a question about the --nnodes parameter of sacct. sacct -a --nnodes=Min-Max I understand the line above to mean that sacct shows all jobs whose node count lies between Min and Max, including the Min and Max values themselves. But as you can see in the following section, it doesn't work as I expect.
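For reference, the kind of invocation under discussion would look roughly like the following (the extra format fields are my own illustration, not taken from Danny's original report):

    sacct -a --nnodes=2-5 --format=JobID,JobName,NNodes,State

If the range filter works as documented, this should list only records whose node count falls between 2 and 5, inclusive.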

[slurm-dev] Re: Slurm versions 14.11.8 and 15.08.0-pre6 are now available

2015-07-08 Thread Martins Innus
Moe, On 7/7/15 7:04 PM, Moe Jette wrote: -- Backfill scheduler: The configured backfill_interval value (default 30 seconds) is now interpreted as a maximum run time for the backfill scheduler. Once reached, the scheduler will build a new job queue and start over, even if not
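For context, that interval is configured through SchedulerParameters in slurm.conf; a minimal sketch, with illustrative values only, might be:

    SchedulerType=sched/backfill
    SchedulerParameters=bf_interval=30,bf_max_job_test=500

Here bf_interval corresponds to the backfill_interval discussed in the release notes; 30 seconds is the default cited above.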

[slurm-dev] Re: More reservation woes

2015-07-08 Thread John Desantis
Bill and Bruce, We are just in order to get fairshare. I'm not gonna do that in production though, that sounds dangerous. You mean you don't want to be the guinea pig?! The reservations in the database are only for historical purposes; they don't get read back in by the slurmctld. The DBD

[slurm-dev] Re: Slurm versions 14.11.8 and 15.08.0-pre6 are now available

2015-07-08 Thread Moe Jette
The backfill scheduler will get to the end of the queue if it can do so in 30 seconds (or whatever you have backfill_interval configured to be). The sdiag command will report actual scheduler run times. The cycle times are in units of microseconds. Quoting Martins Innus
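For anyone following along, one rough way to pull out those timings (the exact section wording in sdiag output may differ by version) is:

    sdiag | grep -i -A 10 backfill

The backfill statistics reported there, including the cycle times Moe mentions, are in microseconds.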

[slurm-dev] Re: sacct -a --nnodes=2-5

2015-07-08 Thread Michael Kit Gilbert
Danny, I think if you add more fields to your output you will likely see that the entries showing 1 node are actually just the batch steps of another job, which appear on their own line. I am not a Slurm expert, so there could be other reasons for the 1s showing up in your output, but this
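One quick way to check that explanation is to restrict sacct to job allocations only with -X (--allocations), which hides the individual steps; for example, reusing Danny's filter:

    sacct -a -X --nnodes=2-5 --format=JobID,JobName,NNodes,NodeList

If the stray 1-node lines disappear with -X, they were job steps (such as the batch step) rather than separate allocations.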

[slurm-dev] Impact on cluster when slurmdbd's database is offline

2015-07-08 Thread Trey Dockendorf
Last night our MySQL server was offline for 8 hours due to a storage failure. As far as I can tell from the slurmctld logs, jobs continued to be started and completed successfully. The only errors I saw were in the slurmdbd logs, as expected. What, if any, impact on a SLURM cluster will there be