Hi Nicolas,

There are a few possible ways to do this. I would suggest that
you submit the jobs specifying different bank accounts and
configure those bank accounts so that each has the same
share of the machine. It takes some effort to configure,
but it is the most flexible solution:
http://www.schedmd.com/slurmdocs/priority_multifactor.html#fairshare
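
A minimal sketch of that setup (the account names, share value, and
user name below are placeholders, and this assumes the accounting
database and the multifactor priority plugin are already enabled per
the link above):

    # One bank account per simulation group, each with the same
    # fairshare value ("groupA"/"groupB" are hypothetical names):
    sacctmgr add account groupA Fairshare=100
    sacctmgr add account groupB Fairshare=100

    # Allow the submitting user to charge jobs to either account:
    sacctmgr add user nicolas Account=groupA
    sacctmgr add user nicolas Account=groupB

    # Submit each group's jobs against its own account:
    sbatch --account=groupA sim_groupA.sh
    sbatch --account=groupB sim_groupB.sh

With equal shares, a group that has consumed more than its share sees
its job priority drop, so pending jobs from the other group move ahead.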

Other options would be multiple Slurm partitions/queues, job
dependencies, advance reservations, and multiple qualities of
service.
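As one sketch of the job-dependency option (script names here are
hypothetical), each group's runs can be split into batches chained
with sbatch's --dependency flag, so batches from different groups
interleave rather than one group queuing everything at once:

    # --parsable makes sbatch print just the job ID, so it can be
    # captured and used as a dependency for the next batch:
    jobid=$(sbatch --parsable sim_groupA_batch1.sh)
    sbatch --dependency=afterany:$jobid sim_groupA_batch2.sh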

Moe Jette
SchedMD



Quoting Nicolas Bigaouette <[email protected]>:

Hi all,

Over the weekend I submitted many thousands of jobs through Slurm. These
jobs are submitted using a script that edits a file with the simulation
parameters and calls sbatch on it. All of these simulations are part of the
same "group": due to their Monte-Carlo nature, I need to run many of
them to gather good statistics for the problem.

My issue is that I need to submit many of these groups of simulations, each
of them requiring thousands of runs. But I don't want a single group to
monopolize the whole cluster until its thousands of runs are done. What I
want is for simulations from different "groups" to run alternately, so that
even before any group has finished running, I can slowly start building
statistics for every group and see whether something is emerging or whether I
need to cancel, change parameters, etc.

Also, sharing the cluster would be easier: I could submit my thousands of
jobs, but another user could still run something before all my jobs are done.

Would something like that be possible? I hope I was not too confusing. I
don't use any external scheduler for now (SchedulerType=sched/backfill); is
Slurm able to achieve something like that? If not, what would you suggest?

Thanks a lot for your answers!

Regards,

Nicolas
