Hi

I've been looking into this a bit more, and it seems that part of the
problem is in sbatch, where it modifies the ntasks value.

src/sbatch/opt.c, line 2406:


        /* massage the numbers */
        if ((opt.nodes_set || opt.extra_set)                            &&
            ((opt.min_nodes == opt.max_nodes) || (opt.max_nodes == 0))  &&
            !opt.ntasks_set) {
                /* 1 proc / node default */
                opt.ntasks = MAX(opt.min_nodes, 1);

If I remove the check for opt.min_nodes == opt.max_nodes, my job works.
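
To make the effect concrete, here is a minimal standalone sketch of that
condition. It is not the Slurm source: struct node_opts and
massage_ntasks() are hypothetical names, the starting ntasks value of 1
is an assumption, and "removing the check" is modelled as dropping the
whole min/max clause, which is my reading of the change rather than a
confirmed diff.

```c
#include <stdbool.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for the few sbatch option fields quoted above. */
struct node_opts {
    bool nodes_set;   /* -N was given */
    bool extra_set;   /* extra node info was given */
    bool ntasks_set;  /* -n was given explicitly */
    int  min_nodes;
    int  max_nodes;
    int  ntasks;      /* value before the massage step (assumed default: 1) */
};

/*
 * Return ntasks after the "massage the numbers" step.
 * require_fixed_count == true  : condition as quoted from opt.c
 * require_fixed_count == false : min/max clause dropped entirely
 *                                (assumption about the local change)
 */
static int massage_ntasks(struct node_opts o, bool require_fixed_count)
{
    bool fixed_count = (o.min_nodes == o.max_nodes) || (o.max_nodes == 0);

    if ((o.nodes_set || o.extra_set) &&
        (!require_fixed_count || fixed_count) &&
        !o.ntasks_set) {
        /* 1 proc / node default */
        o.ntasks = MAX(o.min_nodes, 1);
    }
    return o.ntasks;
}
```

Under those assumptions, -N15-16 leaves ntasks untouched with the quoted
condition (15 != 16 and max_nodes != 0), but sets ntasks to 15 once the
clause is dropped, while a plain -N16 gets ntasks = 16 either way.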

I also made a change in src/slurmctld/node_scheduler.c at line 846 to
set req_nodes to max_nodes instead of min_nodes, but I'm not sure that
does anything; it just looked wrong. I'll change it back tomorrow and
see if my job still works.

This is the command that would normally fail but now works. d1 has 16
nodes, each with 16 cores, and I'm using cons_res with CR_CPU.

sbatch  -p d1 -N15-16 -c 4  

However, any value of min_cpu <= num_cpus only allocates 4 nodes, while
-N5-16 gives me 16 nodes. Weird!

Cheers,


On Mon, 2014-06-16 at 17:48 -0700, Franco Broi wrote: 
> 
> You can't currently submit a job with -Nmin<max:max and -c < all cpus;
> you get a bad constraints error.
> 
> A few people have reported this bug over the past several months but I
> haven't seen any mention of a fix.
> 
> Cheers,
