Moe,

There are a couple of other problems with the --switch option that have 
surfaced. I can make the patches but I'd like your advice first.


1) When the max-time is not specified and the number of switches selected
is greater than the number requested, the job will be requeued once. I
could document this as the desired behavior. Other options would be to set
the wait time to the site max, or to not wait at all, but the latter
basically means that --switch without a time is a no-op. I'm leaning
towards setting it to the site max.
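
To make the alternatives in (1) concrete, here is a rough Python sketch. It
is not Slurm code: the function name effective_switch_wait, the policy
names, and the use of None for "no max-time given" are made up for
illustration, and capping an explicit request at the site max is an
assumption about how the site max would be applied.

```python
from typing import Optional


def effective_switch_wait(requested: Optional[int],
                          site_max: int,
                          policy: str = "site_max") -> int:
    """Return how many seconds a job may wait for the requested switch count.

    requested -- user-supplied max-time in seconds, or None if not given
    site_max  -- site-wide upper bound in seconds (hypothetical parameter)
    policy    -- what to do when no max-time was given (hypothetical names)
    """
    if requested is not None:
        # Assumption: an explicit request is capped at the site max.
        return min(requested, site_max)
    if policy == "site_max":
        # The option favored above: fall back to the site max.
        return site_max
    if policy == "no_wait":
        # The alternative: do not wait, which makes --switch a no-op.
        return 0
    raise ValueError(f"unknown policy: {policy}")
```

With a site max of 60, a missing max-time gives 60 under "site_max" and 0
under "no_wait".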

2) The max-time is a time string, which means its lowest resolution is
minutes. Since the default value for the site max (max_switch_wait) is 60
seconds, the max-time is basically a no-op. I'm not in favor of
raising the max-time default. I'd rather make the value the number of
seconds to wait.
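
The resolution problem in (2) can be illustrated with a small Python sketch.
It assumes the user value is converted to whole minutes and then capped at
the site max. The helper names are hypothetical and this is not the Slurm
time-string parser; only the 60-second max_switch_wait default comes from
the text above.

```python
SITE_MAX_SWITCH_WAIT = 60  # seconds; the max_switch_wait default cited above


def wait_from_minutes(minutes: int, site_max: int = SITE_MAX_SWITCH_WAIT) -> int:
    """Wait in seconds when the user value has minute resolution."""
    return min(minutes * 60, site_max)


def wait_from_seconds(seconds: int, site_max: int = SITE_MAX_SWITCH_WAIT) -> int:
    """Wait in seconds when the user value is given directly in seconds."""
    return min(seconds, site_max)


# Collect the distinct waits a user can actually obtain in each scheme.
minute_choices = sorted({wait_from_minutes(m) for m in range(0, 10)})
second_choices = sorted({wait_from_seconds(s) for s in range(0, 600, 15)})
```

Under these assumptions, minute resolution leaves only two reachable
values (no wait, or the full site max), while a value in seconds can land
anywhere below the cap.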




From:   Moe Jette <je...@schedmd.com>
To:     slurm-dev@lists.llnl.gov, 
Date:   01/18/2012 12:48 PM
Subject:        Re: [slurm-dev] bug when --switch option used
Sent by:        owner-slurm-...@lists.llnl.gov



Alex,

Thanks for the patch. This will be in version 2.3.3 plus a fix to 
similar logic in the select/linear plugin used to count the leaf 
switches used.

Moe

Quoting Alejandro Lucero Palau <alejandro.luc...@bsc.es>:

> Hi,
>
> I've found a bug related to the use of the --switch option for
> requesting a number of switches when the topology plugin is used.
>
> There are two problems:
>
> 1) leaf_switch_counter is incremented in the wrong place, so it counts
> the switches tested instead of just the selected switch.
> 2) When _select_nodes is called, the check on the best_switch value is
> not always done. This could lead to a job being spread across more
> switches than requested.
>
> Patch attached.




