As mentioned in the man page: Computers will only be reused if the number of retries > number of computers (or more correctly: sshlogins).
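To make that rule concrete, here is a rough sketch in Python. It is only an illustration of the behaviour described above, not GNU parallel's actual Perl code: the names `pick_host`, `run_with_retries`, `attempts` and `succeeds_on` are made up for this example, and the list order stands in for whatever order the real hash happens to yield.

```python
def pick_host(hosts, failed_on):
    """Pick a host for the next attempt of a single job.

    hosts     -- hosts in iteration order (arbitrary in the real tool)
    failed_on -- set of hosts on which this job has already failed
    """
    for host in hosts:
        if host not in failed_on:
            return host
    # The job has failed on every host once: only now are hosts reused.
    failed_on.clear()
    return hosts[0]


def run_with_retries(hosts, attempts, succeeds_on):
    """Simulate one job and return the hosts tried, in order.

    attempts    -- hypothetical cap on the total number of tries
    succeeds_on -- set of hosts on which the simulated job succeeds
    """
    failed_on = set()
    tried = []
    for _ in range(attempts):
        host = pick_host(hosts, failed_on)
        tried.append(host)
        if host in succeeds_on:
            break  # job succeeded, no further tries needed
        failed_on.add(host)
    return tried
```

With three hosts and five attempts that all fail, `run_with_retries(["a", "b", "c"], 5, set())` tries `a`, `b`, `c` and only then goes back to `a` and `b`. With two attempts no host is ever repeated.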
The order in which the computers are tested is based on the order in which values are extracted from a Perl hash using 'values'. I am still puzzled why you believe this order will be important. I would believe it is much more important to know that a computer on which the job has failed will not be chosen unless number of retries > number of sshlogins.

/Ole

On Thu, May 29, 2014 at 9:27 PM, Mitchell Wyle <[email protected]> wrote:
> Hi Ole,
>
> Thanks for the quick reply. I meant, if I have 10 SSHLOGIN computers how
> does parallel choose on which one it will dispatch the next job and to which
> one it will dispatch a failed job that it is retrying. The selection method
> it uses for selecting which computer when it does what the man page says:
> "retry it on another computer." round-robin is better than random
> (zookeeper) and better than "least loaded."
>
> Thanks again.
>
> On Thu, May 29, 2014 at 12:20 PM, Ole Tange <[email protected]> wrote:
>>
>> On Thu, May 29, 2014 at 8:54 PM, Mitchell Wyle <[email protected]> wrote:
>> > Cool! I shall try simple --retries and verify it works. Does it
>> > "round robin" the tries? Thanks!
>>
>> No. It does what it says in the man page:
>>
>> --retries n
>>     If a job fails, retry it on another computer. Do
>>     this n times. If there are fewer than n computers
>>     in --sshlogin GNU parallel will re-use the
>>     computers. This is useful if some jobs fail for no
>>     apparent reason (such as network failure).
>>
>> Why do you think it would do something else than what it says in the man
>> page?
>>
>> /Ole
