On 16/08/2014 at 11:51,
Achim Gratz <strom...@nexgo.de> wrote:

> Just make it the responsibility of the user that each server in the list
> given to parallel is actually reachable, don't second-guess the user.
> That list may actually be something that the user just gets from
> somewhere else, so you should perhaps be flexible with the expected
> format.

If the ability to dynamically include/exclude servers is implemented (for
instance by re-reading a file containing the list of servers), then the user
could maintain the list of active servers with something like the following
(just to give the idea):

   while true; do
      parallel -k 'if ssh {} /bin/true; then echo "{}"; fi' ::: host1 host2 ... hostN > active_hosts.slf
      sleep 10
   done

And then starting GNU Parallel as:

   parallel --slf active_hosts.slf ...

Of course, jobs that were sent to a server after it went down but before it
was detected as down will still fail. In that case I think it is okay to re-run
GNU Parallel with --resume-failed (which relies on a --joblog file written by
the first run).
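
One caveat with the polling loop: redirecting straight into active_hosts.slf
truncates the file at the start of every pass, so anything re-reading it at
that moment could see an empty or partial list. A rough sketch of one way
around that is to write to a temporary file and rename it into place. The
function names here (check_host, refresh_active_hosts) are made up for
illustration and are not part of GNU Parallel, and for simplicity the hosts
are checked one at a time rather than through parallel:

```shell
#!/bin/sh
# Hypothetical helpers, not part of GNU Parallel.

# check_host HOST: succeed if HOST accepts an ssh login within 5 seconds.
# BatchMode=yes stops ssh from waiting on a password prompt, and -n keeps
# it from reading stdin.
check_host() {
    ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$1" /bin/true 2>/dev/null
}

# refresh_active_hosts OUTFILE CHECK HOST...
# Write every HOST for which "CHECK HOST" succeeds to OUTFILE, one per line.
# The list is built in a temporary file next to OUTFILE and then renamed,
# so OUTFILE is never seen half-written.
refresh_active_hosts() {
    out=$1
    check=$2
    shift 2
    tmp="$out.tmp.$$"
    for h in "$@"; do
        if "$check" "$h"; then
            echo "$h"
        fi
    done > "$tmp" && mv "$tmp" "$out"
}
```

The loop would then become something like:

   while true; do refresh_active_hosts active_hosts.slf check_host host1 host2 ... hostN; sleep 10; done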


Best,

-- 
Douglas A. Augusto
