On 16/08/2014 at 11:51, Achim Gratz <strom...@nexgo.de> wrote:
> Just make it the responsibility of the user that each server in the list
> given to parallel is actually reachable, don't second-guess the user.
> That list may actually be something that the user just gets from
> somewhere else, so you should perhaps be flexible with the expected
> format.

If the ability to dynamically include/exclude servers is implemented
(for instance by re-reading a file containing the list of servers), then
the user could take care of maintaining a list of active servers by
doing something like this (just to get the idea):

    while true; do
        parallel -k 'if ssh {} /bin/true; then echo "{}"; fi' \
            ::: host1 host2 ... hostN > active_hosts.slf
        sleep 10
    done

And then starting GNU Parallel as:

    parallel --slf active_hosts.slf ...

Of course, the jobs that were sent to the unavailable servers before
they were detected as down will still fail. But in this case I think it
is okay to re-run GNU Parallel with --resume-failed.

Best,

-- 
Douglas A. Augusto