Hi,
I use GNU parallel to distribute jobs over the nodes of our 39-node cluster
(the problems are embarrassingly parallel, so I can split them up arbitrarily).
However, some of the nodes are sometimes down (as they are now), so what I
would like to do is the following:
 1) Submit jobs that will take several hours to run, during which time I
won't have anything else in particular to do.
 2) Go work on bringing cluster nodes back up.
 3) Change ~/.parallel/sshloginfile to list the recovered nodes.
 4) GNU parallel notices that the file has changed, just as it would if I
were using -j procfile, and immediately starts jobs on those additional nodes.
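
A minimal sketch of the workflow above, for concreteness. The host names
(node01..node03), run_job.sh, inputs.txt, and the scratch directory are all
made up for illustration and are not the real setup. The sketch only prepares
the files and defines the submission command. It does not show parallel
re-reading the sshloginfile, since that is the behavior being asked about.

```shell
# Sketch only: host names, run_job.sh and inputs.txt are hypothetical.
workdir=$(mktemp -d)

# One sshlogin per line; list only the nodes that are currently up.
printf '%s\n' node01 node02 > "$workdir/sshloginfile"

# Job limit kept in a file: with -j <file>, GNU parallel re-reads the
# file when a job completes, so the limit can be changed during a run.
echo 4 > "$workdir/procfile"

# Step 1: the long-running submission (defined here, not executed).
submit_jobs() {
    parallel -j "$workdir/procfile" \
             --sshloginfile "$workdir/sshloginfile" \
             -a inputs.txt ./run_job.sh
}

# Step 3: once a repaired node is back up, append it to the file.
# The request is for a running parallel to pick this change up.
echo node03 >> "$workdir/sshloginfile"
```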

I am using parallel 20110522.  Is this behavior already implemented?  If
not, I would like to request this feature.
Regards,
Jon Wilson
