Hi all,

Reposting here from Stack Overflow upon request. Since the functionality doesn't currently exist, I guess this could be described as an enhancement suggestion. Let me describe the use case and then the functionality I am proposing.
My GNU parallel use case is mostly managing batch processing within SLURM on an HPC cluster. I know a few others in my community who do the same, mostly at NERSC, because its documentation suggests it (https://docs.nersc.gov/jobs/workflow/gnuparallel/). However, many of the larger academic computing groups have group-owned machines on the cluster (which are outside of SLURM control) or have access to several different queues.

I think it would be nice to be able to create the parent GNU parallel process on a machine that you own (so it is always running) and, when a SLURM allocation is granted on one queue or another, have those machines simply add their addresses to the nodelist of the GNU parallel job. This would let the job keep running and make maximal use of fluctuating resources.

I think the only "feature" really needed to make this possible is a flag that changes how frequently the nodelist is checked. Personally, my tasks are often 8h+, and I wouldn't want to waste 8h of an allocation waiting for the parent process to have a task return before it checks the nodelist again.

I would be interested to hear whether other people have similar use cases or would benefit, and how hard it would be to add that functionality.

Thanks,
Andrew Saydjari
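
P.S. To make the proposal a bit more concrete, here is a minimal Python sketch of the worker side. It assumes the nodelist is the file handed to `parallel --sshloginfile` and that it lives on a filesystem visible to both the owned machine and the SLURM nodes. The helper names `add_host` and `remove_host` are hypothetical (they are not part of GNU parallel), and the sketch does no file locking, so it is only an illustration of the idea.

```python
import os
import socket


def _entry(host=None, slots=None):
    """Build one nodelist line: 'host' or 'slots/host' (sshloginfile style)."""
    host = host or socket.gethostname()
    return f"{slots}/{host}" if slots else host


def add_host(nodelist_path, host=None, slots=None):
    """Append this host to the nodelist unless it is already listed."""
    entry = _entry(host, slots)
    # "a+" creates the file if needed; writes always go to the end.
    with open(nodelist_path, "a+") as f:
        f.seek(0)
        listed = [ln.strip() for ln in f.read().splitlines() if ln.strip()]
        if entry not in listed:
            f.write(entry + "\n")
    return entry


def remove_host(nodelist_path, host=None, slots=None):
    """Drop this host from the nodelist, e.g. just before the allocation ends."""
    entry = _entry(host, slots)
    if not os.path.exists(nodelist_path):
        return entry
    with open(nodelist_path) as f:
        kept = [ln.strip() for ln in f if ln.strip() and ln.strip() != entry]
    # Write a temporary file and swap it in, so a reader never sees a half-written list.
    tmp = nodelist_path + ".tmp"
    with open(tmp, "w") as f:
        f.write("".join(ln + "\n" for ln in kept))
    os.replace(tmp, nodelist_path)
    return entry
```

Each SLURM batch script would call `add_host` when its allocation starts and `remove_host` shortly before it ends, while the parent runs once on the owned machine with something like `parallel --sshloginfile nodelist ...`. As far as I understand, the file is currently re-read only when a task finishes, and that interval is what the proposed flag would change.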