On Thu, Jul 28, 2022 at 5:45 PM Rob Sargent <robjsarg...@gmail.com> wrote:
>
> On 7/28/22 09:28, Christian Meesters wrote:
>
>
> On 7/28/22 14:56, Rob Sargent wrote:
>
> On Jul 28, 2022, at 1:10 AM, Christian Meesters <meest...@uni-mainz.de> wrote:
> Hi,
>
> not quite. Under SLURM the jobstep starter (SLURM lingo) is "srun". You do
> not do ssh from job host to job host, but rather use "parallel" as a
> semaphore avoiding over subscription of job steps with "srun". I summarized
> this approach here:
>
> https://mogonwiki.zdv.uni-mainz.de/dokuwiki/start:working_on_mogon:workflow_organization:node_local_scheduling#running_on_several_hosts
> (uh-oh - I need to clean up that site, many outdated sections there, but
> this one should still be ok)
>
> One advantage: you can safely utilize the resources of both (or more) hosts -
> the master hosts and all secondaries. How much resources you require depends
> on your application and the work it does. Be sure to consider I/O (e.g.
> stage-in file to avoid random I/O with too many concurrent applications,
> etc.), if this is an issue for your application.
>
> Cheers
>
> Christian
>
> Christian,
> My use of GNU parallel does not include ssh. Rather I simply fill the slurm
> node with --jobs=ncores
>
> That would require to have an interactive job and having
> ncores_per_node/threads_per_application ssh-connections, and you have to
> manually trigger the script. My solution is to use parallel in a SLURM-job
> context and avoid the synchronization step by a human, whilst offering a
> potential multi-node job with smp applications. It's your choice, of course.
>
>
> if I follow correctly that is what I am doing. Here's my slurm job

Would this work:

#!/bin/bash
LOGDIR=/scratch/general/pe-nfs1/u0138544/logs
chmod a+x "$LOGDIR"/*
# Create the joblog up front and open its permissions.
logfile="$LOGDIR"/mylog.$$
touch "$logfile"
chmod -R a+rw "$LOGDIR"
. /uufs/chpc.utah.edu/sys/installdir/sgspub/bin/sgsCP.sh
# The leading + makes --joblog append to the existing file.
parallel \
  --joblog +"$logfile" \
  --verbose \
  --jobs 50% \
  --delay 1 \
  /uufs/chpc.utah.edu/sys/installdir/sgspub/bin/chaser-10Mt 83a9a2ad-fe16-4872-b629-b9ba70ed5bbb $endtime $JOBDIR ::: {1..750}

The idea is the same as my original: Make the file and set the
permissions before starting GNU Parallel.

/Ole
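
The "make the file and set the permissions first" step can also be wrapped in a small function, so the job script fails early if the log cannot be created. This is only a sketch: the name prepare_joblog is hypothetical and not part of GNU Parallel, and it assumes the log directory may not exist yet.

```shell
#!/bin/bash
# Hypothetical helper (not part of GNU Parallel): create the joblog file
# and open its permissions before GNU Parallel is started.
prepare_joblog() {
    local logdir=$1
    local logfile="$logdir/mylog.$$"    # $$ is the PID of the job script

    mkdir -p "$logdir"   || return 1    # assumption: directory may be missing
    touch "$logfile"     || return 1    # create the (empty) joblog
    chmod a+rw "$logfile" || return 1   # readable and writable for everyone
    printf '%s\n' "$logfile"            # hand the path back to the caller
}

# Usage sketch, not executed here (mycommand is a placeholder):
#   logfile=$(prepare_joblog "$LOGDIR") || exit 1
#   parallel --joblog +"$logfile" --jobs 50% mycommand ::: {1..750}
```

Because the file already exists with the wanted permissions, the + prefix to --joblog only has to append to it.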