On 7/28/22 14:56, Rob Sargent wrote:
On Jul 28, 2022, at 1:10 AM, Christian Meesters <meest...@uni-mainz.de> wrote:
Hi,
not quite. Under SLURM the job step starter (SLURM lingo) is "srun". You do not ssh from job host
to job host; rather, you use "parallel" as a semaphore to avoid oversubscription of job steps
launched with "srun". I summarized this approach here:
https://mogonwiki.zdv.uni-mainz.de/dokuwiki/start:working_on_mogon:workflow_organization:node_local_scheduling#running_on_several_hosts
(uh-oh - I need to clean up that site, many sections there are outdated, but this
one should still be OK)
One advantage: you can safely utilize the resources of both (or more) hosts -
the master host and all secondaries. How many resources you require depends on
your application and the work it does. Be sure to consider I/O (e.g. stage in
files to avoid random I/O from too many concurrent applications, etc.), if this
is an issue for your application.
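For illustration, a minimal batch-script sketch of the pattern Christian describes - GNU parallel as a semaphore, srun launching each task as a job step on the allocated nodes. The resource numbers, the input glob, and ./my_app are placeholders, and the exact flags may differ from the wiki recipe linked above:

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1

# Run at most $SLURM_NTASKS job steps at once; parallel queues the rest.
# "srun --exclusive -N1 -n1" confines each step to its own task slot, so
# job steps are not oversubscribed (newer Slurm versions may want --exact
# instead of --exclusive for this per-step behaviour).
parallel --jobs "$SLURM_NTASKS" \
    srun --exclusive -N1 -n1 ./my_app {} ::: inputs/*.dat
```

Because srun places the steps, the tasks spread across all allocated nodes without any ssh or manual intervention.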
Cheers
Christian
Christian,
My use of GNU parallel does not include ssh. Rather, I simply fill the Slurm
node with --jobs=ncores.
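A sketch of that single-node pattern, assuming a batch job rather than an interactive session; the core count, input glob, and ./my_app are placeholders:

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32

# SLURM_CPUS_PER_TASK reflects the allocation above; fall back to nproc
# when testing outside a job. parallel keeps that many tasks running on
# the one node - no ssh involved.
CORES=${SLURM_CPUS_PER_TASK:-$(nproc)}
parallel --jobs "$CORES" ./my_app {} ::: inputs/*.dat
```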
That would require an interactive job with
ncores_per_node/threads_per_application ssh connections, and you have to
trigger the script manually. My solution is to use parallel in a
SLURM-job context and avoid that manual synchronization step, while also
allowing a potential multi-node job with SMP applications. It's your
choice, of course.
Ole,
Is your suggestion that I should ssh back to my account and run the job?
Pretty sure 2FA will get in the way.
Thanks to you both,
rjs