Re: [Beowulf] Puzzling Intel mpi behavior with slurm
On Thu, 05 Apr 2018 09:10:57 -0600 Faraz Hussain wrote:

> Here's something quite baffling. I have a cluster running slurm but
> have not setup passwordless ssh for a user yet. So when the user
> runs "mpirun -n 2 -hostfile hosts hostname", it will hang because of
> ssh issue. That is expected.
>
> Now the baffling thing is the mpirun command works inside a slurm
> script! How can it work if passwordless ssh has not been configured?
> Does slurm use some different authentication (munge?) to login to
> the hosts and execute the hostname command?

What happens is that mpirun sees the Slurm environment variables and switches to a Slurm-aware mode. In this mode it uses srun to launch pmi_proxy processes on each node of the job. It then starts all ranks through these pmi_proxy processes.

The process tree ends up looking something like this on the first node:

  slurmd->slurmstepd->bash(jobscript)->mpirun->srun -w nodes[..] pmi_proxy

And on the other nodes:

  slurmd->slurmstepd->pmi_proxy->rank[0...n]

Authentication/authorization is handled by Slurm and depends on how you set it up (often munge).

Cheers,
Peter K

___
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
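The mode switch described above can be sketched roughly as follows. This is only an illustration, not Intel MPI's actual logic: the function names are made up, and using SLURM_JOB_ID as the marker for "running inside a Slurm job" is an assumption (the real launcher may check other variables and passes many more arguments to the proxy).

```python
import os


def pick_bootstrap(env):
    """Return 'slurm' if the environment looks like a Slurm job, else 'ssh'.

    Assumption: SLURM_JOB_ID is treated as the marker. Slurm does set it
    inside jobs, but which variable mpirun really inspects is not confirmed.
    """
    return "slurm" if env.get("SLURM_JOB_ID") else "ssh"


def proxy_launch_commands(env, hosts, proxy="pmi_proxy"):
    """Build the command lines a launcher might use to start its proxies."""
    if pick_bootstrap(env) == "slurm":
        # One srun step covers every node; slurmd/slurmstepd start the
        # proxies, so no ssh login (and no ssh keys) is needed.
        return [["srun", "-w", ",".join(hosts), proxy]]
    # Outside a job, each remote host needs its own ssh login, which
    # hangs or fails when passwordless ssh is not configured.
    return [["ssh", host, proxy] for host in hosts]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for cmd in proxy_launch_commands(os.environ, ["node01", "node02"]):
        print(" ".join(cmd))
```

Run from a login shell this prints two ssh commands; run inside a job script, where Slurm has set SLURM_JOB_ID, it prints a single srun command.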
Re: [Beowulf] Puzzling Intel mpi behavior with slurm
On 04/09/2018 04:58 AM, Chris Samuel wrote:

> On Saturday, 7 April 2018 5:37:28 AM AEST Prentice Bisbal wrote:
>
>> To really complicate things, you should look at process management
>> interface (PMI). This is a middle layer between Slurm (or another
>> scheduler) and the MPI tasks. It's a standardized abstraction layer
>> to make programming MPI implementations and schedulers easier. It
>> also increases startup time of the MPI jobs, which is not
>> insignificant for large jobs.
>
> Hopefully PMI/PMI2/PMIx decreases the startup time! :-)

I think I meant to say "increases startup performance", or something like that. Thanks for catching this error, which was the exact opposite of what I was trying to say.

> There's a presentation on PMIx (the latest version) here:
>
> https://slurm.schedmd.com/SC17/PMIx-SC17.pdf
>
> You need to be careful about matching PMIx and Slurm versions; there
> is information on getting it working on both sites:
>
> https://slurm.schedmd.com/mpi_guide.html#pmix
> https://pmix.org/support/how-to/slurm-support/
>
> Hope this helps!
> Chris
Re: [Beowulf] Puzzling Intel mpi behavior with slurm
On Saturday, 7 April 2018 5:37:28 AM AEST Prentice Bisbal wrote:

> To really complicate things, you should look at process management interface
> (PMI). This is a middle layer between Slurm (or another scheduler) and the
> MPI tasks. It's a standardized abstraction layer to make programming MPI
> implementations and schedulers easier. It also increases startup time of
> the MPI jobs, which is not insignificant for large jobs.

Hopefully PMI/PMI2/PMIx decreases the startup time! :-)

There's a presentation on PMIx (the latest version) here:

https://slurm.schedmd.com/SC17/PMIx-SC17.pdf

You need to be careful about matching PMIx and Slurm versions; there is information on getting it working on both sites:

https://slurm.schedmd.com/mpi_guide.html#pmix
https://pmix.org/support/how-to/slurm-support/

Hope this helps!
Chris

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
Re: [Beowulf] Puzzling Intel mpi behavior with slurm
See the URL below for a good overview of how Slurm works:

https://slurm.schedmd.com/quickstart.html

The way I understand it, tasks are started by slurmd. SSH is not involved at all. SGE does the same thing with 'tight integration': the tasks are started on the compute nodes by sge_execd, which spawns an sge_shepherd process, which then spawns the actual task.

To really complicate things, you should look at the process management interface (PMI). This is a middle layer between Slurm (or another scheduler) and the MPI tasks. It's a standardized abstraction layer to make programming MPI implementations and schedulers easier. It also increases startup time of the MPI jobs, which is not insignificant for large jobs.

www.mcs.anl.gov/papers/P1760.pdf

Prentice

On 04/05/2018 11:10 AM, Faraz Hussain wrote:

> Here's something quite baffling. I have a cluster running slurm but
> have not setup passwordless ssh for a user yet. So when the user runs
> "mpirun -n 2 -hostfile hosts hostname", it will hang because of ssh
> issue. That is expected.
>
> Now the baffling thing is the mpirun command works inside a slurm
> script! How can it work if passwordless ssh has not been configured?
> Does slurm use some different authentication (munge?) to login to the
> hosts and execute the hostname command?
>
> Or does slurm have some fancy behind the scenes integration with
> Intel mpi ?
Re: [Beowulf] Puzzling Intel mpi behavior with slurm
At least for Grid Engine/OpenMPI, the preferred mechanism ("tight integration") has the shepherds running on each exec host start the MPI processes, with no SSH/RSH required at all.

I'm not sure if you've run across this documentation, but it might help to figure out what's going on:

https://slurm.schedmd.com/mpi_guide.html#intel_mpi

I'm guessing you're using the "srun" method right now.

Skylar

On Thu, Apr 5, 2018 at 8:10 AM, Faraz Hussain wrote:

> Here's something quite baffling. I have a cluster running slurm but have
> not setup passwordless ssh for a user yet. So when the user runs "mpirun -n
> 2 -hostfile hosts hostname", it will hang because of ssh issue. That is
> expected.
>
> Now the baffling thing is the mpirun command works inside a slurm script!
> How can it work if passwordless ssh has not been configured? Does slurm use
> some different authentication (munge?) to login to the hosts and execute
> the hostname command?
>
> Or does slurm have some fancy behind the scenes integration with Intel mpi
> ?
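For reference, the "srun" method in that guide, as I read it, amounts to pointing Intel MPI at Slurm's PMI library through the I_MPI_PMI_LIBRARY environment variable and launching the ranks directly with srun instead of mpirun. A minimal sketch of building such an invocation is below; the helper name and the library path are hypothetical, and the right path depends on where Slurm is installed on your cluster.

```python
import os


def srun_invocation(pmi_library, ntasks, executable, base_env=None):
    """Return (env, argv) for launching an Intel MPI program through srun.

    pmi_library is the site-specific path to Slurm's libpmi.so; the value
    used in the example below is a placeholder, not a real location.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Tell Intel MPI to talk to Slurm's PMI instead of starting its own proxies.
    env["I_MPI_PMI_LIBRARY"] = pmi_library
    argv = ["srun", "-n", str(ntasks), executable]
    return env, argv


if __name__ == "__main__":
    env, argv = srun_invocation("/path/to/slurm/lib/libpmi.so", 2, "./a.out")
    print("I_MPI_PMI_LIBRARY=" + env["I_MPI_PMI_LIBRARY"], " ".join(argv))
```

The pair can be passed to subprocess.run(argv, env=env) from inside an allocation; nothing here touches ssh either.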
Re: [Beowulf] Puzzling Intel mpi behavior with slurm
I'm pretty sure, but don't quote me, that Slurm forks processes from slurmd to launch code and does not use ssh.

On Thu, Apr 5, 2018 at 11:10 AM, Faraz Hussain wrote:

> Here's something quite baffling. I have a cluster running slurm but have not
> setup passwordless ssh for a user yet. So when the user runs "mpirun -n 2
> -hostfile hosts hostname", it will hang because of ssh issue. That is
> expected.
>
> Now the baffling thing is the mpirun command works inside a slurm script!
> How can it work if passwordless ssh has not been configured? Does slurm use
> some different authentication (munge?) to login to the hosts and execute the
> hostname command?
>
> Or does slurm have some fancy behind the scenes integration with Intel mpi ?