Re: [OMPI devel] Running on Kubernetes

2018-03-16 Thread r...@open-mpi.org
I haven’t really spent any time with Kubernetes, but it seems to me you could just write a Kubernetes plm (and maybe an odls) component and bypass the ssh stuff completely given that you say there is a launcher API. > On Mar 16, 2018, at 11:02 AM, Jeff Squyres (jsquyres) >

Re: [OMPI devel] Running on Kubernetes

2018-03-16 Thread Jeff Squyres (jsquyres)
On Mar 16, 2018, at 10:01 AM, Gilles Gouaillardet wrote: > > By default, Open MPI uses the rsh PLM in order to start a job. To clarify one thing here: the name of our plugin is "rsh" for historical reasons, but it defaults to looking for "ssh" first.

Re: [OMPI devel] Upcoming nightly tarball URL changes

2018-03-16 Thread Barrett, Brian via devel
Eventual consistency for the win. It looks like I forgot to set a short cache time for the CDN that fronts the artifact repository. So the previous day’s file was returned. I fixed that and flushed the cache on the CDN, so it should work now. Brian On Mar 15, 2018, at 10:41 PM, Boris

Re: [OMPI devel] Running on Kubernetes

2018-03-16 Thread Gilles Gouaillardet
Hi Rong, SSH is safe when properly implemented. That being said, some sites do not allow end users to SSH directly into compute nodes because they do not want them to do anything without the resource manager knowing about it. What is your concern with SSH? You can run a resource manager (such

[OMPI devel] Running on Kubernetes

2018-03-16 Thread Rong Ou
Hi, I've implemented a Kubernetes operator for running Open MPI jobs there. An operator is just a Custom Resource Definition (CRD) for a new type of MPI-aware batch job, and a custom controller that manages the states of those jobs.