Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-12-01 Thread Gilles Gouaillardet
Marc, I am not aware of any MPI implementation in which mpirun does the job allocation. Instead, mpirun gets job info from the batch manager (e.g. number of nodes) so the job can be launched seamlessly and be properly killed in case of a job abort (bkill or equivalent). Cheers, Gilles On

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-12-01 Thread marc . hoeppner
Hi, sorry for the late reply - I've been traveling with limited email access. I think you can leave this issue be. I was hoping for a way to just launch mpirun and have it create the allocation by itself. It's not super important right now, more something I was wondering about.

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-28 Thread Ralph Castain
Hey Marc - just wanted to check to see if you felt this would indeed solve the problem for you. I’d rather not invest the time if this isn’t going to meet the need, and I honestly don’t know of a better solution. > On Nov 20, 2014, at 2:13 PM, Ralph Castain wrote: > >

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-20 Thread Ralph Castain
Here’s what I can provide:

* lsrun -n N bash
  This causes openlava to create an allocation and start you off in a bash shell (or pick your shell)
* mpirun …..
  Will read the allocation and use openlava to start the daemons, and then the application, on the allocated nodes

You can execute as
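
[Editorial sketch, not part of the original message.] To illustrate what "read the allocation" could involve, here is a minimal Python sketch. It assumes the scheduler exports the allocation in the LSF-style LSB_MCPU_HOSTS format ("host1 n1 host2 n2 ..."). The thread does not confirm which variables OpenLava actually sets, and the function name parse_mcpu_hosts is hypothetical.

```python
import os


def parse_mcpu_hosts(value):
    """Turn an LSF-style 'hostA 4 hostB 2' string into {host: slots}.

    Assumption: the allocation arrives as alternating host names and
    slot counts, as in LSF's LSB_MCPU_HOSTS variable.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating host/slot pairs")
    slots = {}
    # Pair each host name (even positions) with its slot count (odd positions).
    for host, count in zip(tokens[0::2], tokens[1::2]):
        slots[host] = slots.get(host, 0) + int(count)
    return slots


if __name__ == "__main__":
    # Outside a scheduler-created allocation the variable is unset,
    # so fall back to an empty string and print nothing.
    allocation = parse_mcpu_hosts(os.environ.get("LSB_MCPU_HOSTS", ""))
    for host, count in allocation.items():
        print(f"{host}: {count} slot(s)")
```

Run inside the shell that the scheduler started, a launcher could use such a mapping to decide where to start its daemons; how Open MPI would actually obtain the allocation under OpenLava is not stated in the thread.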

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-20 Thread Marc Höppner
Hi, yes, lsrun exists under openlava. Using mpirun is fine, but openlava currently requires that to be launched through a bash script (openmpi-mpirun). Would be neater if one could do away with that. Again, thanks for looking into this! /Marc

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-20 Thread Ralph Castain
Hold on - was discussing this with a (possibly former) OpenLava developer who made some suggestions that would make this work. It all hinges on one thing. Can you please check and see if you have “lsrun” on your system? If you do, then I can offer a tight integration in that we would use

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-18 Thread Marc Höppner
Hi Ralph, I really appreciate you guys looking into this! At least now I know that there isn't a better way to run MPI jobs. Probably worth looking into LSF again... Cheers, Marc

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-18 Thread Ralph Castain
I took a brief gander at the OpenLava source code, and a couple of things jump out. First, OpenLava is a batch scheduler and only supports batch execution - there is no interactive command for "run this job". So you would have to "bsub" mpirun regardless. Once you submit the job, mpirun can

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-18 Thread Ralph Castain
If you could just run a single copy of "env" and send the output along, that would help a lot. I'm not interested in the usual path etc, but would like to see the envars that OpenLava is setting. Thanks Ralph On Tue, Nov 18, 2014 at 2:19 AM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org>
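
[Editorial sketch, not part of the original message.] One way to produce the requested listing without the usual PATH noise is to filter the environment by name prefix. The sketch below assumes that OpenLava, being derived from LSF, names its variables with the LSB_ or LSF_ prefixes; that is an assumption, and the prefixes should be adjusted to whatever the system actually sets. The function name scheduler_envars is hypothetical.

```python
import os

# Assumed prefixes: LSF-derived schedulers commonly use LSB_* and LSF_* names.
SCHEDULER_PREFIXES = ("LSB_", "LSF_")


def scheduler_envars(environ=None, prefixes=SCHEDULER_PREFIXES):
    """Return only the variables whose names start with one of the prefixes."""
    if environ is None:
        environ = os.environ
    # str.startswith accepts a tuple, so one call checks every prefix.
    return {
        name: value
        for name, value in sorted(environ.items())
        if name.startswith(prefixes)
    }


if __name__ == "__main__":
    # Print in the same NAME=value form that "env" uses.
    for name, value in scheduler_envars().items():
        print(f"{name}={value}")
```

Running this from inside a submitted job, rather than from a login shell, is what would show the variables the scheduler adds.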

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-18 Thread Gilles Gouaillardet
Marc, the reply you pointed to is a bit confusing to me:
"There is a native C API which can submit/start/stop/kill/requeue jobs": this is not what I am looking for :-(
"you need to make an appropriate call to openlava to start a remote process": this is what I am interested in :-)
could you be

Re: [OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-18 Thread Gilles Gouaillardet
Hi Marc, OpenLava is based on a pretty old version of LSF (4.x if I remember correctly) and I do not think LSF had support for tight integration of parallel jobs at that time. My understanding is that basically, there are two kinds of direct integration:
- mpirun launch: mpirun spawns orted via the

[OMPI devel] Question about tight integration with not-yet-supported queuing systems

2014-11-18 Thread Marc Höppner
Hi list, I have recently started to wonder how hard it would be to add support for queuing systems to the tight integration function of OpenMPI (unfortunately, I am not a developer myself). Specifically, we are working with OpenLava (www.openlava.org), which is based on an early version of