Hey Marc - just wanted to check to see if you felt this would indeed solve the problem for you. I’d rather not invest the time if this isn’t going to meet the need, and I honestly don’t know of a better solution.
> On Nov 20, 2014, at 2:13 PM, Ralph Castain <r...@open-mpi.org> wrote: > > Here’s what I can provide: > > * lsrun -n N bash This causes openlava to create an allocation and start you > off in a bash shell (or pick your shell) > > * mpirun ….. Will read the allocation and use openlava to start the > daemons, and then the application, on the allocated nodes > > You can execute as many mpirun’s as you like, then release the allocation (I > believe by simply exiting the shell) when done. > > Does that match your expectations? > Ralph > > >> On Nov 20, 2014, at 2:03 PM, Marc Höppner <marc.hoepp...@bils.se >> <mailto:marc.hoepp...@bils.se>> wrote: >> >> Hi, >> >> yes, lsrun exists under openlava. >> >> Using mpirun is fine, but openlava currently requires that to be launched >> through a bash script (openmpi-mpirun). Would be neater if one could do away >> with that. >> >> Agan, thanks for looking into this! >> >> /Marc >> >>> Hold on - was discussing this with a (possibly former) OpenLava developer >>> who made some suggestions that would make this work. It all hinges on one >>> thing. >>> >>> Can you please check and see if you have “lsrun” on your system? If you do, >>> then I can offer a tight integration in that we would use OpenLava to >>> actually launch the OMPI daemons. Still not sure I could support you >>> directly launching MPI apps without using mpirun, if that’s what you are >>> after. >>> >>> >>>> On Nov 18, 2014, at 7:58 AM, Marc Höppner <marc.hoepp...@bils.se >>>> <mailto:marc.hoepp...@bils.se>> wrote: >>>> >>>> Hi Ralph, >>>> >>>> I really appreciate you guys looking into this! At least now I know that >>>> there isn't a better way to run mpi jobs. Probably worth looking into LSF >>>> again.. >>>> >>>> Cheers, >>>> >>>> Marc >>>>> I took a brief gander at the OpenLava source code, and a couple of things >>>>> jump out. First, OpenLava is a batch scheduler and only supports batch >>>>> execution - there is no interactive command for "run this job". So you >>>>> would have to "bsub" mpirun regardless. >>>>> >>>>> Once you submit the job, mpirun can certainly read the local allocation >>>>> via the environment. However, we cannot use the OpenLava internal >>>>> functions to launch the daemons or processes as the code is GPL2, and >>>>> thus has a viral incompatible license. Ordinarily, we get around that by >>>>> just executing the interactive job execution command, but OpenLava >>>>> doesn't have one. >>>>> >>>>> So we'd have no other choice but to use ssh to launch the daemons on the >>>>> remote nodes. This is exactly what the provided openmpi wrapper script >>>>> that comes with OpenLava already does. >>>>> >>>>> Bottom line: I don't see a way to do any deeper integration minus the >>>>> interactive execution command. If OpenLava had a way of getting an >>>>> allocation and then interactively running jobs, we could support what you >>>>> requested. This doesn't seem to be what they are intending, unless I'm >>>>> missing something (the documentation is rather incomplete). >>>>> >>>>> Ralph >>>>> >>>>> >>>>> On Tue, Nov 18, 2014 at 6:20 AM, Marc Höppner <marc.hoepp...@bils.se >>>>> <mailto:marc.hoepp...@bils.se>> wrote: >>>>> Hi, >>>>> >>>>> sure, no problem. And about the C Api, I really don’t know more than what >>>>> I was told in the google group post I referred to (i.e. the API is >>>>> essentially identical to LSF 4-6, which should be on the web). >>>>> >>>>> The output of env can be found here: >>>>> https://dl.dropboxusercontent.com/u/1918141/env.txt >>>>> <https://dl.dropboxusercontent.com/u/1918141/env.txt> >>>>> >>>>> /M >>>>> >>>>> Marc P. Hoeppner, PhD >>>>> Team Leader >>>>> BILS Genome Annotation Platform >>>>> Department for Medical Biochemistry and Microbiology >>>>> Uppsala University, Sweden >>>>> marc.hoepp...@bils.se <mailto:marc.hoepp...@bils.se> >>>>>> On 18 Nov 2014, at 15:14, Ralph Castain <r...@open-mpi.org >>>>>> <mailto:r...@open-mpi.org>> wrote: >>>>>> >>>>>> If you could just run a single copy of "env" and send the output along, >>>>>> that would help a lot. I'm not interested in the usual path etc, but >>>>>> would like to see the envars that OpenLava is setting. >>>>>> >>>>>> Thanks >>>>>> Ralph >>>>>> >>>>>> >>>>>> On Tue, Nov 18, 2014 at 2:19 AM, Gilles Gouaillardet >>>>>> <gilles.gouaillar...@iferc.org <mailto:gilles.gouaillar...@iferc.org>> >>>>>> wrote: >>>>>> Marc, >>>>>> >>>>>> the reply you pointed is a bit confusing to me : >>>>>> >>>>>> "There is a native C API which can submit/start/stop/kill/re queue jobs" >>>>>> this is not what i am looking for :-( >>>>>> >>>>>> "you need to make an appropriate call to openlava to start a remote >>>>>> process" >>>>>> this is what i am interested in :-) >>>>>> could you be more specific (e.g. point me to the functions, since the >>>>>> OpenLava doc is pretty minimal ...) >>>>>> >>>>>> the goal here is to spawn the orted daemons as part of the parallel job, >>>>>> so these daemons are accounted within the parallel job. >>>>>> /* if we use an API that simply spawns orted, but the orted is not >>>>>> related whatsoever to the parallel job, >>>>>> then we can simply use ssh */ >>>>>> >>>>>> Cheers, >>>>>> >>>>>> Gilles >>>>>> >>>>>> >>>>>> On 2014/11/18 18:24, Marc Höppner wrote: >>>>>>> Hi Gilles, >>>>>>> >>>>>>> thanks for the prompt reply. Yes, as far as I know there is a C API to >>>>>>> interact with jobs etc. Some mentioning here: >>>>>>> https://groups.google.com/forum/#!topic/openlava-users/w74cRUe9Y9E >>>>>>> <https://groups.google.com/forum/#%21topic/openlava-users/w74cRUe9Y9E> >>>>>>> <https://groups.google.com/forum/#!topic/openlava-users/w74cRUe9Y9E> >>>>>>> <https://groups.google.com/forum/#%21topic/openlava-users/w74cRUe9Y9E> >>>>>>> >>>>>>> >>>>>>> /Marc >>>>>>> >>>>>>> Marc P. Hoeppner, PhD >>>>>>> Team Leader >>>>>>> BILS Genome Annotation Platform >>>>>>> Department for Medical Biochemistry and Microbiology >>>>>>> Uppsala University, Sweden >>>>>>> marc.hoepp...@bils.se <mailto:marc.hoepp...@bils.se> >>>>>>> >>>>>>>> On 18 Nov 2014, at 08:40, Gilles Gouaillardet >>>>>>>> <gilles.gouaillar...@iferc.org> <mailto:gilles.gouaillar...@iferc.org> >>>>>>>> wrote: >>>>>>>> >>>>>>>> Hi Marc, >>>>>>>> >>>>>>>> OpenLava is based on a pretty old version of LSF (4.x if i remember >>>>>>>> correctly) >>>>>>>> and i do not think LSF had support for parallel jobs tight integration >>>>>>>> at that time. >>>>>>>> >>>>>>>> my understanding is that basically, there is two kind of direct >>>>>>>> integration : >>>>>>>> - mpirun launch: mpirun spawns orted via the API provided by the batch >>>>>>>> manager >>>>>>>> - direct launch: the mpi tasks are launched directly from the >>>>>>>> script/command line and no mpirun/orted is involved >>>>>>>> at that time, it works with SLURM and possibly other PMI capable batch >>>>>>>> manager >>>>>>>> >>>>>>>> i think OpenLava simply gets a list of hosts from the environment, >>>>>>>> build >>>>>>>> a machinefile, pass it to mpirun that spawns orted with ssh, so this is >>>>>>>> really loose integration. >>>>>>>> >>>>>>>> OpenMPI is based on plugins, so as long as the queing system provides >>>>>>>> an >>>>>>>> API to start/stop/kill tasks, mpirun launch should not >>>>>>>> be a huge effort. >>>>>>>> >>>>>>>> Are you aware of such an API provided by OpenLava ? >>>>>>>> >>>>>>>> Cheers, >>>>>>>> >>>>>>>> Gilles >>>>>>>> >>>>>>>> On 2014/11/18 16:31, Marc Höppner wrote: >>>>>>>>> Hi list, >>>>>>>>> >>>>>>>>> I have recently started to wonder how hard it would be to add support >>>>>>>>> for queuing systems to the tight integration function of OpenMPI >>>>>>>>> (unfortunately, I am not a developer myself). Specifically, we are >>>>>>>>> working with OpenLava (www.openlava.org <http://www.openlava.org/>), >>>>>>>>> which is based on an early version of Lava/LSF and open source. It’s >>>>>>>>> proven quite useful in environments where some level of LSF >>>>>>>>> compatibility is needed, but without actually paying for a (rather >>>>>>>>> pricey) LSF license. >>>>>>>>> >>>>>>>>> Given that openLava shares quite a bit of DNA with LSF, I was >>>>>>>>> wondering how hard it would be to add OL tight integration support to >>>>>>>>> OpenMPI. Currently, OL enables OpenMPI jobs through a wrapper script, >>>>>>>>> but that’s obviously not ideal and doesn’t work for some programs >>>>>>>>> that have MPI support built-in (and thus expect to be able to just >>>>>>>>> execute mpirun). >>>>>>>>> >>>>>>>>> Any thoughts on this would be greatly appreciated! >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> >>>>>>>>> Marc >>>>>>>>> >>>>>>>>> Marc P. Hoeppner, PhD >>>>>>>>> Team Leader >>>>>>>>> BILS Genome Annotation Platform >>>>>>>>> Department for Medical Biochemistry and Microbiology >>>>>>>>> Uppsala University, Sweden >>>>>>>>> marc.hoepp...@bils.se <mailto:marc.hoepp...@bils.se> >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> devel mailing list >>>>>>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>>>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>>>>>> Link to this post: >>>>>>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16312.php >>>>>>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16312.php> >>>>>>>> _______________________________________________ >>>>>>>> devel mailing list >>>>>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>>>>> Link to this post: >>>>>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16313.php >>>>>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16313.php> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> devel mailing list >>>>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>>>> Link to this post: >>>>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16314.php >>>>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16314.php> >>>>>> >>>>>> _______________________________________________ >>>>>> devel mailing list >>>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>>> Link to this post: >>>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16315.php >>>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16315.php> >>>>>> >>>>>> _______________________________________________ >>>>>> devel mailing list >>>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>>> Link to this post: >>>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16316.php >>>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16316.php> >>>>> >>>>> _______________________________________________ >>>>> devel mailing list >>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>> Link to this post: >>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16317.php >>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16317.php> >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> devel mailing list >>>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>>> Link to this post: >>>>> http://www.open-mpi.org/community/lists/devel/2014/11/16318.php >>>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16318.php> >>>> _______________________________________________ >>>> devel mailing list >>>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>>> Link to this post: >>>> http://www.open-mpi.org/community/lists/devel/2014/11/16319.php >>>> <http://www.open-mpi.org/community/lists/devel/2014/11/16319.php> >>> >>> >>> _______________________________________________ >>> devel mailing list >>> de...@open-mpi.org <mailto:de...@open-mpi.org> >>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >>> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >>> Link to this post: >>> http://www.open-mpi.org/community/lists/devel/2014/11/16326.php >>> <http://www.open-mpi.org/community/lists/devel/2014/11/16326.php> >> _______________________________________________ >> devel mailing list >> de...@open-mpi.org <mailto:de...@open-mpi.org> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel >> <http://www.open-mpi.org/mailman/listinfo.cgi/devel> >> Link to this post: >> http://www.open-mpi.org/community/lists/devel/2014/11/16327.php >> <http://www.open-mpi.org/community/lists/devel/2014/11/16327.php>