Re: [OMPI users] LSF launch with OpenMPI
Sorry about the typo; yes, I meant OMPI 1.3.2.

Mehdi

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: May-07-09 12:07 PM
To: Open MPI Users
Subject: Re: [OMPI users] LSF launch with OpenMPI

Did you mean OMPI 1.3.2? OMPI 1.2.3 did not have LSF support.

--
Jeff Squyres
Cisco Systems

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] LSF launch with OpenMPI
Did you mean OMPI 1.3.2? OMPI 1.2.3 did not have LSF support.

On May 7, 2009, at 9:50 AM, Mehdi Bozzo-Rey wrote:

> Hi Jeff,
>
> I tried several combinations and:
>
> - LIBS=... does not work for OpenMPI 1.2.3 / LSF 7.0.5
> - the winner for now is LSF 7.0.4 / OpenMPI 1.2.3
>
> Cheers,
>
> Mehdi

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] LSF launch with OpenMPI
Hi,

Thank you for the tip, this seems to be what I was looking for.

Matthieu

2009/5/7 Mehdi Bozzo-Rey <mbozz...@platform.com>:
> Hello Jeroen,
>
> There are 2 ways of launching OpenMPI jobs (using a recent version of LSF):
>
> 1. The one you have just described; it uses the generic PJL (Parallel
>    Job Launcher) framework. You can easily recognise it because of the
>    use of the -a openmpi flag and mpirun.lsf
>
> 2. In recent versions of LSF, another framework is also available, and
>    it permits a tight (native) integration with the MPIs (this is why
>    there is the OpenMPI integration)
>
> So, for 1., a typical command line would be, as you mentioned, something
> like:
>
> bsub -o %J.out -e %J.err -n 4 -R "span[ptile=1]" -a openmpi mpirun.lsf ./test
>
> And for 2., you would use something like:
>
> bsub -o %J.out -e %J.err -n 4 -R "span[ptile=1]" mpirun ./test
>
> Cheers,
>
> Mehdi

--
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Re: [OMPI users] LSF launch with OpenMPI
Hi Jeff,

I just tried it: OpenMPI 1.3.2 (compiled with no LSF support) / LSF 7.0.4 and the PJL framework (-a openmpi / mpirun.lsf), and everything looks fine.

Cheers,

Mehdi

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: May-05-09 9:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] LSF launch with OpenMPI

On May 5, 2009, at 9:25 AM, Jeroen Kleijer wrote:

> If you wish to submit to lsf using its native commands (bsub) you
> can do the following:
>
> bsub -q ${QUEUE} -a openmpi -n ${CPUS} "mpirun.lsf -x PATH -x LD_LIBRARY_PATH -x MPI_BUFFER_SIZE ${COMMAND} ${OPTIONS}"

I had forgotten about this. I should ask my LSF contacts if this method still works with Open MPI v1.3 (which natively supports LSF), or whether strange / interesting failures occur because the integration that mpirun.lsf does ends up effectively conflicting with what OMPI's mpirun does internally...

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] LSF launch with OpenMPI
Hello Jeroen,

There are 2 ways of launching OpenMPI jobs (using a recent version of LSF):

1. The one you have just described; it uses the generic PJL (Parallel Job Launcher) framework. You can easily recognise it because of the use of the -a openmpi flag and mpirun.lsf

2. In recent versions of LSF, another framework is also available, and it permits a tight (native) integration with the MPIs (this is why there is the OpenMPI integration)

So, for 1., a typical command line would be, as you mentioned, something like:

bsub -o %J.out -e %J.err -n 4 -R "span[ptile=1]" -a openmpi mpirun.lsf ./test

And for 2., you would use something like:

bsub -o %J.out -e %J.err -n 4 -R "span[ptile=1]" mpirun ./test

Cheers,

Mehdi

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeroen Kleijer
Sent: May-05-09 9:26 AM
To: Open MPI Users
Subject: Re: [OMPI users] LSF launch with OpenMPI

> If you wish to submit to lsf using its native commands (bsub) you can do
> the following:
>
> bsub -q ${QUEUE} -a openmpi -n ${CPUS} "mpirun.lsf -x PATH -x LD_LIBRARY_PATH -x MPI_BUFFER_SIZE ${COMMAND} ${OPTIONS}"
>
> It should be noted that in this case you don't call OpenMPI's mpirun
> directly but use the mpirun.lsf, a wrapper script provided by LSF. This
> wrapper script takes care of setting the necessary environment variables
> and eventually calls the correct mpirun. (the option "-a openmpi" tells
> LSF that we're using OpenMPI so don't try to autodetect)
>
> Regards,
>
> Jeroen Kleijer
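Editor's sketch of the two submission styles described above. This is only a dry run under stated assumptions: `lsf_submit_cmd` and the mode names `pjl` and `native` are made up for illustration and are not part of LSF or Open MPI, the `-R "span[ptile=1]"` option is left out to keep quoting simple, and the function prints the command rather than calling bsub. The bsub options themselves are the ones quoted in this thread.

```shell
#!/usr/bin/env bash
# Dry run: print the bsub command line for either launch style.
# lsf_submit_cmd and the modes "pjl"/"native" are hypothetical names.
lsf_submit_cmd() {
    local mode=$1 nprocs=$2 app=$3
    local common="bsub -o %J.out -e %J.err -n ${nprocs}"
    case "$mode" in
        pjl)
            # Generic PJL framework: go through LSF's mpirun.lsf wrapper.
            echo "${common} -a openmpi mpirun.lsf ${app}" ;;
        native)
            # Open MPI built with LSF support: call mpirun directly.
            echo "${common} mpirun ${app}" ;;
        *)
            echo "unknown mode: ${mode}" >&2
            return 1 ;;
    esac
}

lsf_submit_cmd pjl 4 ./test
lsf_submit_cmd native 4 ./test
```

The only difference between the two lines it prints is `-a openmpi mpirun.lsf` versus a bare `mpirun`, which is the distinction the message draws.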
Re: [OMPI users] LSF launch with OpenMPI
2009/5/6 Jeff Squyres:
>> I've enclosed the configure output as well as the config.log. The
>> problem is that my LSF (I didn't install it) 7.0.3 needs libbat to be
>> linked against liblsbstream (I modified the configure script to add
>> -llsbstream, and it compiled).
>
> Huh! Odd -- we didn't need that before. Let me check with Platform...
>
> FWIW, you should be able to run like this without modifying configure:
>
> ./configure LIBS=-llsbstream etc
>
> That should add -llsbstream in the Right places.

Thanks, I'll try this (provided I'm able to run OpenMPI 1.3.2; I have some strange errors I didn't get with 1.2.8).

Matthieu
Re: [OMPI users] LSF launch with OpenMPI
On May 5, 2009, at 10:01 AM, Matthieu Brucher wrote:

> > What Terry said is correct. It means that "mpirun" will use, under the
> > covers, the "native" launching mechanism of LSF to launch jobs (vs., say,
> > rsh or ssh). It'll also discover the hosts to use for this job without the
> > use of a hostfile -- it'll query LSF directly to see what hosts it should
> > use.
>
> OK, so I have to do something like:
> bsub -n ${CPUS} mpirun myapplication
>
> Is it what I think?

I don't know what you think. ;-) But I think that your above command might be correct. You want *1* copy of mpirun to execute. Hence, if

bsub -n ${CPUS} uptime

launches ${CPUS} copies of uptime, then the above command is not correct. You want to submit a ${CPUS}-processor job to LSF and have *one* copy of "mpirun myapplication" run -- mpirun will then invoke the underlying stuff to launch ${CPUS} copies of myapplication and join them together into a single MPI job.

> I've enclosed the configure output as well as the config.log. The
> problem is that my LSF (I didn't install it) 7.0.3 needs libbat to be
> linked against liblsbstream (I modified the configure script to add
> -llsbstream, and it compiled).

Huh! Odd -- we didn't need that before. Let me check with Platform...

FWIW, you should be able to run like this without modifying configure:

./configure LIBS=-llsbstream etc

That should add -llsbstream in the Right places.

--
Jeff Squyres
Cisco Systems
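Editor's sketch of the uptime check suggested above. It rests on assumptions: `count_uptime_copies` and `report_launch` are hypothetical helper names, and the count relies on each copy of uptime printing one line that contains "load average". Matching on that text, rather than counting all lines, is meant to ignore anything else that ends up in the job's output file.

```shell
#!/usr/bin/env bash
# Count how many copies of uptime wrote to a job output file.
# Assumption: each uptime invocation prints one "load average" line.
count_uptime_copies() {
    # grep -c exits non-zero when the count is 0; keep the printed "0".
    grep -c 'load average' "$1" || true
}

# Turn the count into the conclusion drawn in the message above.
report_launch() {
    local copies
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    copies=$(count_uptime_copies "$1")
    if [ "$copies" -eq 1 ]; then
        echo "1 copy ran: bsub -n N mpirun myapplication should start a single mpirun"
    else
        echo "${copies} copies ran: the same form would start mpirun that many times"
    fi
}

# Usage, once the test job has finished:
#   bsub -n 4 -o uptime.out uptime
#   report_launch uptime.out
```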
Re: [OMPI users] LSF launch with OpenMPI
2009/5/5 Jeff Squyres:
> On May 5, 2009, at 6:10 AM, Matthieu Brucher wrote:
>
>> The first is what the support of LSF by OpenMPI means. When mpirun is
>> executed, is it an LSF job that is actually run? Or what does it
>> imply? I've tried to search on the openmpi website as well as on the
>> internet, but I couldn't find a clear answer/use case.
>
> What Terry said is correct. It means that "mpirun" will use, under the
> covers, the "native" launching mechanism of LSF to launch jobs (vs., say,
> rsh or ssh). It'll also discover the hosts to use for this job without the
> use of a hostfile -- it'll query LSF directly to see what hosts it should
> use.

OK, so I have to do something like:
bsub -n ${CPUS} mpirun myapplication

Is it what I think?

>> My second question is about the LSF detection. lsf.h is detected, but
>> when lsb_launch is searched for in libbat.so, it fails because
>> parse_time and parse_time_ex are not found. Is there a way to add
>> additional lsf libraries so that the search can be done?
>
> Can you send all the data shown here:
>
> http://www.open-mpi.org/community/help/

I've enclosed the configure output as well as the config.log. The problem is that my LSF (I didn't install it) 7.0.3 needs libbat to be linked against liblsbstream (I modified the configure script to add -llsbstream, and it compiled).

I can't use the official way of launching a batch job; LSF doesn't pick up the correct LSF script wrapper (due to a bogus installation).

Thank you for all the answers! (I will have others, as I'm trying to use the InfiniPath support as well.)

Matthieu

Attachment: output.tar.bz (binary data)
Re: [OMPI users] LSF launch with OpenMPI
On May 5, 2009, at 9:25 AM, Jeroen Kleijer wrote:

> If you wish to submit to lsf using its native commands (bsub) you can do
> the following:
>
> bsub -q ${QUEUE} -a openmpi -n ${CPUS} "mpirun.lsf -x PATH -x LD_LIBRARY_PATH -x MPI_BUFFER_SIZE ${COMMAND} ${OPTIONS}"
>
> It should be noted that in this case you don't call OpenMPI's mpirun
> directly but use the mpirun.lsf, a wrapper script provided by LSF. This
> wrapper script takes care of setting the necessary environment variables
> and eventually calls the correct mpirun. (the option "-a openmpi" tells
> LSF that we're using OpenMPI so don't try to autodetect)

I had forgotten about this. I should ask my LSF contacts if this method still works with Open MPI v1.3 (which natively supports LSF), or whether strange / interesting failures occur because the integration that mpirun.lsf does ends up effectively conflicting with what OMPI's mpirun does internally...

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] LSF launch with OpenMPI
If you wish to submit to LSF using its native commands (bsub), you can do the following:

bsub -q ${QUEUE} -a openmpi -n ${CPUS} "mpirun.lsf -x PATH -x LD_LIBRARY_PATH -x MPI_BUFFER_SIZE ${COMMAND} ${OPTIONS}"

It should be noted that in this case you don't call OpenMPI's mpirun directly but use mpirun.lsf, a wrapper script provided by LSF. This wrapper script takes care of setting the necessary environment variables and eventually calls the correct mpirun. (The option "-a openmpi" tells LSF that we're using OpenMPI, so it doesn't try to autodetect.)

Regards,

Jeroen Kleijer

On Tue, May 5, 2009 at 2:23 PM, Jeff Squyres wrote:
> What Terry said is correct. It means that "mpirun" will use, under the
> covers, the "native" launching mechanism of LSF to launch jobs (vs., say,
> rsh or ssh). It'll also discover the hosts to use for this job without the
> use of a hostfile -- it'll query LSF directly to see what hosts it should
> use.
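Editor's sketch of how the repeated `-x NAME` arguments in the command above could be generated from a list of variable names. `export_flags` is a hypothetical helper, not something LSF or Open MPI provides, and `./myapplication` is a placeholder; the script only prints the resulting mpirun.lsf command line.

```shell
#!/usr/bin/env bash
# Build "-x NAME" arguments, one per environment variable name given.
# export_flags is a hypothetical helper name.
export_flags() {
    local flags="" name
    for name in "$@"; do
        # Add a separating space only once flags is non-empty.
        flags="${flags:+$flags }-x ${name}"
    done
    echo "$flags"
}

# Print (do not run) the wrapper invocation with the variables from the thread.
echo "mpirun.lsf $(export_flags PATH LD_LIBRARY_PATH MPI_BUFFER_SIZE) ./myapplication"
```

Keeping the variable names in one list makes it harder to forget one when the submission line is copied between job scripts.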
Re: [OMPI users] LSF launch with OpenMPI
On May 5, 2009, at 6:10 AM, Matthieu Brucher wrote:

> The first is what the support of LSF by OpenMPI means. When mpirun is
> executed, is it an LSF job that is actually run? Or what does it
> imply? I've tried to search on the openmpi website as well as on the
> internet, but I couldn't find a clear answer/use case.

What Terry said is correct. It means that "mpirun" will use, under the covers, the "native" launching mechanism of LSF to launch jobs (vs., say, rsh or ssh). It'll also discover the hosts to use for this job without the use of a hostfile -- it'll query LSF directly to see what hosts it should use.

> My second question is about the LSF detection. lsf.h is detected, but
> when lsb_launch is searched for in libbat.so, it fails because
> parse_time and parse_time_ex are not found. Is there a way to add
> additional lsf libraries so that the search can be done?

Can you send all the data shown here:

http://www.open-mpi.org/community/help/

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] LSF launch with OpenMPI
On Tue, 2009-05-05 at 12:10 +0200, Matthieu Brucher wrote:
> Hello,
>
> I have two questions, in fact.
>
> The first is what the support of LSF by OpenMPI means. When mpirun is
> executed, is it an LSF job that is actually run? Or what does it
> imply? I've tried to search on the openmpi website as well as on the
> internet, but I couldn't find a clear answer/use case.

Hi Matthieu

I think it's fair to say that if "batch system XYZ" is supported, then in a job script submitted to that batch system you can issue an mpirun command without manually specifying numbers of processes, hostnames, launch protocols, etc. They're all picked up using the mechanisms of the batch system.

If LSF has any peculiarities, someone will point them out, I'm sure.

Configuring for LSF I can't help you with.

Ciao
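Editor's sketch of what "picked up from the batch system" can look like from inside a job. It assumes the job environment carries an `LSB_MCPU_HOSTS` variable of the form "host count host count ..."; this is only a way to inspect the allocation and is not a claim about how mpirun itself queries LSF. `show_allocation` and the fallback hosts `nodeA`/`nodeB` are made up for illustration.

```shell
#!/usr/bin/env bash
# Print each host and its slot count from an LSB_MCPU_HOSTS-style string.
# show_allocation is a hypothetical helper name.
show_allocation() {
    local spec=$1 total=0
    # Word splitting is intentional: the string is "host count host count ...".
    set -- $spec
    while [ "$#" -ge 2 ]; do
        echo "$1: $2 slot(s)"
        total=$((total + $2))
        shift 2
    done
    echo "total: ${total}"
}

# Outside a job the variable is unset, so fall back to a made-up allocation.
show_allocation "${LSB_MCPU_HOSTS:-nodeA 2 nodeB 2}"
```

The total is the number of processes a bare mpirun would be expected to start in such a job when no process count is given by hand.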
[OMPI users] LSF launch with OpenMPI
Hello,

I have two questions, in fact.

The first is what the support of LSF by OpenMPI means. When mpirun is executed, is it an LSF job that is actually run? Or what does it imply? I've tried to search on the openmpi website as well as on the internet, but I couldn't find a clear answer/use case.

My second question is about the LSF detection. lsf.h is detected, but when lsb_launch is searched for in libbat.so, it fails because parse_time and parse_time_ex are not found. Is there a way to add additional LSF libraries so that the search can be done?

Matthieu Brucher