Hi Reuti,

I found this link: http://mickey.ifp.illinois.edu/speechWiki/index.php?title=Simple_UserGuide_for_MPICH2&oldid=4403
It seems (I didn't test it yet) that it uses an MPICH2 version prior to 1.4 (without the tight integration) and OGE with $fill_up.

Another question: is there any way to launch mpd with $fill_up instead of $round_robin?

All the best,
Sergio

On Sun, Jul 28, 2013 at 2:07 PM, Reuti <[email protected]> wrote:
> On 28.07.2013 at 18:10, Sergio Mafra wrote:
>
> > Ok,
> >
> > In fact, the app was compiled with MPICH2 version 1.08, and since it's
> > distributed by a research center, we don't have access to compilation.
>
> This is even worse, as there is no guarantee that an application built
> with 1.08 will work with 1.4 at all. MPI defines an application programming
> interface (API) but not an application binary interface (ABI), hence the
> MPI library can't be changed in type or version in general *). The one
> used for compilation should be the same one used later on during execution.
>
> At the time of MPICH2 1.08, the default startup of slave tasks required an
> mpd ring to be already running before the job was started, and most
> likely this is statically linked into the application. This may result in
> running n serial instances of the application instead of one parallel
> application. I don't recall exactly when Hydra appeared in MPICH2.
>
> The best would be to ask them to recompile the application with a recent
> version of an MPI library, or for you to go back to MPICH2 1.08 and set up
> SGE for it:
> http://arc.liv.ac.uk/SGE/howto/mpich2-integration/mpich2-integration.html
> But I can't provide any support for this old version or setup.
>
> -- Reuti
>
> *) There are exceptions, e.g. in Open MPI the ABI stays the same between
> an odd (feature) release and the following even (stable) release of the
> library.
>
> > I'll try to upgrade the version of mpich2 and see what happens.
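A note on the $fill_up question above: the allocation rule is a property of the SGE parallel environment, not of mpd itself, so the same PE definition serves either startup method. As a minimal sketch (not tested here), assuming a PE named "orte" as used elsewhere in this thread, the relevant fields shown by "qconf -sp orte" (editable with "qconf -mp orte", field names as in sge_pe(5)) would look like:

    pe_name            orte
    slots              9999
    allocation_rule    $fill_up
    control_slaves     TRUE
    job_is_first_task  FALSE

With $fill_up the scheduler packs the granted slots onto as few hosts as possible, whereas $round_robin hands out one slot per host in turn. In the old mpd-based integration howtos the ring is started from the PE's start_proc_args script, so it simply inherits whatever allocation rule the PE defines.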
> > Thanks again,
> >
> > Sergio
> >
> > On Sunday, July 28, 2013, Reuti wrote:
> > > On 27.07.2013 at 22:29, Sergio Mafra wrote:
> > >
> > > > Hi Reuti,
> > > >
> > > > It seems that the previous tests were wrong.
> > > > I realize that your doubts were justified: only one slot was busy
> > > > despite all 16 being deployed.
> > > >
> > > > I'd change the job launcher to:
> > > >
> > > > $ qsub -N $nameofthecase -b y -pe orte 20 -cwd mpiexec -np 20 newave170502_L
> > >
> > > Aha, the "-np 20" option shouldn't be necessary at all. Maybe it was a
> > > bug in MPICH2 1.4 at that time not to detect the granted slots.
> > >
> > > - Was the MPICH2 1.4 version also used to compile the application?
> > >
> > > - As 1.4 is somewhat old, I suggest updating at least to 1.4.1p1:
> > >
> > > http://www.mpich.org/static/downloads/1.4.1p1/
> > >
> > > You can compile it to be installed into ~/local/mpich2-1.4.1p1 or the
> > > like and then use this version for both compilation and execution.
> > >
> > > You could also try the latest http://www.mpich.org/ or even
> > > http://www.open-mpi.org
> > >
> > > -- Reuti
> > >
> > > > Note that (for some reason) it's mandatory to tell both the PE and
> > > > MPI that there are 20 slots to use.
> > > >
> > > > Doing that, here is the output for a job with 20 slots under
> > > > $round_robin, launched as
> > > >
> > > > $ qsub -N $nameofthecase -b y -pe orte 20 -cwd mpiexec -np 20 newave170502_L
> > > >
> > > > $ ps -e f --cols=500
> > > > 2390 ?  Sl  0:00 /opt/sge6/bin/linux-x64/sge_execd
> > > > 2835 ?  S   0:00  \_ sge_shepherd-1 -bg
> > > > 2837 ?  Ss  0:00      \_ mpiexec -np 20 newave170502_L
> > > > 2838 ?  S   0:00          \_ /usr/bin/hydra_pmi_proxy --control-port master:46220 --demux poll --pgid 0 --retries 10 --proxy-id 0
> > > > 2840 ?  R   1:18          |   \_ newave170502_L
> > > > 2841 ?  S   0:54          |   \_ newave170502_L
> > > > 2842 ?  S   1:07          |   \_ newave170502_L
> > > > 2843 ?  S   0:52          |   \_ newave170502_L
> > > > 2844 ?  S   1:07          |   \_ newave170502_L
> > > > 2845 ?  S   1:08          |   \_ newave170502_L
> > > > 2846 ?  S   0:00          |   \_ newave170502_L
> > > > 2847 ?  S   0:00          |   \_ newave170502_L
> > > > 2848 ?  S   0:00          |   \_ newave170502_L
> > > > 2849 ?  S   0:00          |   \_ newave170502_L
> > > > 2839 ?  Sl  0:00          \_ /opt/sge6/bin/linux-x64/qrsh -inherit -V node001 "/usr/bin/hydra_pmi_proxy" --control-port master:46220 --demux poll --pgid 0 --retries 10 --proxy-id 1
> > > >
> > > > $ mpiexec --version
> > > > HYDRA build details:
> > > >     Version:           1.4
> > > >     Release Date:      Thu Jun 16 16:41:08 CDT 2011
> > > >     CC:                gcc -I/build/buildd/mpich2-1.4/src/mpl/include -I/build/buildd/mpich2-1.4/src/mpl/include -I/build/buildd/mpich2-1.4/src/openpa/src -I/build/buildd/mpich2-1.4/src/openpa/src -I/build/buildd/mpich2-1.4/src/mpid/ch3/include -I/build/buildd/mpich2-1.4/src/mpid/ch3/include -I/build/buildd/mpich2-1.4/src/mpid/common/datatype -I/build/buildd/mpich2-1.4/src/mpid/common/datatype -I/build/buildd/mpich2-1.4/src/mpid/common/locks -I/build/buildd/mpich2-1.4/src/mpid/common/locks -I/build/buildd/mpich2-1.4/src/mpid/ch3/channels/nemesis/include -I/build/buildd/mpich2-1.4/src/mpid/ch3/channels/nemesis/include -I/build/buildd/mpich2-1.4/src/mpid/ch3/channels/nemesis/nemesis/include -I/build/buildd/mpich2-1.4/src/mpid/ch3/channels/nemesis/nemesis/include -I/build/buildd/mpich2-1.4/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/build/buildd/mpich2-1.4/src/mpid/ch3/channels/nemesis/nemesis/utils/monitor -I/build/buildd/mpich2-1.4/src/util/wrappers -I/build/buildd/mpich2-1.4/src/util/wrappers -g -O2 -g -O2 -Wall -O2 -Wl,-Bsymbolic-functions -lrt -lcr -lpthread
> > > >     CXX:
> > > >     F77:
> > > >     F90:               gfortran -Wl,-Bsymbolic-functions -lrt -lcr -lpthread
> > > >     Configure options: '--build=x86_64-linux-gnu' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--libexecdir=${prefix}/lib/mpich2' '--srcdir=.' '--disable-maintainer-mode' '--disable-dependency-tracking' '-
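Putting Reuti's suggestions together, a submission script (instead of "qsub -b y") keeps the call short once a tightly integrated MPICH2 (e.g. 1.4.1p1 with Hydra) is in place. A minimal sketch, not tested here; the job name "mycase" is a stand-in, and the PE name "orte" and binary name are taken from the thread:

    #!/bin/sh
    #$ -N mycase
    #$ -pe orte 20
    #$ -cwd
    # With tight integration, Hydra starts one hydra_pmi_proxy per remote
    # host via "qrsh -inherit" (as visible in the ps output above), and
    # according to Reuti the granted slot count should be picked up
    # automatically, so no explicit -np should be needed:
    mpiexec ./newave170502_L

Submitted simply as "qsub mycase.sh"; if mpiexec then starts only one rank, that would match the MPICH2 1.4 slot-detection bug Reuti suspects, and adding -np $NSLOTS is the workaround.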
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
