Re: [OMPI users] Issue with Profiling Fortran code
I think this issue is now resolved, and thanks everybody for your help. I certainly learned a lot!

Anthony Chan wrote:

> > For the first case you describe, as Open MPI is now, the call
> > sequence from Fortran is
> >
> > mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank
> >
> > For the second case, as MPICH is now, it's
> >
> > mpi_comm_rank -> PMPI_Comm_rank
>
> AFAIK, all known/popular MPI implementations' Fortran binding layer is
> implemented with C MPI functions, including MPICH2 and Open MPI. If
> MPICH2's Fortran layer were implemented the way you said, typical
> profiling tools, including MPE, would fail to work with Fortran
> applications. E.g. check mpich2-xxx/src/binding/f77/sendf.c.

To answer this specific point, see for example the comment in src/binding/f77/comm_sizef.c:

/* This defines the routine that we call, which must be the PMPI version
   since we're renaming the Fortran entry as the pmpi version */

and the workings of the definition #ifndef MPICH_MPI_FROM_PMPI in MPICH. This is what makes MPICH's behaviour different from Open MPI's in this matter.

Regards,

Nick.
Nick Wright wrote (in the original report):

I am trying to use the PMPI interface with Open MPI to profile a Fortran program. I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.

The problem seems to be that if one intercepts the call to mpi_comm_rank_ (the Fortran hook) and then calls pmpi_comm_rank_, this then calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should. So if one wants to create a library that can profile C and Fortran codes at the same time, one ends up intercepting the MPI call twice, which is not desirable and not what should happen (and indeed does not happen in other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to avoid this issue that would be great!

Thanks

Nick.

pmpi_test.c (compile with: mpicc pmpi_test.c -c):

#include <stdio.h>
#include "mpi.h"

extern void pmpi_comm_rank_(MPI_Comm *comm, int *rank, int *info);

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
    printf("mpi_comm_rank call successfully intercepted\n");
    pmpi_comm_rank_(comm, rank, info);
}

int MPI_Comm_rank(MPI_Comm comm, int *rank) {
    printf("MPI_comm_rank call successfully intercepted\n");
    return PMPI_Comm_rank(comm, rank);
}

hello_mpi.f (compile with: mpif77 hello_mpi.f pmpi_test.o):

      program hello
      implicit none
      include 'mpif.h'
      integer ierr
      integer myid,nprocs
      character*24 fdate,host
      call MPI_Init( ierr )
      myid=0
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
      call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
      call getenv('HOST',host)
      write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
      call mpi_finalize(ierr)
      end
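For reference, the renaming pattern Nick cites from MPICH's Fortran bindings looks roughly like the sketch below. This is a simplified illustration, not MPICH's actual source; it only shows how the #ifndef MPICH_MPI_FROM_PMPI guard makes the Fortran wrapper call the PMPI entry directly:

#include "mpi.h"

/* Simplified sketch of the MPICH-style binding source: unless this file is
   being built as the profiling layer itself, the C name is remapped so the
   Fortran wrapper calls PMPI_Comm_size directly. */
#ifndef MPICH_MPI_FROM_PMPI
#define MPI_Comm_size PMPI_Comm_size
#endif

void mpi_comm_size_(MPI_Fint *comm, MPI_Fint *size, MPI_Fint *ierr)
{
    int c_size;
    /* after the remapping above, this is really a call to PMPI_Comm_size */
    *ierr = MPI_Comm_size(MPI_Comm_f2c(*comm), &c_size);
    *size = (MPI_Fint)c_size;
}

With this arrangement, a tool's interception of the C MPI_Comm_size never sees calls coming from the Fortran binding, which is exactly the "mpi_comm_rank -> PMPI_Comm_rank" sequence described above.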
Re: [OMPI users] Issue with Profiling Fortran code
Hi George,

----- "George Bosilca" wrote:

> On Dec 5, 2008, at 03:16 , Anthony Chan wrote:
>
> > void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> >     printf("mpi_comm_rank call successfully intercepted\n");
> >     *info = PMPI_Comm_rank(comm,rank);
> > }
>
> Unfortunately this example is not correct. The real Fortran prototype
> for the MPI_Comm_rank function is
>
> void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr);

Yes, you are right. I was being sloppy (it was late, so I just cut and pasted from Nick's code). The correct code should be

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *info) {
    int c_rank;
    printf("mpi_comm_rank call successfully intercepted\n");
    *info = PMPI_Comm_rank(MPI_Comm_f2c(*comm), &c_rank);
    *rank = (MPI_Fint)c_rank;
}

A.Chan

> As you might notice, there is no MPI_Comm (and believe me, for Open MPI
> MPI_Comm is different than MPI_Fint), and there is no guarantee that
> the C int is the same as the Fortran int (looks weird but true).
> Therefore, several conversions are required in order to be able to go
> from the Fortran layer into the C one.
>
> As a result, a tool should never cross the language boundary by
> itself. Instead it should call the PMPI function as provided by the
> MPI library. This doesn't really fix the issue that started this email
> thread, but at least it clarifies it a little bit.
>
>   george.
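George's point about conversions generalizes beyond communicators: datatypes, statuses, and integer kinds all need explicit translation at the language boundary. A sketch of what that looks like for a Fortran mpi_recv_ wrapper calling the C PMPI layer (single-underscore name mangling is an assumption here, as is the mapping of Fortran INTEGER to MPI_Fint):

#include "mpi.h"

void mpi_recv_(void *buf, MPI_Fint *count, MPI_Fint *datatype,
               MPI_Fint *source, MPI_Fint *tag, MPI_Fint *comm,
               MPI_Fint *status, MPI_Fint *ierr)
{
    MPI_Status c_status;
    /* every Fortran handle must be converted to its C counterpart */
    *ierr = PMPI_Recv(buf, (int)*count, MPI_Type_f2c(*datatype),
                      (int)*source, (int)*tag, MPI_Comm_f2c(*comm),
                      &c_status);
    /* and the status converted back on the way out */
    MPI_Status_c2f(&c_status, status);
}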
Re: [OMPI users] Issue with Profiling Fortran code
Hi Nick,

----- "Nick Wright" wrote:

> For the first case you describe, as Open MPI is now, the call sequence
> from Fortran is
>
> mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank
>
> For the second case, as MPICH is now, it's
>
> mpi_comm_rank -> PMPI_Comm_rank

AFAIK, all known/popular MPI implementations' Fortran binding layer is implemented with C MPI functions, including MPICH2 and Open MPI. If MPICH2's Fortran layer were implemented the way you said, typical profiling tools, including MPE, would fail to work with Fortran applications. E.g. check mpich2-xxx/src/binding/f77/sendf.c.

A.Chan

> So for the first case, if I have a pure Fortran/C++ code, I have to
> profile at the C interface.
>
> So is the patch now retracted?
>
> Nick.
>
> [snip]
Re: [OMPI users] Issue with Profiling Fortran code
Hi Nick,

----- "Nick Wright" wrote:

> Hi Antony
>
> That will work yes, but it's not portable to other MPIs that do
> implement the profiling layer correctly, unfortunately.

I guess I must have missed something here. What is not portable?

> I guess we will just need to detect that we are using Open MPI when
> our tool is configured and add some macros to deal with that
> accordingly. Is there an easy way to do this built into Open MPI?

MPE by default provides a Fortran-to-C wrapper library, so the user does not have to know about the MPI implementation's Fortran-to-C layer. An MPE user can specify the Fortran-to-C layer the implementation has during MPE configure. Since an MPI implementation's Fortran-to-C library does not change often, writing a configure test to check for libmpi_f77.*, libfmpich.*, or libfmpi.* should get you covered for most platforms.

A.Chan

> [snip]
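To make Anthony's -lmpi_f77 suggestion concrete: if the profiling library wraps only the C entry points, a minimal interposer could look like the sketch below. When a Fortran program is linked as he shows (mpif77 -o foo foo.f -lmpi_f77 -lYourProfClib), Fortran calls reach this wrapper through the Fortran-to-C layer, so one C interposer covers both languages. The library name libYourProfClib is his placeholder, not a real library:

#include <stdio.h>
#include "mpi.h"

/* One interposer for both languages: the Fortran-to-C wrapper library
   translates Fortran mpi_comm_rank calls into C MPI_Comm_rank calls,
   which land here. */
int MPI_Comm_rank(MPI_Comm comm, int *rank)
{
    int err = PMPI_Comm_rank(comm, rank);
    printf("MPI_Comm_rank intercepted (rank = %d)\n", *rank);
    return err;
}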
Re: [OMPI users] Issue with Profiling Fortran code
After spending a few hours pondering this problem, we came to the conclusion that the best approach is to keep what we had before (i.e. the original approach). This means I'll undo my patch in the trunk and not change the behavior in the next releases (1.3 and 1.2.9). This approach, while different from other MPI implementations, is as legal as possible from the MPI standard's point of view. Any suggestions on this topic, or about the inconsistent behavior between the MPI implementations, should be directed to the MPI Forum Tools group for further evaluation.

The main reason for this is being nice to tool developers. In the current incarnation, they can catch either the Fortran calls or the C calls. If they provide both, then they will have to figure out how to cope with the double calls (as your example highlights). Here is the behavior Open MPI will stick to:

Fortran MPI  -> C MPI
Fortran PMPI -> C MPI

  george.

PS: There was another possible approach, which could avoid the double calls while preserving tool-writer friendliness:

Fortran MPI  -> C MPI
Fortran PMPI -> C PMPI

Unfortunately, we would have to heavily modify all files in the Fortran interface layer in order to support this approach. We're too close to a major release to start such time-consuming work.

  george.

On Dec 5, 2008, at 13:27 , Nick Wright wrote:

> Brian
>
> Sorry, I picked the wrong word there. I guess this is more complicated
> than I thought it was.
>
> For the first case you describe, as Open MPI is now, the call sequence
> from Fortran is
>
> mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank
>
> For the second case, as MPICH is now, it's
>
> mpi_comm_rank -> PMPI_Comm_rank
>
> So for the first case, if I have a pure Fortran/C++ code, I have to
> profile at the C interface.
>
> So is the patch now retracted?
>
> Nick.
>
> [snip]
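For tools that do provide wrappers in both languages, one way to cope with the double calls under the layering George describes is a nesting flag, so the C wrapper can tell user-level calls from calls made through the tool's own Fortran wrapper. A hedged sketch, not any particular tool's implementation: single-underscore mangling is assumed, and the flag would need to be thread-local in a threaded tool:

#include "mpi.h"

/* Fortran PMPI entry provided by the MPI library; under the layering
   above it lands in the C MPI_Comm_rank below. */
extern void pmpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr);

/* Set while the Fortran wrapper is active, so the C wrapper underneath
   skips its own accounting and each user call is counted exactly once. */
static int in_fortran_wrapper = 0;

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr)
{
    in_fortran_wrapper = 1;
    /* record the Fortran-level event here */
    pmpi_comm_rank_(comm, rank, ierr);   /* reaches C MPI_Comm_rank below */
    in_fortran_wrapper = 0;
}

int MPI_Comm_rank(MPI_Comm comm, int *rank)
{
    if (!in_fortran_wrapper) {
        /* record the C-level event only for genuine C calls */
    }
    return PMPI_Comm_rank(comm, rank);
}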
Re: [OMPI users] Deadlock on large numbers of processors
The reason I'd like to disable these eager buffers is to help detect the deadlock better. I would not run with this for a normal run, but it would be useful for debugging. If the deadlock is indeed due to our code, then disabling any shared buffers or eager sends would make that deadlock reproducible. In addition, we might be able to lower the number of processors. Right now, determining which processor is deadlocked when we are using 8K cores and each processor has hundreds of messages sent out would be quite difficult.

Thanks for your suggestions,

Justin

Brock Palen wrote:

> Open MPI has different eager limits for all the network types; on your
> system run:
>
> ompi_info --param btl all
>
> and look for the eager_limits. You can set these values to 0 using the
> syntax I showed you before. That would disable eager messages.
>
> [snip]
Re: [OMPI users] Deadlock on large numbers of processors
Open MPI has different eager limits for all the network types; on your system run:

ompi_info --param btl all

and look for the eager_limits. You can set these values to 0 using the syntax I showed you before. That would disable eager messages. There might be a better way to disable eager messages; not sure why you would want to disable them, as they are there for performance. Maybe you would still see a deadlock if every message was below the threshold. I think there is a limit on the number of eager messages a receiving CPU will accept, but I'm not sure about that, and I still kind of doubt it.

Try tweaking your buffer sizes: make the openib btl eager limit the same as shared memory, and see if you get lock-ups between hosts and not just shared memory.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734) 936-1985

On Dec 5, 2008, at 2:10 PM, Justin wrote:

> Thank you for this info. I should add that our code tends to post a
> lot of sends prior to the other side posting receives. This causes a
> lot of unexpected messages to exist. Our code explicitly matches up
> all tags and processors (that is, we do not use MPI wildcards).
>
> [snip]
Re: [OMPI users] Issue with Profiling Fortran code
On Dec 5, 2008, at 03:16 , Anthony Chan wrote:

> void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
>     printf("mpi_comm_rank call successfully intercepted\n");
>     *info = PMPI_Comm_rank(comm,rank);
> }

Unfortunately this example is not correct. The real Fortran prototype for the MPI_Comm_rank function is

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr);

As you might notice, there is no MPI_Comm (and believe me, for Open MPI MPI_Comm is different than MPI_Fint), and there is no guarantee that the C int is the same as the Fortran int (looks weird but true). Therefore, several conversions are required in order to be able to go from the Fortran layer into the C one.

As a result, a tool should never cross the language boundary by itself. Instead it should call the PMPI function as provided by the MPI library. This doesn't really fix the issue that started this email thread, but at least it clarifies it a little bit.

  george.

[snip]
Re: [OMPI users] Deadlock on large numbers of processors
Thank you for this info. I should add that our code tends to post a lot of sends prior to the other side posting receives. This causes a lot of unexpected messages to exist. Our code explicitly matches up all tags and processors (that is, we do not use MPI wildcards). If we had a deadlock, I would think we would see it regardless of whether or not we cross the rendezvous threshold. I guess one way to test this would be to set this threshold to 0; if it then deadlocks, we would likely be able to track down the deadlock.

Are there any other parameters we can pass MPI that will turn off buffering?

Thanks,

Justin

Brock Palen wrote:

> Whenever this happens, we found the code to have a deadlock. Users
> never saw it until they crossed the eager->rendezvous threshold.
>
> Yes, you can disable shared memory with:
>
> mpirun --mca btl ^sm
>
> Or you can try increasing the eager limit.
>
> [snip]
Re: [OMPI users] Deadlock on large numbers of processors
Whenever this happens, we found the code to have a deadlock. Users never saw it until they crossed the eager->rendezvous threshold.

Yes, you can disable shared memory with:

mpirun --mca btl ^sm

Or you can try increasing the eager limit:

ompi_info --param btl sm
  MCA btl: parameter "btl_sm_eager_limit" (current value: "4096")

You can modify this limit at run time. I think (can't test it right now) it is just:

mpirun --mca btl_sm_eager_limit 40960

I think, in tweaking these values, you can also use environment variables in place of putting it all on the mpirun line:

export OMPI_MCA_btl_sm_eager_limit=40960

See: http://www.open-mpi.org/faq/?category=tuning

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734) 936-1985

On Dec 5, 2008, at 12:22 PM, Justin wrote:

> [snip]
>
> Does Open MPI have any known deadlocks that might be causing our
> deadlocks? If so, are there any workarounds? Also, how do we disable
> shared memory within Open MPI?
>
> [snip]
Re: [OMPI users] Deadlock on large numbers of processors
On Dec 5, 2008, at 12:22 PM, Justin wrote:

> Does Open MPI have any known deadlocks that might be causing our
> deadlocks?

Known deadlocks, no. We are assisting a customer, however, with a deadlock that occurs in IMB Alltoall (and some other IMB tests) when using 128 hosts and the MX BTL. We have not yet determined whether it is a problem with MX, the MX BTL, or something else.

Scott
Re: [OMPI users] Issue with Profiling Fortran code
Brian

Sorry, I picked the wrong word there. I guess this is more complicated than I thought it was.

For the first case you describe, as Open MPI is now, the call sequence from Fortran is

mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank

For the second case, as MPICH is now, it's

mpi_comm_rank -> PMPI_Comm_rank

So for the first case, if I have a pure Fortran/C++ code, I have to profile at the C interface.

So is the patch now retracted?

Nick.

[snip]
Re: [OMPI users] Processor/core selection/affinity for large shared memory systems
Terry Frankcombe wrote:

> Isn't it up to the OS scheduler what gets run where?

I was under the impression that the processor affinity API was designed to let the OS (at least Linux) know how a given task preferred to be bound in terms of the system topology.

--
V. Ram
v_r_...@fastmail.fm
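For reference, the Linux API in question is sched_setaffinity(2), which is what binding support is ultimately built on. A minimal illustrative sketch; the core number is an arbitrary example:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Pin the calling process to core 3 using the Linux affinity API. */
int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(3, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        perror("sched_setaffinity");
    else
        printf("pid %d pinned to core 3\n", (int)getpid());
    return 0;
}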
Re: [OMPI users] Issue with Profiling Fortran code
Nick -

I think you have an incorrect definition of "correctly" :). According to the MPI standard, an MPI implementation is free to either layer language bindings (and only allow profiling at the lowest layer) or not layer the language bindings (and require profiling libraries to intercept each language). The only requirement is that the implementation document what it has done.

Since everyone is pretty clear on what Open MPI has done, I don't think you can claim Open MPI is doing it "incorrectly". Different from MPICH is not necessarily incorrect. (BTW, LAM/MPI handles profiling the same way as Open MPI.)

Brian

On Fri, 5 Dec 2008, Nick Wright wrote:

> Hi Antony
>
> That will work yes, but it's not portable to other MPIs that do
> implement the profiling layer correctly, unfortunately.
>
> I guess we will just need to detect that we are using Open MPI when
> our tool is configured and add some macros to deal with that
> accordingly. Is there an easy way to do this built into Open MPI?
>
> Thanks
>
> Nick.
>
> [snip]
Re: [OMPI users] Processor/core selection/affinity for large shared memory systems
Ralph Castain wrote:

> Thanks - yes, that helps. Can you add --display-map to your cmd line?
> That will tell us what mpirun thinks it is doing.

The output from display map is below. Note that I've sanitized a few items, but nothing relevant to this:

[granite:29685] Map for job: 1  Generated by mapping mode: byslot
  Starting vpid: 0  Vpid range: 16  Num app_contexts: 1
  Data for app_context: index 0  app: /path/to/executable
    Num procs: 16
    Argv[0]: /path/to/executable
    Env[0]: OMPI_MCA_rmaps_base_display_map=1
    Env[1]: OMPI_MCA_orte_precondition_transports=e16b0004a956445e-0515b892592a4a02
    Env[2]: OMPI_MCA_rds=proxy
    Env[3]: OMPI_MCA_ras=proxy
    Env[4]: OMPI_MCA_rmaps=proxy
    Env[5]: OMPI_MCA_pls=proxy
    Env[6]: OMPI_MCA_rmgr=proxy
    Working dir: /home/user/case (user: 0)
    Num maps: 0
  Num elements in nodes list: 1
  Mapped node:
    Cell: 0  Nodename: granite  Launch id: -1  Username: NULL
    Daemon name: Data type: ORTE_PROCESS_NAME  Data Value: NULL
    Oversubscribed: True  Num elements in procs list: 16
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,0]   Proc Rank: 0   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,1]   Proc Rank: 1   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,2]   Proc Rank: 2   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,3]   Proc Rank: 3   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,4]   Proc Rank: 4   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,5]   Proc Rank: 5   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,6]   Proc Rank: 6   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,7]   Proc Rank: 7   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,8]   Proc Rank: 8   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,9]   Proc Rank: 9   Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,10]  Proc Rank: 10  Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,11]  Proc Rank: 11  Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,12]  Proc Rank: 12  Proc PID: 0  App_context index: 0
    Mapped proc: Proc Name: Data type: ORTE_PROCESS_NAME  Data Value: [0,1,13]  Proc Rank: 13
Re: [OMPI users] Issue with Profiling Fortran code
> I hope you are aware that *many* tools and applications actually
> profile the Fortran MPI layer by intercepting the C function calls.
> This allows them to not have to deal with f2c translation of MPI
> objects and not worry about the name-mangling issue. Would there be a
> way to have both options, e.g. as a configure flag? The current commit
> basically breaks all of these applications...

Edgar

I haven't seen the fix so I can't comment on that. Anyway, in general this can't be true. Such a profiling tool would *only* work with Open MPI if it were written that way today. I guess such a fix will break Open MPI-specific tools (are there any?).

For MPICH, for example, one must provide a hook into e.g. mpi_comm_rank_, as that calls PMPI_Comm_rank (as it should), and thus if one were only intercepting C calls one would not see any Fortran profiling information.

Nick.

George Bosilca wrote:

> Nick,
>
> Thanks for noticing this. It's unbelievable that nobody noticed it
> over the last 5 years. Anyway, I think we have a one-line fix for this
> problem. I'll test it asap, and then push it into the 1.3.
>
>   Thanks,
>     george.
>
> [snip]
Re: [OMPI users] Issue with Profiling Fortran code
On Dec 5, 2008, at 11:29 AM, Nick Wright wrote:

> I think we can just look at OPEN_MPI as you say and then
> OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION & OMPI_RELEASE_VERSION from
> mpi.h, and if the version is less than 1.2.9 implement a workaround as
> Antony suggested.
>
> It's not the most elegant solution but it will work, I think?

Ya, that should work.

--
Jeff Squyres
Cisco Systems
[OMPI users] Deadlock on large numbers of processors
Hi,

We are currently using Open MPI 1.3 on Ranger for large processor jobs (8K+). Our code appears to be occasionally deadlocking at random within point-to-point communication (see stack trace below). This code has been tested on many different MPI versions and, as far as we know, it does not contain a deadlock. However, in the past we have run into problems with shared memory optimizations within MPI causing deadlocks. We can usually avoid these by setting a few environment variables to either increase the size of shared memory buffers or disable shared memory optimizations altogether.

Does Open MPI have any known deadlocks that might be causing our deadlocks? If so, are there any workarounds? Also, how do we disable shared memory within Open MPI?

Here is an example of where processors are hanging:

#0  0x2b2df3522683 in mca_btl_sm_component_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
#1  0x2b2df2cb46bf in mca_bml_r2_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
#2  0x2b2df0032ea4 in opal_progress () from /opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
#3  0x2b2ded0d7622 in ompi_request_default_wait_some () from /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
#4  0x2b2ded109e34 in PMPI_Waitsome () from /opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0

Thanks,

Justin
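The posting pattern described elsewhere in this thread (many sends queued before the peer posts receives) can be made concrete with a small sketch. This is not Justin's code, just the generic pattern that stresses eager and unexpected-message resources; forcing rendezvous (eager limit 0) turns a real application deadlock in such code from "hidden by buffering" into "reproducibly stuck":

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    int rank, i;
    const int n = 1000;   /* arbitrary burst size */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* post a burst of sends before the peer posts any receives */
        int *bufs = malloc(n * sizeof(int));
        MPI_Request *reqs = malloc(n * sizeof(MPI_Request));
        for (i = 0; i < n; i++) {
            bufs[i] = i;
            MPI_Isend(&bufs[i], 1, MPI_INT, 1, i, MPI_COMM_WORLD, &reqs[i]);
        }
        MPI_Waitall(n, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
        free(bufs);
    } else if (rank == 1) {
        /* every incoming message is "unexpected" until matched here */
        int v;
        for (i = 0; i < n; i++)
            MPI_Recv(&v, 1, MPI_INT, 0, i, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}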
Re: [OMPI users] Issue with Profiling Fortran code
George, I hope you are aware that *many* tools and applications actually profile the Fortran MPI layer by intercepting the C function calls. This allows them to avoid dealing with the Fortran-to-C translation of MPI objects and to not worry about the name-mangling issue. Would there be a way to have both options, e.g. as a configure flag? The current commit basically breaks all of these applications... Thanks Edgar

George Bosilca wrote: Nick, Thanks for noticing this. It's unbelievable that nobody noticed it over the last 5 years. Anyway, I think we have a one-line fix for this problem. I'll test it asap, and then push it into the 1.3. Thanks, george. On Dec 5, 2008, at 10:14, Nick Wright wrote: [...]

-- Edgar Gabriel Assistant Professor Parallel Software Technologies Lab http://pstl.cs.uh.edu Department of Computer Science University of Houston Philip G. Hoffman Hall, Room 524, Houston, TX-77204, USA Tel: +1 (713) 743-3857 Fax: +1 (713) 743-3335
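To make Edgar's point concrete: a tool that wraps the Fortran symbols itself has to translate Fortran handles before it can call the C PMPI layer. A minimal sketch of what that bookkeeping looks like (the trailing-underscore name mangling and the MPI_Fint argument convention are assumptions about the compiler and binding in use):

#include <stdio.h>
#include "mpi.h"

/* Fortran entry point: all arguments arrive as pointers to MPI_Fint */
void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr)
{
    MPI_Comm c_comm = MPI_Comm_f2c(*comm);   /* Fortran handle -> C handle */
    int c_rank;
    *ierr = (MPI_Fint) PMPI_Comm_rank(c_comm, &c_rank);
    *rank = (MPI_Fint) c_rank;
    printf("fortran mpi_comm_rank intercepted, rank %d\n", c_rank);
}

Intercepting only at the C layer, where Open MPI's Fortran bindings funnel through, makes all of this translation unnecessary.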
Re: [OMPI users] Issue with Profiling Fortran code
I think we can just look at OPEN_MPI, as you say, and then at OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION & OMPI_RELEASE_VERSION from mpi.h, and if the version is less than 1.2.9 implement a workaround as Antony suggested. It's not the most elegant solution, but I think it will work. Nick. Jeff Squyres wrote: [...]
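A compile-time sketch of that check, assuming the version macros are visible through mpi.h as Nick describes (the workaround macro name is made up for illustration):

#include "mpi.h"

/* Open MPI defines OPEN_MPI and its version macros in mpi.h */
#if defined(OPEN_MPI) && \
    (OMPI_MAJOR_VERSION < 1 || \
     (OMPI_MAJOR_VERSION == 1 && \
      (OMPI_MINOR_VERSION < 2 || \
       (OMPI_MINOR_VERSION == 2 && OMPI_RELEASE_VERSION < 9))))
/* pmpi_* (Fortran) re-enters the MPI_* (C) hooks on these versions */
#define NEED_LAYERED_PMPI_WORKAROUND 1
#endif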
Re: [OMPI users] Issue with Profiling Fortran code
On Dec 5, 2008, at 10:55 AM, David Skinner wrote: FWIW, if that one-liner fix works (George and I just chatted about this on the phone), we can probably also push it into v1.2.9. great! thanks. It occurs to me that this is likely not going to be enough for you, though. :-\ Like it or not, there are still installed OMPIs out there that will show this old behavior. Do you need to know / adapt for those? If so, I can see two ways of you figuring it out: 1. At run time, do a simple call to (Fortran) MPI_INITIALIZED and see if you intercept it twice (both in Fortran and in C). 2. If that's not attractive, we can probably add a line into the ompi_info output that you can grep for when using OMPI (you can look for the OPEN_MPI macro from our mpi.h to know if it's Open MPI or not). Specifically, this line can be there for the "fixed" versions, and it simply won't be there for non-fixed versions. -- Jeff Squyres Cisco Systems
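A sketch of option 1, with the usual caveats: a single trailing underscore is assumed for the Fortran mangling, pmpi_initialized_ is assumed to be the Fortran PMPI entry point under that same mangling, and MPI_Fint stands in for the Fortran LOGICAL/INTEGER arguments:

#include "mpi.h"

static int intercept_count = 0;

/* C hook */
int MPI_Initialized(int *flag) {
    intercept_count++;
    return PMPI_Initialized(flag);
}

/* Fortran hook */
extern void pmpi_initialized_(MPI_Fint *flag, MPI_Fint *ierr);
void mpi_initialized_(MPI_Fint *flag, MPI_Fint *ierr) {
    intercept_count++;
    pmpi_initialized_(flag, ierr);
}

/* Called once from the tool's init code; MPI_INITIALIZED is legal pre-MPI_Init */
int fortran_layer_reenters_c(void) {
    MPI_Fint flag, ierr;
    intercept_count = 0;
    mpi_initialized_(&flag, &ierr);
    /* count of 2 means the old Open MPI behavior: pmpi_* re-entered the C MPI_* hook */
    return intercept_count == 2;
}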
Re: [OMPI users] Issue with Profiling Fortran code
FWIW, if that one-liner fix works (George and I just chatted about this on the phone), we can probably also push it into v1.2.9. On Dec 5, 2008, at 10:49 AM, George Bosilca wrote: [...] -- Jeff Squyres Cisco Systems
Re: [OMPI users] Hybrid program
Nifty -- good to know. Thanks for looking into this! Do any kernel-hacker types on this list know roughly in what version thread affinity was brought into the Linux kernel? FWIW: all the same concepts here (using pid==0) should also work for PLPA, so you can set via socket/core, etc. On Dec 5, 2008, at 10:45 AM, Edgar Gabriel wrote: ok, so I dug a little deeper, and have some good news. Let me start with a set of routines that we didn't even discuss yet, but which work for setting thread affinity, and then discuss libnuma and sched_setaffinity() again. --- On Linux systems, the pthread library has a set of routines to modify and determine thread-affinity-related information:

#define __USE_GNU
int pthread_setaffinity_np (pthread_t __th, size_t __cpusetsize, const cpu_set_t *__cpuset);
int pthread_getaffinity_np (pthread_t __th, size_t __cpusetsize, cpu_set_t *__cpuset);

These two routines can be used to modify the affinity of an existing thread. If you would like to modify the affinity of a thread *before* creating it, you can use a similar routine:

int pthread_attr_setaffinity_np (pthread_attr_t *__attr, size_t __cpusetsize, const cpu_set_t *__cpuset);

I tested the first two routines, and they did work for me. --- Now to libnuma vs. sched_setaffinity(): after digging a little deeper into the libnuma sources, I realized that one of the differences between what they do and what I did in my test cases was that libnuma uses the sched_setaffinity() calls with a pid of 0, instead of determining the pid using the getpid() function. According to the sched_setaffinity() manpages, a pid of zero means 'apply the new rules to the current process', but it does in fact mean 'to the current task/thread'. I wrote a set of tests where I used sched_setaffinity() with pid zero, and I was in fact able to modify the affinity of an individual thread using sched_setaffinity(). If you pass in the pid of the process, it will affect the affinity of all threads of that process. Bottom line: you can modify the affinity of a thread both with libnuma on a per-socket basis and with the sched_setaffinity() calls on a per-core basis. Alternatively, you can use the pthread_setaffinity_np() function to modify the affinity of a thread using a cpu_set_t, similar to sched_setaffinity(). Thanks Edgar Jeff Squyres wrote: Fair enough; let me know what you find. It would be good to understand exactly why you're seeing what you're seeing... On Dec 2, 2008, at 5:47 PM, Edgar Gabriel wrote: it's on OpenSuSE 11 with kernel 2.6.25.11. I don't know the libnuma library version, but I suspect that it's fairly new. I will try to investigate that a little more in the next days. I do think that they use sched_setaffinity() under the hood (because in one of my failed attempts, when I passed in the wrong argument, I actually got the same error message that I got earlier with sched_setaffinity), but they must do something additional underneath. Anyway, I just wanted to report the result, and that there is obviously a difference, even if I can't explain it in detail right now. Thanks Edgar Jeff Squyres wrote: On Dec 2, 2008, at 11:27 AM, Edgar Gabriel wrote: so I ran a couple of tests today and I cannot confirm your statement. I wrote a simple test code where a process first sets an affinity mask and then spawns a number of threads. The threads modify the affinity mask, and every thread (including the master thread) prints out its affinity mask at the end.
With sched_getaffinity() and sched_setaffinity() it was indeed the case that the master thread had the same affinity mask as the threads it spawned. This means that the modification of the affinity mask by a new thread did in fact affect the master thread. Executing the same code sequence using the libnuma calls, however, the master thread was not affected by the new affinity mask of the children. So clearly, libnuma must be doing something differently. What distro/version of Linux are you using, and what version of libnuma? Libnuma v2.0.x very definitely is just a wrapper around the syscall for sched_setaffinity(). I downloaded it from: ftp://oss.sgi.com/www/projects/libnuma/download -- Edgar Gabriel Assistant Professor Parallel Software Technologies Lab http://pstl.cs.uh.edu Department of Computer Science University of Houston Philip G. Hoffman Hall, Room 524, Houston, TX-77204, USA Tel: +1 (713) 743-3857 Fax: +1 (713) 743-3335
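A self-contained sketch of the two working approaches Edgar describes, pinning the calling thread to one core (core number 2 is arbitrary; compile with -pthread):

#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set);   /* allow core 2 only */

    /* pid 0: despite what the manpage implies, this binds the current
       task (i.e. the calling thread), not the whole process */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        perror("sched_setaffinity");

    /* equivalent, via the pthread interface, for any existing thread;
       pthread_* calls return the error number rather than setting errno */
    int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (rc != 0)
        fprintf(stderr, "pthread_setaffinity_np failed: %d\n", rc);

    return 0;
}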
Re: [OMPI users] Issue with Profiling Fortran code
Nick, Thanks for noticing this. It's unbelievable that nobody noticed it over the last 5 years. Anyway, I think we have a one-line fix for this problem. I'll test it asap, and then push it into the 1.3. Thanks, george. On Dec 5, 2008, at 10:14, Nick Wright wrote: [...]
Re: [OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()
On Dec 5, 2008, at 10:33 AM, Jens wrote: thanks a lot. This fixed a bug in my code. I already like open-mpi for this :) LOL! Glad to help. :-) FWIW, we're working on new Fortran bindings for MPI-3 that fix some of the shortcomings of the F90 bindings. -- Jeff Squyres Cisco Systems
Re: [OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()
Hi Jeff, thanks a lot. This fixed a bug in my code. I already like open-mpi for this :) Greetings Jens Jeff Squyres wrote: [...]
Re: [OMPI users] Issue with Profiling Fortran code
Hi Antony That will work, yes, but unfortunately it's not portable to other MPIs that do implement the profiling layer correctly. I guess we will just need to detect that we are using Open MPI when our tool is configured and add some macros to deal with that accordingly. Is there an easy way to do this built into Open MPI? Thanks Nick. Anthony Chan wrote: [...]
Re: [OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()
These functions do exist in Open MPI, but your code is not quite correct. Here's a new version that is correct:

-
program main
use mpi
implicit none
integer :: ierr, rank, size
integer :: mpi1_val
integer(kind = MPI_ADDRESS_KIND) :: mpi2_val
logical :: attr_flag

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)

call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, mpi2_val, attr_flag, ierr)
call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, mpi1_val, attr_flag, ierr)

print *, "Hello, world, I am ", rank, " of ", size
call MPI_FINALIZE(ierr)
end
-

Note three things:

1. attr_flag is supposed to be of type logical, not integer
2. In MPI-1 (MPI_ATTR_GET), the type of the value is integer
3. In MPI-2 (MPI_COMM_GET_ATTR), the type of the value is integer(kind=MPI_ADDRESS_KIND)

F90 is strongly typed, so the F90 compiler is correct in claiming that functions of the signature you specified were not found. Make sense? I'm not sure why your original code works with MPICH2 -- perhaps they don't have F90 bindings for these functions, and therefore they're falling through to the F77 bindings (where no type checking is done)...? If so, you're getting lucky that it works; perhaps sizeof(INTEGER) == sizeof(LOGICAL), and sizeof(INTEGER) == sizeof(INTEGER(KIND=MPI_ADDRESS_KIND)). That's a guess. On Dec 5, 2008, at 4:49 AM, Jens wrote: [...] -- Jeff Squyres Cisco Systems
[OMPI users] MCA parameter
Thank you for your response; here are the details of my problem: I have installed pwscf and then tried to run SCF calculations, but before getting the output I received this warning message:

WARNING: There are more than one active ports on host 'stallo-2.local', but the default subnet GID prefix was detected on more than one of these ports. If these ports are connected to different physical IB networks, this configuration will fail in Open MPI. This version of Open MPI requires that every physically separate IB subnet that is used between connected MPI processes must have different subnet ID values. Please see this FAQ entry for more details: http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

NOTE: You can turn off this warning by setting the MCA parameter btl_openib_warn_default_gid_prefix to 0.

So the question is: how can I turn off this warning, i.e. how can I change the MCA parameter from 1 to 0? Which command do I have to use? I have tried with the link above, but it doesn't work; perhaps I'm not using the right command. Thanks,
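In case a concrete example helps, these are the usual ways to set an MCA parameter in Open MPI, shown here for btl_openib_warn_default_gid_prefix (pw.x and the process count are placeholders for your own run; any one of the three should silence the warning):

    # 1. on the mpirun command line
    mpirun --mca btl_openib_warn_default_gid_prefix 0 -np 16 ./pw.x

    # 2. via the environment, before launching
    export OMPI_MCA_btl_openib_warn_default_gid_prefix=0

    # 3. persistently, in your per-user MCA parameter file
    echo "btl_openib_warn_default_gid_prefix = 0" >> $HOME/.openmpi/mca-params.conf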
[OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()
Hi, I just switched from MPICH2 to openmpi because of sge-support, but I am missing some mpi-functions for fortran 90. Does anyone know why MPI_COMM_GET_ATTR() and MPI_ATTR_GET() are not available? They work fine with MPICH2. I compiled openmpi 1.2.8/1.3rc on a clean CentOS 5.2 with GNU-compilers and Intel 11.0. Both give me the same error:

GNU:
Error: There is no specific subroutine for the generic 'mpi_attr_get' at (1)

Intel 11.0:
hello_f90.f90(22): error #6285: There is no matching specific subroutine for this generic subroutine call. [MPI_ATTR_GET]
call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)

Any ideas ...? Greetings Jens

program main
use mpi
implicit none
integer :: ierr, rank, size
integer :: attr_val, attr_flag

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)

call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)
call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)

print *, "Hello, world, I am ", rank, " of ", size
call MPI_FINALIZE(ierr)
end
---
Re: [OMPI users] Issue with Profiling Fortran code
Hope I didn't misunderstand your question. If you implement your profiling library in C, where you do your real instrumentation, you don't need to implement the fortran layer; you can simply link with the Fortran-to-C MPI wrapper library -lmpi_f77, i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib

where libYourProfClib.a is your profiling tool written in C. If you don't want to intercept the MPI call twice for a fortran program, you need to implement the fortran layer. In that case, I would think you can just call the C version of PMPI_xxx directly from your fortran layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = PMPI_Comm_rank(comm,rank);
}

A.Chan

- "Nick Wright" wrote:

> Hi
>
> I am trying to use the PMPI interface with OPENMPI to profile a fortran
> program.
>
> I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.
>
> The problem seems to be that if one e.g. intercepts the call to
> mpi_comm_rank_ (the fortran hook) and then calls pmpi_comm_rank_, this
> then calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should.
>
> So if one wants to create a library that can profile C and Fortran codes
> at the same time, one ends up intercepting the mpi call twice. Which is
> not desirable and not what should happen (and indeed doesn't happen in
> other MPI implementations).
>
> A simple example to illustrate is below. If somebody knows of a fix to
> avoid this issue that would be great!
>
> Thanks
>
> Nick.
>
> pmpi_test.c: mpicc pmpi_test.c -c
>
> #include <stdio.h>
> #include "mpi.h"
> void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
>    printf("mpi_comm_rank call successfully intercepted\n");
>    pmpi_comm_rank_(comm,rank,info);
> }
> int MPI_Comm_rank(MPI_Comm comm, int *rank) {
>    printf("MPI_comm_rank call successfully intercepted\n");
>    return PMPI_Comm_rank(comm,rank);
> }
>
> hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o
>
>    program hello
>    implicit none
>    include 'mpif.h'
>    integer ierr
>    integer myid,nprocs
>    character*24 fdate,host
>    call MPI_Init( ierr )
>    myid=0
>    call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
>    call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
>    call getenv('HOST',host)
>    write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
>    call mpi_finalize(ierr)
>    end