Re: [OMPI users] MPI_Allreduce on local machine
Hi Jeff

Thank you for opening a ticket and taking care of this.

Jeff Squyres wrote:
> On Jul 28, 2010, at 5:07 PM, Gus Correa wrote:
>> Still, the alignment under Intel may or may not be right.
>> And this may or may not explain the errors that Hugo has got.
>>
>> FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
>> report exactly the same as OpenMPI 1.4.2, namely
>> Fort dbl prec size: 4 and
>> Fort dbl prec align: 4,
>> except that *if the Intel Fortran compiler (ifort) was used*
>> I get 1 byte alignment:
>> Fort dbl prec align: 1
>>
>> So, this issue has been around for a while,
>> and involves both the size and the alignment (in Intel)
>> of double precision.
>
> Yes, it's quite problematic to try to determine the alignment of
> Fortran types -- compilers can do different things and there's no
> reliable way (that I know of, at least) to absolutely get the
> "native" alignment.

I can imagine this is not easy, especially with the large variety of
architectures, compilers, and environments that OpenMPI handles.

> That being said, we didn't previously find any correctness issues
> with using an alignment of 1.

Does it affect only the information provided by ompi_info, as Martin
Siegert suggested?
Or does it really affect the actual alignment of MPI types when OpenMPI
is compiled with Intel, as Martin, Ake Sandgren, Hugo Gagnon, and I
thought it might?

>> We have a number of pieces of code here where grep shows
>> MPI_DOUBLE_PRECISION.
>> Not sure how much of it has actually been active, as there are always
>> lots of cpp directives to select active code.
>>
>> In particular I found this interesting snippet:
>>
>>    if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
>>       ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
>>       ! and LAM MPI's allreduce does not accept MPI_REAL8
>>       MPIreal_t = MPI_DOUBLE_PRECISION
>>    else
>>       MPIreal_t = MPI_REAL8
>>    endif
>
> This kind of thing shouldn't be an issue with Open MPI, right?

Yes, you are right.

Actually, I checked (and wrote in my posting) that in OpenMPI
MPI_DOUBLE_PRECISION = 17, hence the code above boils down to defining
everything as MPI_REAL8 instead (the "else" branch), hence
MPI_DOUBLE_PRECISION is never actually used *in this source file*.

BTW, I didn't write this code or the comments.
The source file is part of CCSM4/CAM4, a widely used public domain
climate/atmosphere model:
http://www.cesm.ucar.edu/models/ccsm4.0/

This particular source file (parallel_mod.F90, circa line 169) wasn't
used in previous incarnations of these programs (CAM3/CCSM3), which we
ran extensively here using OpenMPI.
In the old CAM3/CCSM3, most (perhaps all) of the 8-byte floating-point
data are declared as real*8 or with the "kind" attribute, not as double
precision.
However, not only this source file but many other source files in the
new CCSM4/CAM4 declare 8-byte floating-point data as double precision
and use MPI_DOUBLE_PRECISION in MPI function calls.
This style is a bit outdated, since Fortran 90 prefers
real(kind(0.d0)) over "double precision", as Hugo did in his example.

My concern is that we just started experimenting with CAM4/CCSM4, and
the plan was to use OpenMPI libraries compiled with Intel.

> FWIW, OMPI uses different numbers for MPI_DOUBLE_PRECISION and
> MPI_REAL8 than LAM.  They're distinct MPI datatypes because they
> *could* be different.

Yes, I understand the two different MPI_ constants should be kept,
although the actual values of their size and alignment may be the same
on specific architectures (e.g. x86_64).

Many thanks,
Gus Correa
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 28, 2010, at 5:07 PM, Gus Correa wrote:

> Still, the alignment under Intel may or may not be right.
> And this may or may not explain the errors that Hugo has got.
>
> FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
> report exactly the same as OpenMPI 1.4.2, namely
> Fort dbl prec size: 4 and
> Fort dbl prec align: 4,
> except that *if the Intel Fortran compiler (ifort) was used*
> I get 1 byte alignment:
> Fort dbl prec align: 1
>
> So, this issue has been around for a while,
> and involves both the size and the alignment (in Intel)
> of double precision.

Yes, it's quite problematic to try to determine the alignment of Fortran
types -- compilers can do different things and there's no reliable way
(that I know of, at least) to absolutely get the "native" alignment.

That being said, we didn't previously find any correctness issues with
using an alignment of 1.

> We have a number of pieces of code here where grep shows
> MPI_DOUBLE_PRECISION.
> Not sure how much of it has actually been active, as there are always
> lots of cpp directives to select active code.
>
> In particular I found this interesting snippet:
>
>    if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
>       ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
>       ! and LAM MPI's allreduce does not accept MPI_REAL8
>       MPIreal_t = MPI_DOUBLE_PRECISION
>    else
>       MPIreal_t = MPI_REAL8
>    endif

This kind of thing shouldn't be an issue with Open MPI, right?

FWIW, OMPI uses different numbers for MPI_DOUBLE_PRECISION and MPI_REAL8
than LAM.  They're distinct MPI datatypes because they *could* be
different.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 28, 2010, at 12:21 PM, Åke Sandgren wrote:

>> Jeff: Is this correct?
>
> This is wrong; it should be 8, and alignment should be 8 even for
> Intel.
> And I also see exactly the same thing.

Good catch!  I just fixed this in
https://svn.open-mpi.org/trac/ompi/changeset/23580 -- it looks like a
copy-n-paste error in displaying the Fortran sizes/alignments in
ompi_info.  It probably happened when ompi_info was converted from C++
to C.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] MPI_Allreduce on local machine
I also get 8 from "call MPI_Type_size(MPI_DOUBLE_PRECISION, size,
mpierr)", but really I don't think this is the issue anymore.
I mean, I checked on my school cluster, where OpenMPI has also been
compiled with the intel64 compilers, and "Fort dbl prec size:" also
returns 4, but unlike on my Mac the code runs fine there.
I am just saying that we should stop worrying about the ompi_info
output and wait until Jeff Squyres analyses my build output files that
I sent to the list earlier.
I might be wrong too, as I have no idea what's going on.

-- 
Hugo Gagnon

On Wed, 28 Jul 2010 17:07 -0400, "Gus Correa" wrote:
> Hi All
>
> Martin Siegert wrote:
>> On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:
>>> On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
>>>> Hugo Gagnon wrote:
>>>>> Hi Gus,
>>>>> Ompi_info --all lists its info regarding fortran right after C.
>>>>> In my case:
>>>>> Fort real size: 4
>>>>> Fort real4 size: 4
>>>>> Fort real8 size: 8
>>>>> Fort real16 size: 16
>>>>> Fort dbl prec size: 4
>>>>> Does it make any sense to you?
>>>>
>>>> Hi Hugo
>>>>
>>>> No, dbl prec size 4 sounds weird, should be 8, I suppose,
>>>> same as real8, right?
>>>>
>>>> It doesn't make sense, but that's what I have (now that you told me
>>>> that "dbl", not "double", is the string to search for):
>>>>
>>>> $ Fort dbl prec size: 4
>>>> Fort dbl cplx size: 4
>>>> Fort dbl prec align: 4
>>>> Fort dbl cplx align: 4
>>>>
>>>> Is this a bug in OpenMPI perhaps?
>>>>
>>>> I didn't come across this problem, most likely because the codes
>>>> here don't use "double precision" but real*8 or similar.
>>>>
>>>> Also make sure you are picking the right ompi_info, mpif90/f77,
>>>> mpiexec.
>>>> Often times old versions and tangled PATH make things very
>>>> confusing.
>>>
>>> This is indeed worrisome as I confirm the findings on our clusters
>>> both with ompi 1.3.3 and 1.4.1:
>>>
>>> ompi_info --all | grep -i fort
>>> ...
>>> Fort real size: 4
>>> Fort real4 size: 4
>>> Fort real8 size: 8
>>> Fort real16 size: -1
>>> Fort dbl prec size: 4
>>> Fort cplx size: 4
>>> Fort dbl cplx size: 4
>>> Fort cplx8 size: 8
>>> Fort cplx16 size: 16
>>> Fort cplx32 size: -1
>>> Fort integer align: 4
>>> Fort integer1 align: 1
>>> Fort integer2 align: 2
>>> Fort integer4 align: 4
>>> Fort integer8 align: 8
>>> Fort integer16 align: -1
>>> Fort real align: 4
>>> Fort real4 align: 4
>>> Fort real8 align: 8
>>> Fort real16 align: -1
>>> Fort dbl prec align: 4
>>> Fort cplx align: 4
>>> Fort dbl cplx align: 4
>>> Fort cplx8 align: 4
>>> Fort cplx16 align: 8
>>> ...
>>>
>>> And this is the configure output:
>>> checking if Fortran 77 compiler supports REAL*8... yes
>>> checking size of Fortran 77 REAL*8... 8
>>> checking for C type corresponding to REAL*8... double
>>> checking alignment of Fortran REAL*8... 1
>>> ...
>>> checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
>>> checking size of Fortran 77 DOUBLE PRECISION... 8
>>> checking for C type corresponding to DOUBLE PRECISION... double
>>> checking alignment of Fortran DOUBLE PRECISION... 1
>>>
>>> But the following code actually appears to give the correct results:
>>>
>>> program types
>>>   use mpi
>>>   implicit none
>>>   integer :: mpierr, size
>>>
>>>   call MPI_Init(mpierr)
>>>   call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
>>>   print*, 'double precision size: ', size
>>>   call MPI_Finalize(mpierr)
>>> end
>>>
>>> mpif90 -g types.f90
>>> mpiexec -n 1 ./a.out
>>>  double precision size:            8
>>>
>>> Thus is this a bug in ompi_info only?
>>
>> answering my own question:
>> This does not look right:
>>
>> ompi/tools/ompi_info/param.cc:
>>
>>     out("Fort dbl prec size",
>>         "compiler:fortran:sizeof:double_precision",
>>         OMPI_SIZEOF_FORTRAN_REAL);
>>
>> that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.
>>
>> - Martin
>
> Hopefully Martin got it and the issue is restricted to ompi_info.
> Thanks, Martin, for writing and running the little diagnostic code,
> and for checking the ompi_info guts!
>
> Still, the alignment under Intel may or may not be right.
> And this may or may not explain the errors that Hugo has got.
>
> FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
> report exactly the same as OpenMPI 1.4.2, namely
> Fort dbl prec size: 4 and
> Fort dbl prec align: 4,
> except that *if the Intel Fortran compiler (ifort) was used*
> I get 1 byte alignment:
> Fort dbl prec align: 1
>
> So, this issue has been around for a while,
> and involves both the size and the alignment (in Intel)
> of double precision.
Re: [OMPI users] MPI_Allreduce on local machine
Hi All

Martin Siegert wrote:
> On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:
>> On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
>>> Hugo Gagnon wrote:
>>>> Hi Gus,
>>>> Ompi_info --all lists its info regarding fortran right after C.
>>>> In my case:
>>>> Fort real size: 4
>>>> Fort real4 size: 4
>>>> Fort real8 size: 8
>>>> Fort real16 size: 16
>>>> Fort dbl prec size: 4
>>>> Does it make any sense to you?
>>>
>>> Hi Hugo
>>>
>>> No, dbl prec size 4 sounds weird, should be 8, I suppose,
>>> same as real8, right?
>>>
>>> It doesn't make sense, but that's what I have (now that you told me
>>> that "dbl", not "double", is the string to search for):
>>>
>>> $ Fort dbl prec size: 4
>>> Fort dbl cplx size: 4
>>> Fort dbl prec align: 4
>>> Fort dbl cplx align: 4
>>>
>>> Is this a bug in OpenMPI perhaps?
>>>
>>> I didn't come across this problem, most likely because the codes
>>> here don't use "double precision" but real*8 or similar.
>>>
>>> Also make sure you are picking the right ompi_info, mpif90/f77,
>>> mpiexec.
>>> Often times old versions and tangled PATH make things very confusing.
>>
>> This is indeed worrisome as I confirm the findings on our clusters
>> both with ompi 1.3.3 and 1.4.1:
>>
>> ompi_info --all | grep -i fort
>> ...
>> Fort real size: 4
>> Fort real4 size: 4
>> Fort real8 size: 8
>> Fort real16 size: -1
>> Fort dbl prec size: 4
>> Fort cplx size: 4
>> Fort dbl cplx size: 4
>> Fort cplx8 size: 8
>> Fort cplx16 size: 16
>> Fort cplx32 size: -1
>> Fort integer align: 4
>> Fort integer1 align: 1
>> Fort integer2 align: 2
>> Fort integer4 align: 4
>> Fort integer8 align: 8
>> Fort integer16 align: -1
>> Fort real align: 4
>> Fort real4 align: 4
>> Fort real8 align: 8
>> Fort real16 align: -1
>> Fort dbl prec align: 4
>> Fort cplx align: 4
>> Fort dbl cplx align: 4
>> Fort cplx8 align: 4
>> Fort cplx16 align: 8
>> ...
>>
>> And this is the configure output:
>> checking if Fortran 77 compiler supports REAL*8... yes
>> checking size of Fortran 77 REAL*8... 8
>> checking for C type corresponding to REAL*8... double
>> checking alignment of Fortran REAL*8... 1
>> ...
>> checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
>> checking size of Fortran 77 DOUBLE PRECISION... 8
>> checking for C type corresponding to DOUBLE PRECISION... double
>> checking alignment of Fortran DOUBLE PRECISION... 1
>>
>> But the following code actually appears to give the correct results:
>>
>> program types
>>   use mpi
>>   implicit none
>>   integer :: mpierr, size
>>
>>   call MPI_Init(mpierr)
>>   call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
>>   print*, 'double precision size: ', size
>>   call MPI_Finalize(mpierr)
>> end
>>
>> mpif90 -g types.f90
>> mpiexec -n 1 ./a.out
>>  double precision size:            8
>>
>> Thus is this a bug in ompi_info only?
>
> answering my own question:
> This does not look right:
>
> ompi/tools/ompi_info/param.cc:
>
>     out("Fort dbl prec size",
>         "compiler:fortran:sizeof:double_precision",
>         OMPI_SIZEOF_FORTRAN_REAL);
>
> that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.
>
> - Martin

Hopefully Martin got it and the issue is restricted to ompi_info.
Thanks, Martin, for writing and running the little diagnostic code,
and for checking the ompi_info guts!

Still, the alignment under Intel may or may not be right.
And this may or may not explain the errors that Hugo has got.

FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
report exactly the same as OpenMPI 1.4.2, namely
Fort dbl prec size: 4 and
Fort dbl prec align: 4,
except that *if the Intel Fortran compiler (ifort) was used*
I get 1 byte alignment:
Fort dbl prec align: 1

So, this issue has been around for a while,
and involves both the size and the alignment (in Intel)
of double precision.

We have a number of pieces of code here where grep shows
MPI_DOUBLE_PRECISION.
Not sure how much of it has actually been active, as there are always
lots of cpp directives to select active code.

In particular I found this interesting snippet:

   if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
      ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
      ! and LAM MPI's allreduce does not accept MPI_REAL8
      MPIreal_t = MPI_DOUBLE_PRECISION
   else
      MPIreal_t = MPI_REAL8
   endif

where eventually MPIreal_t is what is used as the MPI type in some MPI
calls, particularly in MPI_Allreduce, which is the one that triggered
all this discussion (see this thread's Subject line) when Hugo first
asked his original question.

Hopefully the branch taken in the snippet above worked alright, because
here in our OpenMPIs 1.4.2, 1.3.2, and 1.2.8 the MPI_DOUBLE_PRECISION
value is 17, which should have safely produced MPIreal_t = MPI_REAL8.

I have a lot more code to check, but maybe I won't need to.
If the issue is really restricted to ompi_info, that would be a big
relief.

Many thanks,
Gus Correa
Re: [OMPI users] MPI_Allreduce on local machine
On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:
> On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
>> Hugo Gagnon wrote:
>>> Hi Gus,
>>> Ompi_info --all lists its info regarding fortran right after C.
>>> In my case:
>>> Fort real size: 4
>>> Fort real4 size: 4
>>> Fort real8 size: 8
>>> Fort real16 size: 16
>>> Fort dbl prec size: 4
>>> Does it make any sense to you?
>>
>> Hi Hugo
>>
>> No, dbl prec size 4 sounds weird, should be 8, I suppose,
>> same as real8, right?
>>
>> It doesn't make sense, but that's what I have (now that you told me
>> that "dbl", not "double", is the string to search for):
>>
>> $ Fort dbl prec size: 4
>> Fort dbl cplx size: 4
>> Fort dbl prec align: 4
>> Fort dbl cplx align: 4
>>
>> Is this a bug in OpenMPI perhaps?
>>
>> I didn't come across this problem, most likely because the codes
>> here don't use "double precision" but real*8 or similar.
>>
>> Also make sure you are picking the right ompi_info, mpif90/f77,
>> mpiexec.
>> Often times old versions and tangled PATH make things very confusing.
>
> This is indeed worrisome as I confirm the findings on our clusters
> both with ompi 1.3.3 and 1.4.1:
>
> ompi_info --all | grep -i fort
> ...
> Fort real size: 4
> Fort real4 size: 4
> Fort real8 size: 8
> Fort real16 size: -1
> Fort dbl prec size: 4
> Fort cplx size: 4
> Fort dbl cplx size: 4
> Fort cplx8 size: 8
> Fort cplx16 size: 16
> Fort cplx32 size: -1
> Fort integer align: 4
> Fort integer1 align: 1
> Fort integer2 align: 2
> Fort integer4 align: 4
> Fort integer8 align: 8
> Fort integer16 align: -1
> Fort real align: 4
> Fort real4 align: 4
> Fort real8 align: 8
> Fort real16 align: -1
> Fort dbl prec align: 4
> Fort cplx align: 4
> Fort dbl cplx align: 4
> Fort cplx8 align: 4
> Fort cplx16 align: 8
> ...
>
> And this is the configure output:
> checking if Fortran 77 compiler supports REAL*8... yes
> checking size of Fortran 77 REAL*8... 8
> checking for C type corresponding to REAL*8... double
> checking alignment of Fortran REAL*8... 1
> ...
> checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
> checking size of Fortran 77 DOUBLE PRECISION... 8
> checking for C type corresponding to DOUBLE PRECISION... double
> checking alignment of Fortran DOUBLE PRECISION... 1
>
> But the following code actually appears to give the correct results:
>
> program types
>   use mpi
>   implicit none
>   integer :: mpierr, size
>
>   call MPI_Init(mpierr)
>   call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
>   print*, 'double precision size: ', size
>   call MPI_Finalize(mpierr)
> end
>
> mpif90 -g types.f90
> mpiexec -n 1 ./a.out
>  double precision size:            8
>
> Thus is this a bug in ompi_info only?

answering my own question:
This does not look right:

ompi/tools/ompi_info/param.cc:

    out("Fort dbl prec size",
        "compiler:fortran:sizeof:double_precision",
        OMPI_SIZEOF_FORTRAN_REAL);

that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.

- Martin
Re: [OMPI users] MPI_Allreduce on local machine
On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
> Hugo Gagnon wrote:
>> Hi Gus,
>> Ompi_info --all lists its info regarding fortran right after C.
>> In my case:
>> Fort real size: 4
>> Fort real4 size: 4
>> Fort real8 size: 8
>> Fort real16 size: 16
>> Fort dbl prec size: 4
>> Does it make any sense to you?
>
> Hi Hugo
>
> No, dbl prec size 4 sounds weird, should be 8, I suppose,
> same as real8, right?
>
> It doesn't make sense, but that's what I have (now that you told me
> that "dbl", not "double", is the string to search for):
>
> $ Fort dbl prec size: 4
> Fort dbl cplx size: 4
> Fort dbl prec align: 4
> Fort dbl cplx align: 4
>
> Is this a bug in OpenMPI perhaps?
>
> I didn't come across this problem, most likely because the codes
> here don't use "double precision" but real*8 or similar.
>
> Also make sure you are picking the right ompi_info, mpif90/f77,
> mpiexec.
> Often times old versions and tangled PATH make things very confusing.

This is indeed worrisome as I confirm the findings on our clusters both
with ompi 1.3.3 and 1.4.1:

ompi_info --all | grep -i fort
...
Fort real size: 4
Fort real4 size: 4
Fort real8 size: 8
Fort real16 size: -1
Fort dbl prec size: 4
Fort cplx size: 4
Fort dbl cplx size: 4
Fort cplx8 size: 8
Fort cplx16 size: 16
Fort cplx32 size: -1
Fort integer align: 4
Fort integer1 align: 1
Fort integer2 align: 2
Fort integer4 align: 4
Fort integer8 align: 8
Fort integer16 align: -1
Fort real align: 4
Fort real4 align: 4
Fort real8 align: 8
Fort real16 align: -1
Fort dbl prec align: 4
Fort cplx align: 4
Fort dbl cplx align: 4
Fort cplx8 align: 4
Fort cplx16 align: 8
...

And this is the configure output:
checking if Fortran 77 compiler supports REAL*8... yes
checking size of Fortran 77 REAL*8... 8
checking for C type corresponding to REAL*8... double
checking alignment of Fortran REAL*8... 1
...
checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
checking size of Fortran 77 DOUBLE PRECISION... 8
checking for C type corresponding to DOUBLE PRECISION... double
checking alignment of Fortran DOUBLE PRECISION... 1

But the following code actually appears to give the correct results:

program types
  use mpi
  implicit none
  integer :: mpierr, size

  call MPI_Init(mpierr)
  call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
  print*, 'double precision size: ', size
  call MPI_Finalize(mpierr)
end

mpif90 -g types.f90
mpiexec -n 1 ./a.out
 double precision size:            8

Thus is this a bug in ompi_info only?

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
IT Services                       phone: 778 782-4691
Simon Fraser University           fax:   778 782-4242
Burnaby, British Columbia         email: sieg...@sfu.ca
Canada V5A 1S6
Re: [OMPI users] MPI_Allreduce on local machine
On Wed, 2010-07-28 at 11:48 -0400, Gus Correa wrote:
> Hi Hugo, Jeff, list
>
> Hugo: I think David Zhang's suggestion was to use
> MPI_REAL8, not MPI_REAL, instead of MPI_DOUBLE_PRECISION in your
> MPI_Allreduce call.
>
> Still, to me it looks like OpenMPI is making double precision 4 bytes
> long, which is shorter than I expected it to be (8 bytes),
> at least when looking at the output of ompi_info --all.
>
> I always get a size of 4 for dbl prec on my x86_64 machine,
> from ompi_info --all.
> I confirmed this in six builds of OpenMPI 1.4.2: gcc+gfortran,
> gcc+pgf90, gcc+ifort, icc+ifort, pgcc+pgf90, and opencc+openf95.
> Although the output of ompi_info never says this is actually the size
> of MPI_DOUBLE_PRECISION, just of "dbl prec", which is a bit ambiguous.
>
> FWIW, I include the output below.  Note that alignment for gcc+ifort
> is 1, all others are 4.
>
> Jeff: Is this correct?

This is wrong; it should be 8, and alignment should be 8 even for Intel.
And I also see exactly the same thing.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se
Phone: +46 90 7866134   Fax: +46 90 7866126   Mobile: +46 70 7716134
WWW: http://www.hpc2n.umu.se
Re: [OMPI users] MPI_Allreduce on local machine
Here they are.

-- 
Hugo Gagnon

On Wed, 28 Jul 2010 12:01 -0400, "Jeff Squyres" wrote:
> On Jul 28, 2010, at 11:55 AM, Gus Correa wrote:
>
>> I surely can send you the logs, but they're big.
>> Off the list perhaps?
>
> If they're still big when compressed, sure, send them to me off list.
>
> But I think I'd be more interested to see Hugo's logs.  :-)
>
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

ompi-output.tar.bz2
Description: BZip2 compressed data
Re: [OMPI users] MPI_Allreduce on local machine
Hi Jeff

I surely can send you the logs, but they're big.
Off the list perhaps?

Thanks,
Gus

Jeff Squyres wrote:
> On Jul 28, 2010, at 11:19 AM, Gus Correa wrote:
>
>>> Ompi_info --all lists its info regarding fortran right after C.
>>> In my
>
> Ummm... right.  I should know that.  I wrote ompi_info, after all. :-)
> I ran "ompi_info -all | grep -i fortran" and didn't see the fortran
> info, and I forgot that I put that stuff in there.  Oops!  :-)
>
>>> case:
>>> Fort real size: 4
>>> Fort real4 size: 4
>>> Fort real8 size: 8
>>> Fort real16 size: 16
>>> Fort dbl prec size: 4
>
> That does seem weird.
>
>> No, dbl prec size 4 sounds weird, should be 8, I suppose,
>> same as real8, right?
>>
>> It doesn't make sense, but that's what I have (now that you told me
>> that "dbl", not "double", is the string to search for):
>>
>> $ Fort dbl prec size: 4
>> Fort dbl cplx size: 4
>> Fort dbl prec align: 4
>> Fort dbl cplx align: 4
>>
>> Is this a bug in OpenMPI perhaps?
>
> Getting sizeof and alignment of Fortran variable types is very
> problematic.  Can you send the stdout/stderr of configure and the
> config.log?
>
> http://www.open-mpi.org/community/help/
Re: [OMPI users] MPI_Allreduce on local machine
Hi Hugo, Jeff, list

Hugo: I think David Zhang's suggestion was to use
MPI_REAL8, not MPI_REAL, instead of MPI_DOUBLE_PRECISION in your
MPI_Allreduce call.

Still, to me it looks like OpenMPI is making double precision 4 bytes
long, which is shorter than I expected it to be (8 bytes),
at least when looking at the output of ompi_info --all.

I always get a size of 4 for dbl prec on my x86_64 machine,
from ompi_info --all.
I confirmed this in six builds of OpenMPI 1.4.2: gcc+gfortran,
gcc+pgf90, gcc+ifort, icc+ifort, pgcc+pgf90, and opencc+openf95.
Although the output of ompi_info never says this is actually the size
of MPI_DOUBLE_PRECISION, just of "dbl prec", which is a bit ambiguous.

FWIW, I include the output below.  Note that alignment for gcc+ifort
is 1, all others are 4.

Jeff: Is this correct?

Thanks,
Gus Correa

$ openmpi/1.4.2/open64-4.2.3-0/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

$ openmpi/1.4.2/gnu-4.1.2/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

$ openmpi/1.4.2/gnu-4.1.2-intel-10.1.017/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 1
 Fort dbl cplx align: 1

$ openmpi/1.4.2/gnu-4.1.2-pgi-8.0-4/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

$ openmpi/1.4.2/pgi-8.0-4/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

Hugo Gagnon wrote:
> And how do I know how big my data buffer is?  I ran MPI_TYPE_EXTENT on
> MPI_DOUBLE_PRECISION and the result was 8.  So I changed my program to:
>
>  1 program test
>  2
>  3   use mpi
>  4
>  5   implicit none
>  6
>  7   integer :: ierr, nproc, myrank
>  8   !integer, parameter :: dp = kind(1.d0)
>  9   real(kind=8) :: inside(5), outside(5)
> 10
> 11   call mpi_init(ierr)
> 12   call mpi_comm_size(mpi_comm_world, nproc, ierr)
> 13   call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> 14
> 15   inside = (/ 1., 2., 3., 4., 5. /)
> 16   call mpi_allreduce(inside, outside, 5, mpi_real, mpi_sum,
>        mpi_comm_world, ierr)
> 17
> 18   if (myrank == 0) then
> 19     print*, outside
> 20   end if
> 21
> 22   call mpi_finalize(ierr)
> 23
> 24 end program test
>
> but I still get a SIGSEGV fault:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC            Routine   Line     Source
> libmpi.0.dylib     0001001BB4B7  Unknown   Unknown  Unknown
> libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown  Unknown
> a.out              00010D87      _MAIN__   16       test.f90
> a.out              00010C9C      Unknown   Unknown  Unknown
> a.out              00010C34      Unknown   Unknown  Unknown
> (identical traceback repeated by the second process)
>
> What is wrong now?
Re: [OMPI users] MPI_Allreduce on local machine
Hugo Gagnon wrote:
> Hi Gus,
> Ompi_info --all lists its info regarding fortran right after C.
> In my case:
> Fort real size: 4
> Fort real4 size: 4
> Fort real8 size: 8
> Fort real16 size: 16
> Fort dbl prec size: 4
> Does it make any sense to you?

Hi Hugo

No, dbl prec size 4 sounds weird, should be 8, I suppose,
same as real8, right?

It doesn't make sense, but that's what I have (now that you told me
that "dbl", not "double", is the string to search for):

$ Fort dbl prec size: 4
Fort dbl cplx size: 4
Fort dbl prec align: 4
Fort dbl cplx align: 4

Is this a bug in OpenMPI perhaps?

I didn't come across this problem, most likely because the codes here
don't use "double precision" but real*8 or similar.

Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
Often times old versions and tangled PATH make things very confusing.

Gus Correa
Re: [OMPI users] MPI_Allreduce on local machine
I meant to write:

call mpi_allreduce(inside, outside, 5, mpi_real, mpi_double_precision,
     mpi_comm_world, ierr)

-- 
Hugo Gagnon

On Wed, 28 Jul 2010 09:33 -0400, "Hugo Gagnon" wrote:
> And how do I know how big my data buffer is?  I ran MPI_TYPE_EXTENT on
> MPI_DOUBLE_PRECISION and the result was 8.  So I changed my program to:
>
>  1 program test
>  2
>  3   use mpi
>  4
>  5   implicit none
>  6
>  7   integer :: ierr, nproc, myrank
>  8   !integer, parameter :: dp = kind(1.d0)
>  9   real(kind=8) :: inside(5), outside(5)
> 10
> 11   call mpi_init(ierr)
> 12   call mpi_comm_size(mpi_comm_world, nproc, ierr)
> 13   call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> 14
> 15   inside = (/ 1., 2., 3., 4., 5. /)
> 16   call mpi_allreduce(inside, outside, 5, mpi_real, mpi_sum,
>        mpi_comm_world, ierr)
> 17
> 18   if (myrank == 0) then
> 19     print*, outside
> 20   end if
> 21
> 22   call mpi_finalize(ierr)
> 23
> 24 end program test
>
> but I still get a SIGSEGV fault:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC            Routine   Line     Source
> libmpi.0.dylib     0001001BB4B7  Unknown   Unknown  Unknown
> libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown  Unknown
> a.out              00010D87      _MAIN__   16       test.f90
> a.out              00010C9C      Unknown   Unknown  Unknown
> a.out              00010C34      Unknown   Unknown  Unknown
> (identical traceback repeated by the second process)
>
> What is wrong now?
> -- 
> Hugo Gagnon
>
> On Wed, 28 Jul 2010 07:56 -0400, "Jeff Squyres" wrote:
>> On Jul 27, 2010, at 4:19 PM, Gus Correa wrote:
>>
>>> Is there a simple way to check the number of bytes associated to
>>> each MPI basic type of OpenMPI on a specific machine (or
>>> machine+compiler)?
>>> Something that would come out easily, say, from ompi_info?
>>
>> Not via ompi_info, but the MPI function MPI_GET_EXTENT will tell you
>> the datatype's size.
>>
>> -
>> [4:54] svbu-mpi:~/mpi % cat extent.f90
>> program main
>>   use mpi
>>   implicit none
>>   integer ierr, ext
>>
>>   call MPI_INIT(ierr)
>>   call MPI_TYPE_EXTENT(MPI_DOUBLE_PRECISION, ext, ierr)
>>   print *, 'Type extent of DOUBLE_PREC is', ext
>>   call MPI_FINALIZE(ierr)
>> end
>> [4:54] svbu-mpi:~/mpi % mpif90 extent.f90 -o extent -g
>> [4:54] svbu-mpi:~/mpi % ./extent
>>  Type extent of DOUBLE_PREC is           8
>> [4:54] svbu-mpi:~/mpi %
>> -
Re: [OMPI users] MPI_Allreduce on local machine
I installed with:

    ./configure --prefix=/opt/openmpi CC=icc CXX=icpc F77=ifort FC=ifort
    make all install

I would gladly give you a corefile but I have no idea how to produce one, I'm just an end user...
--
Hugo Gagnon

On Wed, 28 Jul 2010 08:57 -0400, "Jeff Squyres" wrote:
> I don't have the intel compilers on my Mac, but I'm unable to replicate
> this issue on Linux with the intel compilers v11.0.
>
> Can you get a corefile to see a backtrace where it died in Open MPI's
> allreduce?
>
> How exactly did you configure your Open MPI, and how exactly did you
> compile / run your sample application?
>
> On Jul 27, 2010, at 10:35 PM, Hugo Gagnon wrote:
> > I did and it runs now, but the result is wrong: outside is still 1.d0,
> > 2.d0, 3.d0, 4.d0, 5.d0
> > How can I make sure to compile OpenMPI so that datatypes such as
> > mpi_double_precision behave as they "should"?
> > Are there flags during the OpenMPI building process or something?
> > Thanks,
> > --
> > Hugo Gagnon
> >
> > On Tue, 27 Jul 2010 09:06 -0700, "David Zhang" wrote:
> > > Try mpi_real8 for the type in allreduce
> > >
> > > On 7/26/10, Hugo Gagnon wrote:
> > > > Hello,
> > > >
> > > > When I compile and run this code snippet:
> > > >
> > > >  1 program test
> > > >  2
> > > >  3 use mpi
> > > >  4
> > > >  5 implicit none
> > > >  6
> > > >  7 integer :: ierr, nproc, myrank
> > > >  8 integer, parameter :: dp = kind(1.d0)
> > > >  9 real(kind=dp) :: inside(5), outside(5)
> > > > 10
> > > > 11 call mpi_init(ierr)
> > > > 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> > > > 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> > > > 14
> > > > 15 inside = (/ 1, 2, 3, 4, 5 /)
> > > > 16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> > > >    mpi_sum, mpi_comm_world, ierr)
> > > > 17
> > > > 18 print*, myrank, inside
> > > > 19 print*, outside
> > > > 20
> > > > 21 call mpi_finalize(ierr)
> > > > 22
> > > > 23 end program test
> > > >
> > > > I get the following error, with say 2 processors:
> > > >
> > > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > > Image              PC            Routine  Line     Source
> > > > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > > > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > > > a.out              00010CE2      MAIN__   16       test.f90
> > > > a.out              00010BDC      Unknown  Unknown  Unknown
> > > > a.out              00010B74      Unknown  Unknown  Unknown
> > > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > > Image              PC            Routine  Line     Source
> > > > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > > > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > > > a.out              00010CE2      MAIN__   16       test.f90
> > > > a.out              00010BDC      Unknown  Unknown  Unknown
> > > > a.out              00010B74      Unknown  Unknown  Unknown
> > > >
> > > > on my iMac having compiled OpenMPI with ifort according to:
> > > > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> > > >
> > > > Note that the above code snippet runs fine on my school parallel cluster
> > > > where ifort+intelmpi is installed.
> > > > Is there something special about OpenMPI's MPI_Allreduce function call
> > > > that I should be aware of?
> > > >
> > > > Thanks,
> > > > --
> > > > Hugo Gagnon
> > > >
> > > > ___
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > > --
> > > Sent from my mobile device
> > >
> > > David Zhang
> > > University of California, San Diego
> >
> > --
> > Hugo Gagnon
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/

--
Hugo Gagnon
Re: [OMPI users] MPI_Allreduce on local machine
And how do I know how big my data buffer is? I ran MPI_TYPE_EXTENT of
MPI_DOUBLE_PRECISION and the result was 8. So I changed my program to:

 1 program test
 2
 3 use mpi
 4
 5 implicit none
 6
 7 integer :: ierr, nproc, myrank
 8 !integer, parameter :: dp = kind(1.d0)
 9 real(kind=8) :: inside(5), outside(5)
10
11 call mpi_init(ierr)
12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
14
15 inside = (/ 1., 2., 3., 4., 5. /)
16 call mpi_allreduce(inside, outside, 5, mpi_real, mpi_sum, mpi_comm_world, ierr)
17
18 if (myrank == 0) then
19 print*, outside
20 end if
21
22 call mpi_finalize(ierr)
23
24 end program test

but I still get a SIGSEGV fault:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC            Routine  Line     Source
libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
a.out              00010D87      MAIN__   16       test.f90
a.out              00010C9C      Unknown  Unknown  Unknown
a.out              00010C34      Unknown  Unknown  Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC            Routine  Line     Source
libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
a.out              00010D87      MAIN__   16       test.f90
a.out              00010C9C      Unknown  Unknown  Unknown
a.out              00010C34      Unknown  Unknown  Unknown

What is wrong now?
--
Hugo Gagnon

On Wed, 28 Jul 2010 07:56 -0400, "Jeff Squyres" wrote:
> On Jul 27, 2010, at 4:19 PM, Gus Correa wrote:
> > Is there a simple way to check the number of bytes associated to each
> > MPI basic type of OpenMPI on a specific machine (or machine+compiler)?
> >
> > Something that would come out easily, say, from ompi_info?
>
> Not via ompi_info, but the MPI function MPI_GET_EXTENT will tell you the
> datatype's size.
>
> -
> [4:54] svbu-mpi:~/mpi % cat extent.f90
> program main
>   use mpi
>   implicit none
>   integer ierr, ext
>
>   call MPI_INIT(ierr)
>   call MPI_TYPE_EXTENT(MPI_DOUBLE_PRECISION, ext, ierr)
>   print *, 'Type extent of DOUBLE_PREC is', ext
>   call MPI_FINALIZE(ierr)
> end
> [4:54] svbu-mpi:~/mpi % mpif90 extent.f90 -o extent -g
> [4:54] svbu-mpi:~/mpi % ./extent
> Type extent of DOUBLE_PREC is 8
> [4:54] svbu-mpi:~/mpi %
> -
>
> --
> Jeff Squyres
> jsquy...@cisco.com

--
Hugo Gagnon
Re: [OMPI users] MPI_Allreduce on local machine
I don't have the intel compilers on my Mac, but I'm unable to replicate this
issue on Linux with the intel compilers v11.0.

Can you get a corefile to see a backtrace where it died in Open MPI's
allreduce?

How exactly did you configure your Open MPI, and how exactly did you
compile / run your sample application?

On Jul 27, 2010, at 10:35 PM, Hugo Gagnon wrote:
> I did and it runs now, but the result is wrong: outside is still 1.d0,
> 2.d0, 3.d0, 4.d0, 5.d0
> How can I make sure to compile OpenMPI so that datatypes such as
> mpi_double_precision behave as they "should"?
> Are there flags during the OpenMPI building process or something?
> Thanks,
> --
> Hugo Gagnon
>
> On Tue, 27 Jul 2010 09:06 -0700, "David Zhang" wrote:
> > Try mpi_real8 for the type in allreduce
> >
> > On 7/26/10, Hugo Gagnon wrote:
> > > Hello,
> > >
> > > When I compile and run this code snippet:
> > >
> > >  1 program test
> > >  2
> > >  3 use mpi
> > >  4
> > >  5 implicit none
> > >  6
> > >  7 integer :: ierr, nproc, myrank
> > >  8 integer, parameter :: dp = kind(1.d0)
> > >  9 real(kind=dp) :: inside(5), outside(5)
> > > 10
> > > 11 call mpi_init(ierr)
> > > 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> > > 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> > > 14
> > > 15 inside = (/ 1, 2, 3, 4, 5 /)
> > > 16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> > >    mpi_sum, mpi_comm_world, ierr)
> > > 17
> > > 18 print*, myrank, inside
> > > 19 print*, outside
> > > 20
> > > 21 call mpi_finalize(ierr)
> > > 22
> > > 23 end program test
> > >
> > > I get the following error, with say 2 processors:
> > >
> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > Image              PC            Routine  Line     Source
> > > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > > a.out              00010CE2      MAIN__   16       test.f90
> > > a.out              00010BDC      Unknown  Unknown  Unknown
> > > a.out              00010B74      Unknown  Unknown  Unknown
> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > Image              PC            Routine  Line     Source
> > > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > > a.out              00010CE2      MAIN__   16       test.f90
> > > a.out              00010BDC      Unknown  Unknown  Unknown
> > > a.out              00010B74      Unknown  Unknown  Unknown
> > >
> > > on my iMac having compiled OpenMPI with ifort according to:
> > > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> > >
> > > Note that the above code snippet runs fine on my school parallel cluster
> > > where ifort+intelmpi is installed.
> > > Is there something special about OpenMPI's MPI_Allreduce function call
> > > that I should be aware of?
> > >
> > > Thanks,
> > > --
> > > Hugo Gagnon
> >
> > --
> > Sent from my mobile device
> >
> > David Zhang
> > University of California, San Diego
>
> --
> Hugo Gagnon

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] MPI_Allreduce on local machine
On Jul 27, 2010, at 11:21 AM, Hugo Gagnon wrote:
> I appreciate your replies but my question has to do with the function
> MPI_Allreduce of OpenMPI built on a Mac OSX 10.6 with ifort (intel
> fortran compiler).

The implication I was going for was that if you were using
MPI_DOUBLE_PRECISION with a data buffer that wasn't actually double
precision, Bad Things would happen inside the allreduce because OMPI would
likely read/write beyond the end of your buffer.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] MPI_Allreduce on local machine
On Tue, 2010-07-27 at 16:19 -0400, Gus Correa wrote:
> Hi Hugo, David, Jeff, Terry, Anton, list
>
> I suppose maybe we're guessing that somehow on Hugo's iMac
> MPI_DOUBLE_PRECISION may not have as many bytes as dp = kind(1.d0),
> hence the segmentation fault on MPI_Allreduce.
>
> Question:
>
> Is there a simple way to check the number of bytes associated to each
> MPI basic type of OpenMPI on a specific machine (or machine+compiler)?
>
> Something that would come out easily, say, from ompi_info?

bit_size() will give you the integer size. For reals, digits() will give
you a hint, but the Fortran real data model is designed to not tie you to a
particular representation (my interpretation), so there's no language
feature to give a simple word size.
Re: [OMPI users] MPI_Allreduce on local machine
I did and it runs now, but the result is wrong: outside is still 1.d0,
2.d0, 3.d0, 4.d0, 5.d0
How can I make sure to compile OpenMPI so that datatypes such as
mpi_double_precision behave as they "should"?
Are there flags during the OpenMPI building process or something?
Thanks,
--
Hugo Gagnon

On Tue, 27 Jul 2010 09:06 -0700, "David Zhang" wrote:
> Try mpi_real8 for the type in allreduce
>
> On 7/26/10, Hugo Gagnon wrote:
> > Hello,
> >
> > When I compile and run this code snippet:
> >
> >  1 program test
> >  2
> >  3 use mpi
> >  4
> >  5 implicit none
> >  6
> >  7 integer :: ierr, nproc, myrank
> >  8 integer, parameter :: dp = kind(1.d0)
> >  9 real(kind=dp) :: inside(5), outside(5)
> > 10
> > 11 call mpi_init(ierr)
> > 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> > 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> > 14
> > 15 inside = (/ 1, 2, 3, 4, 5 /)
> > 16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> >    mpi_sum, mpi_comm_world, ierr)
> > 17
> > 18 print*, myrank, inside
> > 19 print*, outside
> > 20
> > 21 call mpi_finalize(ierr)
> > 22
> > 23 end program test
> >
> > I get the following error, with say 2 processors:
> >
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC            Routine  Line     Source
> > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > a.out              00010CE2      MAIN__   16       test.f90
> > a.out              00010BDC      Unknown  Unknown  Unknown
> > a.out              00010B74      Unknown  Unknown  Unknown
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC            Routine  Line     Source
> > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > a.out              00010CE2      MAIN__   16       test.f90
> > a.out              00010BDC      Unknown  Unknown  Unknown
> > a.out              00010B74      Unknown  Unknown  Unknown
> >
> > on my iMac having compiled OpenMPI with ifort according to:
> > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> >
> > Note that the above code snippet runs fine on my school parallel cluster
> > where ifort+intelmpi is installed.
> > Is there something special about OpenMPI's MPI_Allreduce function call
> > that I should be aware of?
> >
> > Thanks,
> > --
> > Hugo Gagnon
>
> --
> Sent from my mobile device
>
> David Zhang
> University of California, San Diego

--
Hugo Gagnon
Re: [OMPI users] MPI_Allreduce on local machine
Hi Hugo, David, Jeff, Terry, Anton, list

I suppose maybe we're guessing that somehow on Hugo's iMac
MPI_DOUBLE_PRECISION may not have as many bytes as dp = kind(1.d0),
hence the segmentation fault on MPI_Allreduce.

Question:

Is there a simple way to check the number of bytes associated to each
MPI basic type of OpenMPI on a specific machine (or machine+compiler)?

Something that would come out easily, say, from ompi_info?
The information I get is C-centered: :(

$ ompi_info --all | grep -i double
          C double size: 8
         C double align: 8

If not possible yet, please consider it a feature request ... :)
(Or is this perhaps against the "opacity" in the MPI standard?)

I poked around on the OpenMPI include directory to no avail.
MPI_DOUBLE_PRECISION is defined as a constant (it is 17 here) which
doesn't seem to be related to the actual size in bytes.

I found some stuff on my OpenMPI config.log, though:

$ grep -i double_precision config.log
... (tons of lines)
ompi_cv_f77_alignment_DOUBLE_PRECISION=8
ompi_cv_f77_have_DOUBLE_PRECISION=yes
ompi_cv_f77_sizeof_DOUBLE_PRECISION=8
ompi_cv_f90_have_DOUBLE_PRECISION=yes
ompi_cv_f90_sizeof_DOUBLE_PRECISION=8
ompi_cv_find_type_DOUBLE_PRECISION=double
OMPI_SIZEOF_F90_DOUBLE_PRECISION='8'
#define OMPI_HAVE_FORTRAN_DOUBLE_PRECISION 1
#define OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION 8
#define OMPI_ALIGNMENT_FORTRAN_DOUBLE_PRECISION 8
#define ompi_fortran_double_precision_t double
#define OMPI_HAVE_F90_DOUBLE_PRECISION 1

Thank you,
Gus Correa

David Zhang wrote:
> Try mpi_real8 for the type in allreduce
>
> On 7/26/10, Hugo Gagnon wrote:
> > Hello,
> >
> > When I compile and run this code snippet:
> >
> >  1 program test
> >  2
> >  3 use mpi
> >  4
> >  5 implicit none
> >  6
> >  7 integer :: ierr, nproc, myrank
> >  8 integer, parameter :: dp = kind(1.d0)
> >  9 real(kind=dp) :: inside(5), outside(5)
> > 10
> > 11 call mpi_init(ierr)
> > 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> > 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> > 14
> > 15 inside = (/ 1, 2, 3, 4, 5 /)
> > 16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> >    mpi_sum, mpi_comm_world, ierr)
> > 17
> > 18 print*, myrank, inside
> > 19 print*, outside
> > 20
> > 21 call mpi_finalize(ierr)
> > 22
> > 23 end program test
> >
> > I get the following error, with say 2 processors:
> >
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC            Routine  Line     Source
> > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > a.out              00010CE2      MAIN__   16       test.f90
> > a.out              00010BDC      Unknown  Unknown  Unknown
> > a.out              00010B74      Unknown  Unknown  Unknown
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC            Routine  Line     Source
> > libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> > libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> > a.out              00010CE2      MAIN__   16       test.f90
> > a.out              00010BDC      Unknown  Unknown  Unknown
> > a.out              00010B74      Unknown  Unknown  Unknown
> >
> > on my iMac having compiled OpenMPI with ifort according to:
> > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> >
> > Note that the above code snippet runs fine on my school parallel cluster
> > where ifort+intelmpi is installed.
> > Is there something special about OpenMPI's MPI_Allreduce function call
> > that I should be aware of?
> >
> > Thanks,
> > --
> > Hugo Gagnon
Re: [OMPI users] MPI_Allreduce on local machine
Try mpi_real8 for the type in allreduce

On 7/26/10, Hugo Gagnon wrote:
> Hello,
>
> When I compile and run this code snippet:
>
>  1 program test
>  2
>  3 use mpi
>  4
>  5 implicit none
>  6
>  7 integer :: ierr, nproc, myrank
>  8 integer, parameter :: dp = kind(1.d0)
>  9 real(kind=dp) :: inside(5), outside(5)
> 10
> 11 call mpi_init(ierr)
> 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> 14
> 15 inside = (/ 1, 2, 3, 4, 5 /)
> 16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
>    mpi_sum, mpi_comm_world, ierr)
> 17
> 18 print*, myrank, inside
> 19 print*, outside
> 20
> 21 call mpi_finalize(ierr)
> 22
> 23 end program test
>
> I get the following error, with say 2 processors:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC            Routine  Line     Source
> libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> a.out              00010CE2      MAIN__   16       test.f90
> a.out              00010BDC      Unknown  Unknown  Unknown
> a.out              00010B74      Unknown  Unknown  Unknown
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC            Routine  Line     Source
> libmpi.0.dylib     0001001BB4B7  Unknown  Unknown  Unknown
> libmpi_f77.0.dyli  0001000AF046  Unknown  Unknown  Unknown
> a.out              00010CE2      MAIN__   16       test.f90
> a.out              00010BDC      Unknown  Unknown  Unknown
> a.out              00010B74      Unknown  Unknown  Unknown
>
> on my iMac having compiled OpenMPI with ifort according to:
> http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
>
> Note that the above code snippet runs fine on my school parallel cluster
> where ifort+intelmpi is installed.
> Is there something special about OpenMPI's MPI_Allreduce function call
> that I should be aware of?
>
> Thanks,
> --
> Hugo Gagnon

--
Sent from my mobile device

David Zhang
University of California, San Diego
Re: [OMPI users] MPI_Allreduce on local machine
On Tue, Jul 27, 2010 at 08:11:39AM -0400, Jeff Squyres wrote:
> On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
> > 8 integer, parameter :: dp = kind(1.d0)
> > 9 real(kind=dp) :: inside(5), outside(5)
>
> I'm not a fortran expert -- is kind(1.d0) really double precision? According
> to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.html, kind(2) is
> double precision (but that's for a different compiler, and I don't quite grok
> the ".d0" notation).

*quote*
kind (x) has type default integer and value equal to the kind type
parameter value of x.
*end quote*
p. 161 from Metcalf et al (2007) Fortran 95/2003 explained.

% cat tmp.f90
program z
  integer, parameter :: dp = kind(1.d0)
  real(kind=dp) :: inside(5), outside(5)
  write(*,*) dp
end program z
% g95 -L/usr/local/lib tmp.f90
% ./a.out
 8
%

Kind 8 is (on most arch) an 8-byte real, i.e. typically double precision.

--
Anton Shterenlikht
Room 2.6, Queen's Building Mech Eng Dept Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423
Re: [OMPI users] MPI_Allreduce on local machine
On Tue, 2010-07-27 at 08:11 -0400, Jeff Squyres wrote:
> On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
> > 8 integer, parameter :: dp = kind(1.d0)
> > 9 real(kind=dp) :: inside(5), outside(5)
>
> I'm not a fortran expert -- is kind(1.d0) really double precision? According
> to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.html, kind(2) is
> double precision (but that's for a different compiler, and I don't quite grok
> the ".d0" notation).

Urgh! Thank heavens gcc has moved away from that stupid idea.

kind=8 is normally double precision (and is with gfortran).
kind(1.0d0) is always double precision. The d (as opposed to e) means DP.