Re: [OMPI users] MPI_Allreduce on local machine

2010-08-10 Thread Gus Correa

Hi Jeff

Thank you for opening a ticket and taking care of this.

Jeff Squyres wrote:

On Jul 28, 2010, at 5:07 PM, Gus Correa wrote:


Still, the alignment under Intel may or may not be right.
And this may or may not explain the errors that Hugo has got.

FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
report exactly the same as OpenMPI 1.4.2, namely
Fort dbl prec size: 4  and
Fort dbl prec align: 4,
except that *if the Intel Fortran compiler (ifort) was used*
I get 1 byte alignment:
Fort dbl prec align: 1

So, this issue has been around for a while,
and involves both the size and the alignment (in Intel)
of double precision.


Yes, it's quite problematic to try to determine the alignment 
of Fortran types -- compilers can do different things 
and there's no reliable way (that I know of, at least) 
to absolutely get the "native" alignment.




I can imagine this is not easy, especially with the large variety
of architectures, compilers, and environments that OpenMPI handles.

That being said, we didn't previously find any correctness 
issues with using an alignment of 1.




Does it affect only the information
provided by ompi_info, as Martin Siegert suggested?

Or does it really affect the actual alignment of
MPI types when OpenMPI is compiled with Intel,
as Martin, Åke Sandgren, Hugo Gagnon, and I
thought it might?


We have a number of pieces of code here where grep shows
MPI_DOUBLE_PRECISION.
Not sure how much of it has actually been active, as there are always
lots of cpp directives to select active code.

In particular I found this interesting snippet:

 if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
! and LAM MPI's allreduce does not accept on MPI_REAL8
MPIreal_t= MPI_DOUBLE_PRECISION
 else
MPIreal_t= MPI_REAL8
 endif


This kind of thing shouldn't be an issue with Open MPI, right?



Yes, you are right.
Actually, I checked (and wrote in my posting)
that OpenMPI MPI_DOUBLE_PRECISION = 17, so the code above
falls through to the "else" branch and redefines everything
as MPI_REAL8 instead, hence MPI_DOUBLE_PRECISION
is never actually used *in this source file*.

BTW, I didn't write this code or the comments.
The source file is part of CCSM4/CAM4,
a big, widely used, public-domain climate/atmosphere model:

http://www.cesm.ucar.edu/models/ccsm4.0/

This particular source file (parallel_mod.F90, circa line 169)
hasn't been used in previous incarnations of these programs 
(CAM3/CCSM3), which we ran extensively here, using OpenMPI.

In the old CAM3/CCSM3 most (perhaps all) of the 8-byte
floating point data are declared as real*8 or with the "kind" attribute,
not as double precision.

However, not only this source file but many other source files
in the new CCSM4/CAM4 declare 8-byte floating point data
as double precision,
and use MPI_DOUBLE_PRECISION in MPI function calls.
This style is a bit outdated, as Fortran 90 tends to replace
"double precision" with a kind-based declaration such as
"real(kind(0.d0))", which is what Hugo did in his example.

My concern is that we just started experimenting with CAM4/CCSM4,
and the plan was to use OpenMPI libraries compiled with Intel.

FWIW, OMPI uses different numbers for MPI_DOUBLE_PRECISION and MPI_REAL8 
than LAM.  They're distinct MPI datatypes because they *could* be different.




Yes, I understand the two different MPI_ constants should be kept,
although the actual values of their size and alignment may be the same
on specific architectures (e.g. x86_64).

Many thanks,
Gus Correa


Re: [OMPI users] MPI_Allreduce on local machine

2010-08-09 Thread Jeff Squyres
On Jul 28, 2010, at 5:07 PM, Gus Correa wrote:

> Still, the alignment under Intel may or may not be right.
> And this may or may not explain the errors that Hugo has got.
> 
> FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
> report exactly the same as OpenMPI 1.4.2, namely
> Fort dbl prec size: 4  and
> Fort dbl prec align: 4,
> except that *if the Intel Fortran compiler (ifort) was used*
> I get 1 byte alignment:
> Fort dbl prec align: 1
> 
> So, this issue has been around for a while,
> and involves both the size and the alignment (in Intel)
> of double precision.

Yes, it's quite problematic to try to determine the alignment of Fortran types 
-- compilers can do different things and there's no reliable way (that I know 
of, at least) to absolutely get the "native" alignment.

That being said, we didn't previously find any correctness issues with using an 
alignment of 1.

> We have a number of pieces of code here where grep shows
> MPI_DOUBLE_PRECISION.
> Not sure how much of it has actually been active, as there are always
> lots of cpp directives to select active code.
> 
> In particular I found this interesting snippet:
> 
>  if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
> ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
> ! and LAM MPI's allreduce does not accept on MPI_REAL8
> MPIreal_t= MPI_DOUBLE_PRECISION
>  else
> MPIreal_t= MPI_REAL8
>  endif

This kind of thing shouldn't be an issue with Open MPI, right?

FWIW, OMPI uses different numbers for MPI_DOUBLE_PRECISION and MPI_REAL8 than 
LAM.  They're distinct MPI datatypes because they *could* be different.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Allreduce on local machine

2010-08-09 Thread Jeff Squyres
On Jul 28, 2010, at 12:21 PM, Åke Sandgren wrote:

> > Jeff:  Is this correct?
> 
> This is wrong, it should be 8, and alignment should be 8 even for Intel.
> And I also see exactly the same thing.

Good catch!

I just fixed this in https://svn.open-mpi.org/trac/ompi/changeset/23580 -- it 
looks like a copy-n-paste error in displaying the Fortran sizes/alignments in 
ompi_info.  It probably happened when ompi_info was converted from C++ to C.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Hugo Gagnon
I also get 8 from "call MPI_Type_size(MPI_DOUBLE_PRECISION, size,
mpierr)", but really I don't think this is the issue anymore. I mean I
checked on my school cluster where OpenMPI has also been compiled with
the intel64 compilers, and "Fort dbl prec size:" also returns 4, but
unlike on my Mac the code runs fine there. I am just saying that we
should stop worrying about ompi_info output and wait until Jeff Squyres
analyses my build output files that I sent to the list earlier. I might
be wrong too as I have no idea of what's going on.
-- 
  Hugo Gagnon


On Wed, 28 Jul 2010 17:07 -0400, "Gus Correa" 
wrote:
> Hi All
> 
> Martin Siegert wrote:
> > On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:
> >> On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
> >>> Hugo Gagnon wrote:
>  Hi Gus,
>  Ompi_info --all lists its info regarding fortran right after C. In my
>  case:
>    Fort real size: 4
>   Fort real4 size: 4
>   Fort real8 size: 8
>  Fort real16 size: 16
>    Fort dbl prec size: 4
>  Does it make any sense to you?
> >>> Hi Hugo
> >>>
> >>> No, dbl prec size 4 sounds weird, should be 8, I suppose,
> >>> same as real8, right?
> >>>
> >>> It doesn't make sense, but that's what I have (now that you told me
> >>> that "dbl" , not "double", is the string to search for):
> >>>
> >>> $  Fort dbl prec size: 4
> >>>   Fort dbl cplx size: 4
> >>>  Fort dbl prec align: 4
> >>>  Fort dbl cplx align: 4
> >>>
> >>> Is this a bug in OpenMPI perhaps?
> >>>
> >>> I didn't come across this problem, most likely because
> >>> the codes here don't use "double precision" but real*8 or similar.
> >>>
> >>> Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
> >>> Often times old versions and tangled PATH make things very confusing.
> >> This is indeed worrisome as I confirm the findings on our clusters both
> >> with ompi 1.3.3 and 1.4.1:
> >>
> >> ompi_info --all | grep -i fort
> >> ...
> >>   Fort real size: 4
> >>  Fort real4 size: 4
> >>  Fort real8 size: 8
> >> Fort real16 size: -1
> >>   Fort dbl prec size: 4
> >>   Fort cplx size: 4
> >>   Fort dbl cplx size: 4
> >>  Fort cplx8 size: 8
> >> Fort cplx16 size: 16
> >> Fort cplx32 size: -1
> >>   Fort integer align: 4
> >>  Fort integer1 align: 1
> >>  Fort integer2 align: 2
> >>  Fort integer4 align: 4
> >>  Fort integer8 align: 8
> >> Fort integer16 align: -1
> >>  Fort real align: 4
> >> Fort real4 align: 4
> >> Fort real8 align: 8
> >>Fort real16 align: -1
> >>  Fort dbl prec align: 4
> >>  Fort cplx align: 4  
> >>  Fort dbl cplx align: 4  
> >> Fort cplx8 align: 4  
> >>Fort cplx16 align: 8  
> >> ...
> >>
> >> And this is the configure output:
> >> checking if Fortran 77 compiler supports REAL*8... yes
> >> checking size of Fortran 77 REAL*8... 8
> >> checking for C type corresponding to REAL*8... double
> >> checking alignment of Fortran REAL*8... 1
> >> ...
> >> checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
> >> checking size of Fortran 77 DOUBLE PRECISION... 8
> >> checking for C type corresponding to DOUBLE PRECISION... double
> >> checking alignment of Fortran DOUBLE PRECISION... 1
> >>
> >> But the following code actually appears to give the correct results:
> >>
> >> program types
> >> use mpi
> >> implicit none
> >> integer :: mpierr, size
> >>
> >>call MPI_Init(mpierr)
> >>call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
> >>print*, 'double precision size: ', size
> >>call MPI_Finalize(mpierr)
> >> end
> >>
> >> mpif90 -g types.f90
> >> mpiexec -n 1 ./a.out
> >>  double precision size:8
> >>
> >> Thus is this a bug in ompi_info only?
> > 
> > answering my own question:
> > This does not look right:
> > 
> > ompi/tools/ompi_info/param.cc:
> > 
> >   out("Fort dbl prec size",
> >   "compiler:fortran:sizeof:double_precision",
> >   OMPI_SIZEOF_FORTRAN_REAL);
> > 
> > that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.
> > 
> > - Martin
> 
> Hopefully Martin got it right and the issue is restricted to ompi_info.
> Thanks, Martin, for writing and running the little diagnostic code,
> and for checking the ompi_info guts!
> 
> Still, the alignment under Intel may or may not be right.
> And this may or may not explain the errors that Hugo has got.
> 
> FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
> report exactly the same as OpenMPI 1.4.2, namely
> Fort dbl prec size: 4  and
> Fort dbl prec align: 4,
> except that *if the Intel Fortran compiler (ifort) was used*
> I get 1 byte alignment:
> Fort dbl prec align: 1
> 
> So, this issue has been around for 

Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Gus Correa

Hi All

Martin Siegert wrote:

On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:

On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:

Hugo Gagnon wrote:

Hi Gus,
Ompi_info --all lists its info regarding fortran right after C. In my
case:
  Fort real size: 4
 Fort real4 size: 4
 Fort real8 size: 8
Fort real16 size: 16
  Fort dbl prec size: 4
Does it make any sense to you?

Hi Hugo

No, dbl prec size 4 sounds weird, should be 8, I suppose,
same as real8, right?

It doesn't make sense, but that's what I have (now that you told me
that "dbl", not "double", is the string to search for):

$  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

Is this a bug in OpenMPI perhaps?

I didn't come across this problem, most likely because
the codes here don't use "double precision" but real*8 or similar.

Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
Often times old versions and tangled PATH make things very confusing.

This is indeed worrisome as I confirm the findings on our clusters both
with ompi 1.3.3 and 1.4.1:

ompi_info --all | grep -i fort
...
  Fort real size: 4
 Fort real4 size: 4
 Fort real8 size: 8
Fort real16 size: -1
  Fort dbl prec size: 4
  Fort cplx size: 4
  Fort dbl cplx size: 4
 Fort cplx8 size: 8
Fort cplx16 size: 16
Fort cplx32 size: -1
  Fort integer align: 4
 Fort integer1 align: 1
 Fort integer2 align: 2
 Fort integer4 align: 4
 Fort integer8 align: 8
Fort integer16 align: -1
 Fort real align: 4
Fort real4 align: 4
Fort real8 align: 8
   Fort real16 align: -1
 Fort dbl prec align: 4
 Fort cplx align: 4  
 Fort dbl cplx align: 4  
Fort cplx8 align: 4  
   Fort cplx16 align: 8  
...


And this is the configure output:
checking if Fortran 77 compiler supports REAL*8... yes
checking size of Fortran 77 REAL*8... 8
checking for C type corresponding to REAL*8... double
checking alignment of Fortran REAL*8... 1
...
checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
checking size of Fortran 77 DOUBLE PRECISION... 8
checking for C type corresponding to DOUBLE PRECISION... double
checking alignment of Fortran DOUBLE PRECISION... 1

But the following code actually appears to give the correct results:

program types
use mpi
implicit none
integer :: mpierr, size

   call MPI_Init(mpierr)
   call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
   print*, 'double precision size: ', size
   call MPI_Finalize(mpierr)
end

mpif90 -g types.f90
mpiexec -n 1 ./a.out
 double precision size:8

Thus is this a bug in ompi_info only?


answering my own question:
This does not look right:

ompi/tools/ompi_info/param.cc:

  out("Fort dbl prec size",
  "compiler:fortran:sizeof:double_precision",
  OMPI_SIZEOF_FORTRAN_REAL);

that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.

- Martin


Hopefully Martin got it right and the issue is restricted to ompi_info.
Thanks, Martin, for writing and running the little diagnostic code,
and for checking the ompi_info guts!

Still, the alignment under Intel may or may not be right.
And this may or may not explain the errors that Hugo has got.

FYI, the ompi_info from my OpenMPI 1.3.2 and 1.2.8
report exactly the same as OpenMPI 1.4.2, namely
Fort dbl prec size: 4  and
Fort dbl prec align: 4,
except that *if the Intel Fortran compiler (ifort) was used*
I get 1 byte alignment:
Fort dbl prec align: 1

So, this issue has been around for a while,
and involves both the size and the alignment (in Intel)
of double precision.

We have a number of pieces of code here where grep shows 
MPI_DOUBLE_PRECISION.

Not sure how much of it has actually been active, as there are always
lots of cpp directives to select active code.

In particular I found this interesting snippet:

if (MPI_DOUBLE_PRECISION==20 .and. MPI_REAL8==18) then
   ! LAM MPI defined MPI_REAL8 differently from MPI_DOUBLE_PRECISION
   ! and LAM MPI's allreduce does not accept on MPI_REAL8
   MPIreal_t= MPI_DOUBLE_PRECISION
else
   MPIreal_t= MPI_REAL8
endif

where MPIreal_t is eventually used as
the MPI type in some MPI calls, particularly MPI_Allreduce,
the call that triggered this whole discussion
(see this thread's Subject line) when Hugo first
asked his original question.

Hopefully the if branch in the code snippet above worked all right,
because here in our OpenMPI 1.4.2, 1.3.2, and 1.2.8 builds
the MPI_DOUBLE_PRECISION value is 17,
which should have safely produced
MPIreal_t = MPI_REAL8

I have a lot more code to check, but maybe I won't need to.
If the issue is really restricted to ompi_info, that would be a
big relief.

Many 

Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Martin Siegert
On Wed, Jul 28, 2010 at 01:05:52PM -0700, Martin Siegert wrote:
> On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
> > Hugo Gagnon wrote:
> >> Hi Gus,
> >> Ompi_info --all lists its info regarding fortran right after C. In my
> >> case:
> >>   Fort real size: 4
> >>  Fort real4 size: 4
> >>  Fort real8 size: 8
> >> Fort real16 size: 16
> >>   Fort dbl prec size: 4
> >> Does it make any sense to you?
> >
> > Hi Hugo
> >
> > No, dbl prec size 4 sounds weird, should be 8, I suppose,
> > same as real8, right?
> >
> > It doesn't make sense, but that's what I have (now that you told me
> > that "dbl" , not "double", is the string to search for):
> >
> > $  Fort dbl prec size: 4
> >   Fort dbl cplx size: 4
> >  Fort dbl prec align: 4
> >  Fort dbl cplx align: 4
> >
> > Is this a bug in OpenMPI perhaps?
> >
> > I didn't come across this problem, most likely because
> > the codes here don't use "double precision" but real*8 or similar.
> >
> > Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
> > Often times old versions and tangled PATH make things very confusing.
> 
> This is indeed worrisome as I confirm the findings on our clusters both
> with ompi 1.3.3 and 1.4.1:
> 
> ompi_info --all | grep -i fort
> ...
>   Fort real size: 4
>  Fort real4 size: 4
>  Fort real8 size: 8
> Fort real16 size: -1
>   Fort dbl prec size: 4
>   Fort cplx size: 4
>   Fort dbl cplx size: 4
>  Fort cplx8 size: 8
> Fort cplx16 size: 16
> Fort cplx32 size: -1
>   Fort integer align: 4
>  Fort integer1 align: 1
>  Fort integer2 align: 2
>  Fort integer4 align: 4
>  Fort integer8 align: 8
> Fort integer16 align: -1
>  Fort real align: 4
> Fort real4 align: 4
> Fort real8 align: 8
>Fort real16 align: -1
>  Fort dbl prec align: 4
>  Fort cplx align: 4  
>  Fort dbl cplx align: 4  
> Fort cplx8 align: 4  
>Fort cplx16 align: 8  
> ...
> 
> And this is the configure output:
> checking if Fortran 77 compiler supports REAL*8... yes
> checking size of Fortran 77 REAL*8... 8
> checking for C type corresponding to REAL*8... double
> checking alignment of Fortran REAL*8... 1
> ...
> checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
> checking size of Fortran 77 DOUBLE PRECISION... 8
> checking for C type corresponding to DOUBLE PRECISION... double
> checking alignment of Fortran DOUBLE PRECISION... 1
> 
> But the following code actually appears to give the correct results:
> 
> program types
> use mpi
> implicit none
> integer :: mpierr, size
> 
>call MPI_Init(mpierr)
>call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
>print*, 'double precision size: ', size
>call MPI_Finalize(mpierr)
> end
> 
> mpif90 -g types.f90
> mpiexec -n 1 ./a.out
>  double precision size:8
> 
> Thus is this a bug in ompi_info only?

answering my own question:
This does not look right:

ompi/tools/ompi_info/param.cc:

  out("Fort dbl prec size",
  "compiler:fortran:sizeof:double_precision",
  OMPI_SIZEOF_FORTRAN_REAL);

that should be OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION.

- Martin


Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Martin Siegert
On Wed, Jul 28, 2010 at 11:19:43AM -0400, Gus Correa wrote:
> Hugo Gagnon wrote:
>> Hi Gus,
>> Ompi_info --all lists its info regarding fortran right after C. In my
>> case:
>>   Fort real size: 4
>>  Fort real4 size: 4
>>  Fort real8 size: 8
>> Fort real16 size: 16
>>   Fort dbl prec size: 4
>> Does it make any sense to you?
>
> Hi Hugo
>
> No, dbl prec size 4 sounds weird, should be 8, I suppose,
> same as real8, right?
>
> It doesn't make sense, but that's what I have (now that you told me
> that "dbl" , not "double", is the string to search for):
>
> $  Fort dbl prec size: 4
>   Fort dbl cplx size: 4
>  Fort dbl prec align: 4
>  Fort dbl cplx align: 4
>
> Is this a bug in OpenMPI perhaps?
>
> I didn't come across this problem, most likely because
> the codes here don't use "double precision" but real*8 or similar.
>
> Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
> Often times old versions and tangled PATH make things very confusing.

This is indeed worrisome as I confirm the findings on our clusters both
with ompi 1.3.3 and 1.4.1:

ompi_info --all | grep -i fort
...
  Fort real size: 4
 Fort real4 size: 4
 Fort real8 size: 8
Fort real16 size: -1
  Fort dbl prec size: 4
  Fort cplx size: 4
  Fort dbl cplx size: 4
 Fort cplx8 size: 8
Fort cplx16 size: 16
Fort cplx32 size: -1
  Fort integer align: 4
 Fort integer1 align: 1
 Fort integer2 align: 2
 Fort integer4 align: 4
 Fort integer8 align: 8
Fort integer16 align: -1
 Fort real align: 4
Fort real4 align: 4
Fort real8 align: 8
   Fort real16 align: -1
 Fort dbl prec align: 4
 Fort cplx align: 4  
 Fort dbl cplx align: 4  
Fort cplx8 align: 4  
   Fort cplx16 align: 8  
...

And this is the configure output:
checking if Fortran 77 compiler supports REAL*8... yes
checking size of Fortran 77 REAL*8... 8
checking for C type corresponding to REAL*8... double
checking alignment of Fortran REAL*8... 1
...
checking if Fortran 77 compiler supports DOUBLE PRECISION... yes
checking size of Fortran 77 DOUBLE PRECISION... 8
checking for C type corresponding to DOUBLE PRECISION... double
checking alignment of Fortran DOUBLE PRECISION... 1

But the following code actually appears to give the correct results:

program types
use mpi
implicit none
integer :: mpierr, size

   call MPI_Init(mpierr)
   call MPI_Type_size(MPI_DOUBLE_PRECISION, size, mpierr)
   print*, 'double precision size: ', size
   call MPI_Finalize(mpierr)
end

mpif90 -g types.f90
mpiexec -n 1 ./a.out
 double precision size:8

Thus is this a bug in ompi_info only?

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
IT Services                phone: 778 782-4691
Simon Fraser University    fax:   778 782-4242
Burnaby, British Columbia  email: sieg...@sfu.ca
Canada  V5A 1S6


Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Åke Sandgren
On Wed, 2010-07-28 at 11:48 -0400, Gus Correa wrote:
> Hi Hugo, Jeff, list
> 
> Hugo: I think David Zhang's suggestion was to use
> MPI_REAL8 not MPI_REAL, instead of MPI_DOUBLE_PRECISION in your
> MPI_Allreduce call.
> 
> Still, to me it looks like OpenMPI is making double precision 4 bytes
> long, which is shorter than I expected it to be (8 bytes),
> at least when looking at the output of ompi_info --all.
> 
> I always get a size of 4 for dbl prec on my x86_64 machine,
> from ompi_info --all.
> I confirmed this in six builds of OpenMPI 1.4.2:  gcc+gfortran,
> gcc+pgf90, gcc+ifort, icc+ifort, pgcc+pgf90, and opencc+openf95.
> Although the output of ompi_info never says this is actually the size
> of MPI_DOUBLE_PRECISION, just of "dbl prec", which is a bit ambiguous.
> 
> FWIW, I include the output below.  Note that alignment for gcc+ifort
> is 1, all others are 4.
> 
> Jeff:  Is this correct?

This is wrong, it should be 8, and alignment should be 8 even for Intel.
And I also see exactly the same thing.

-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se



Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Hugo Gagnon
Here they are.
-- 
  Hugo Gagnon


On Wed, 28 Jul 2010 12:01 -0400, "Jeff Squyres" 
wrote:
> On Jul 28, 2010, at 11:55 AM, Gus Correa wrote:
> 
> > I surely can send you the logs, but they're big.
> > Off the list perhaps?
> 
> If they're still big when compressed, sure, send them to me off list.
> 
> But I think I'd be more interested to see Hugo's logs.  :-)
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
-- 
  Hugo Gagnon



ompi-output.tar.bz2
Description: BZip2 compressed data


Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Gus Correa

Hi Jeff

I surely can send you the logs, but they're big.
Off the list perhaps?

Thanks,
Gus

Jeff Squyres wrote:

On Jul 28, 2010, at 11:19 AM, Gus Correa wrote:


Ompi_info --all lists its info regarding fortran right after C. In my


Ummm right... I should know that.  I wrote ompi_info, after all.  :-)  I ran 
"ompi_info -all | grep -i fortran" and didn't see the fortran info, and I 
forgot that I put that stuff in there.  Oops!  :-)


case:
  Fort real size: 4
 Fort real4 size: 4
 Fort real8 size: 8
Fort real16 size: 16
  Fort dbl prec size: 4


That does seem weird.


No, dbl prec size 4 sounds weird, should be 8, I suppose,
same as real8, right?

It doesn't make sense, but that's what I have (now that you told me
that "dbl", not "double", is the string to search for):

$  Fort dbl prec size: 4
   Fort dbl cplx size: 4
  Fort dbl prec align: 4
  Fort dbl cplx align: 4

Is this a bug in OpenMPI perhaps?


Getting sizeof and alignment of Fortran variable types is very problematic.  
Can you send the stdout/stderr of configure and the config.log?

http://www.open-mpi.org/community/help/





Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Gus Correa

Hi Hugo, Jeff, list

Hugo: I think David Zhang's suggestion was to use
MPI_REAL8 not MPI_REAL, instead of MPI_DOUBLE_PRECISION in your
MPI_Allreduce call.

Still, to me it looks like OpenMPI is making double precision 4 bytes
long, which is shorter than I expected it to be (8 bytes),

at least when looking at the output of ompi_info --all.

I always get a size of 4 for dbl prec on my x86_64 machine,
from ompi_info --all.
I confirmed this in six builds of OpenMPI 1.4.2:  gcc+gfortran,
gcc+pgf90, gcc+ifort, icc+ifort, pgcc+pgf90, and opencc+openf95.
Although the output of ompi_info never says this is actually the size
of MPI_DOUBLE_PRECISION, just of "dbl prec", which is a bit ambiguous.

FWIW, I include the output below.  Note that alignment for gcc+ifort
is 1, all others are 4.

Jeff:  Is this correct?

Thanks,
Gus Correa


$ openmpi/1.4.2/open64-4.2.3-0/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4
$ openmpi/1.4.2/gnu-4.1.2/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4
$ openmpi/1.4.2/gnu-4.1.2-intel-10.1.017/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 1
 Fort dbl cplx align: 1
$ openmpi/1.4.2/gnu-4.1.2-pgi-8.0-4/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4
$ openmpi/1.4.2/pgi-8.0-4/bin/ompi_info --all | grep -i dbl
  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

Hugo Gagnon wrote:

And how do I know how big my data buffer is? I ran MPI_TYPE_EXTENT of
MPI_DOUBLE_PRECISION and the result was 8. So I changed my program to:

  1 program test
  2 
  3 use mpi
  4 
  5 implicit none
  6 
  7 integer :: ierr, nproc, myrank

  8 !integer, parameter :: dp = kind(1.d0)
  9 real(kind=8) :: inside(5), outside(5)
 10 
 11 call mpi_init(ierr)

 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
 14 
 15 inside = (/ 1., 2., 3., 4., 5. /)

 16 call mpi_allreduce(inside, outside, 5, mpi_real, mpi_sum,
 mpi_comm_world, ierr)
 17 
 18 if (myrank == 0) then

 19 print*, outside
 20 end if
 21 
 22 call mpi_finalize(ierr)
 23 
 24 end program test


but I still get a SIGSEGV fault:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine    Line     Source
libmpi.0.dylib     0001001BB4B7      Unknown    Unknown  Unknown
libmpi_f77.0.dyli  0001000AF046      Unknown    Unknown  Unknown
a.out              00010D87          _MAIN__    16       test.f90
a.out              00010C9C          Unknown    Unknown  Unknown
a.out              00010C34          Unknown    Unknown  Unknown

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine    Line     Source
libmpi.0.dylib     0001001BB4B7      Unknown    Unknown  Unknown
libmpi_f77.0.dyli  0001000AF046      Unknown    Unknown  Unknown
a.out              00010D87          _MAIN__    16       test.f90
a.out              00010C9C          Unknown    Unknown  Unknown
a.out              00010C34          Unknown    Unknown  Unknown


What is wrong now?




Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Gus Correa

Hugo Gagnon wrote:

Hi Gus,
Ompi_info --all lists its info regarding fortran right after C. In my
case:
  Fort real size: 4
 Fort real4 size: 4
 Fort real8 size: 8
Fort real16 size: 16
  Fort dbl prec size: 4
Does it make any sense to you?


Hi Hugo

No, dbl prec size 4 sounds weird, should be 8, I suppose,
same as real8, right?

It doesn't make sense, but that's what I have (now that you told me
that "dbl", not "double", is the string to search for):

$  Fort dbl prec size: 4
  Fort dbl cplx size: 4
 Fort dbl prec align: 4
 Fort dbl cplx align: 4

Is this a bug in OpenMPI perhaps?

I didn't come across this problem, most likely because
the codes here don't use "double precision" but real*8 or similar.

Also make sure you are picking the right ompi_info, mpif90/f77, mpiexec.
Often times old versions and tangled PATH make things very confusing.

Gus Correa


Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Hugo Gagnon
I meant to write:
call mpi_allreduce(inside, outside, 5,mpi_real, mpi_double_precision,
mpi_comm_world, ierr)
-- 
  Hugo Gagnon


On Wed, 28 Jul 2010 09:33 -0400, "Hugo Gagnon"
 wrote:
> And how do I know how big my data buffer is? I ran MPI_TYPE_EXTENT of
> MPI_DOUBLE_PRECISION and the result was 8. So I changed my program to:
> 
>   1 program test
>   2 
>   3 use mpi
>   4 
>   5 implicit none
>   6 
>   7 integer :: ierr, nproc, myrank
>   8 !integer, parameter :: dp = kind(1.d0)
>   9 real(kind=8) :: inside(5), outside(5)
>  10 
>  11 call mpi_init(ierr)
>  12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
>  13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
>  14 
>  15 inside = (/ 1., 2., 3., 4., 5. /)
>  16 call mpi_allreduce(inside, outside, 5, mpi_real, mpi_sum,
>  mpi_comm_world, ierr)
>  17 
>  18 if (myrank == 0) then
>  19 print*, outside
>  20 end if
>  21 
>  22 call mpi_finalize(ierr)
>  23 
>  24 end program test
> 
> but I still get a SIGSEGV fault:
> 
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine    Line     Source
> libmpi.0.dylib     0001001BB4B7      Unknown    Unknown  Unknown
> libmpi_f77.0.dyli  0001000AF046      Unknown    Unknown  Unknown
> a.out              00010D87          _MAIN__    16       test.f90
> a.out              00010C9C          Unknown    Unknown  Unknown
> a.out              00010C34          Unknown    Unknown  Unknown
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine    Line     Source
> libmpi.0.dylib     0001001BB4B7      Unknown    Unknown  Unknown
> libmpi_f77.0.dyli  0001000AF046      Unknown    Unknown  Unknown
> a.out              00010D87          _MAIN__    16       test.f90
> a.out              00010C9C          Unknown    Unknown  Unknown
> a.out              00010C34          Unknown    Unknown  Unknown
> 
> What is wrong now?
> -- 
>   Hugo Gagnon
> 
> 
> On Wed, 28 Jul 2010 07:56 -0400, "Jeff Squyres" 
> wrote:
> > On Jul 27, 2010, at 4:19 PM, Gus Correa wrote:
> > 
> > > Is there a simple way to check the number of bytes associated to each
> > > MPI basic type of OpenMPI on a specific machine (or machine+compiler)?
> > > 
> > > Something that would come out easily, say, from ompi_info?
> > 
> > Not via ompi_info, but the MPI function MPI_GET_EXTENT will tell you the
> > datatype's size.
> > 
> > -
> > [4:54] svbu-mpi:~/mpi % cat extent.f90
> >   program main
> >   use mpi
> >   implicit none
> >   integer ierr, ext
> >   
> >   call MPI_INIT(ierr)
> >   call MPI_TYPE_EXTENT(MPI_DOUBLE_PRECISION, ext, ierr)
> >   print *, 'Type extent of DOUBLE_PREC is', ext
> >   call MPI_FINALIZE(ierr)
> >   
> >   end
> > [4:54] svbu-mpi:~/mpi % mpif90 extent.f90 -o extent -g
> > [4:54] svbu-mpi:~/mpi % ./extent
> >  Type extent of DOUBLE_PREC is   8
> > [4:54] svbu-mpi:~/mpi % 
> > -
> > 
> > -- 
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> > 
> > 
> > 
> -- 
>   Hugo Gagnon
> 
> 
-- 
  Hugo Gagnon



Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Hugo Gagnon
I installed with:

./configure --prefix=/opt/openmpi CC=icc CXX=icpc F77=ifort FC=ifort
make all install

I would gladly give you a corefile but I have no idea how to produce one;
I'm just an end user...
-- 
  Hugo Gagnon


On Wed, 28 Jul 2010 08:57 -0400, "Jeff Squyres" 
wrote:
> I don't have the intel compilers on my Mac, but I'm unable to replicate
> this issue on Linux with the intel compilers v11.0.
> 
> Can you get a corefile to see a backtrace where it died in Open MPI's
> allreduce?
> 
> How exactly did you configure your Open MPI, and how exactly did you
> compile / run your sample application?
> 
> 
> On Jul 27, 2010, at 10:35 PM, Hugo Gagnon wrote:
> 
> > I did and it runs now, but the result is wrong: outside is still 1.d0,
> > 2.d0, 3.d0, 4.d0, 5.d0
> > How can I make sure to compile OpenMPI so that datatypes such as
> > mpi_double_precision behave as they "should"?
> > Are there flags during the OpenMPI building process or something?
> > Thanks,
> > --
> >   Hugo Gagnon
> > 
> > 
> > On Tue, 27 Jul 2010 09:06 -0700, "David Zhang" 
> > wrote:
> > > Try mpi_real8 for the type in allreduce
> > >
> > > On 7/26/10, Hugo Gagnon  wrote:
> > > > Hello,
> > > >
> > > > When I compile and run this code snippet:
> > > >
> > > >   1 program test
> > > >   2
> > > >   3 use mpi
> > > >   4
> > > >   5 implicit none
> > > >   6
> > > >   7 integer :: ierr, nproc, myrank
> > > >   8 integer, parameter :: dp = kind(1.d0)
> > > >   9 real(kind=dp) :: inside(5), outside(5)
> > > >  10
> > > >  11 call mpi_init(ierr)
> > > >  12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> > > >  13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> > > >  14
> > > >  15 inside = (/ 1, 2, 3, 4, 5 /)
> > > >  16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> > > >  mpi_sum, mpi_comm_world, ierr)
> > > >  17
> > > >  18 print*, myrank, inside
> > > >  19 print*, outside
> > > >  20
> > > >  21 call mpi_finalize(ierr)
> > > >  22
> > > >  23 end program test
> > > >
> > > > I get the following error, with say 2 processors:
> > > >
> > > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > > Image  PCRoutineLine
> > > > Source
> > > > libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> > > > Unknown
> > > > libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> > > > Unknown
> > > > a.out  00010CE2  _MAIN__16
> > > > test.f90
> > > > a.out  00010BDC  Unknown   Unknown
> > > > Unknown
> > > > a.out  00010B74  Unknown   Unknown
> > > > Unknown
> > > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > > Image  PCRoutineLine
> > > > Source
> > > > libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> > > > Unknown
> > > > libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> > > > Unknown
> > > > a.out  00010CE2  _MAIN__16
> > > > test.f90
> > > > a.out  00010BDC  Unknown   Unknown
> > > > Unknown
> > > > a.out  00010B74  Unknown   Unknown
> > > > Unknown
> > > >
> > > > on my iMac having compiled OpenMPI with ifort according to:
> > > > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> > > >
> > > > Note that the above code snippet runs fine on my school parallel cluster
> > > > where ifort+intelmpi is installed.
> > > > Is there something special about OpenMPI's MPI_Allreduce function call
> > > > that I should be aware of?
> > > >
> > > > Thanks,
> > > > --
> > > >   Hugo Gagnon
> > > >
> > > >
> > >
> > > --
> > > Sent from my mobile device
> > >
> > > David Zhang
> > > University of California, San Diego
> > >
> > --
> >   Hugo Gagnon
> > 
> > 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> 
-- 
  Hugo Gagnon



Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Hugo Gagnon
And how do I know how big my data buffer is? I ran MPI_TYPE_EXTENT of
MPI_DOUBLE_PRECISION and the result was 8. So I changed my program to:

  1 program test
  2 
  3 use mpi
  4 
  5 implicit none
  6 
  7 integer :: ierr, nproc, myrank
  8 !integer, parameter :: dp = kind(1.d0)
  9 real(kind=8) :: inside(5), outside(5)
 10 
 11 call mpi_init(ierr)
 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
 14 
 15 inside = (/ 1., 2., 3., 4., 5. /)
 16 call mpi_allreduce(inside, outside, 5, mpi_real, mpi_sum,
 mpi_comm_world, ierr)
 17 
 18 if (myrank == 0) then
 19 print*, outside
 20 end if
 21 
 22 call mpi_finalize(ierr)
 23 
 24 end program test

but I still get a SIGSEGV fault:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image  PCRoutineLine   
Source 
libmpi.0.dylib 0001001BB4B7  Unknown   Unknown 
Unknown
libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown 
Unknown
a.out  00010D87  _MAIN__16 
test.f90
a.out  00010C9C  Unknown   Unknown 
Unknown
a.out  00010C34  Unknown   Unknown 
Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image  PCRoutineLine   
Source 
libmpi.0.dylib 0001001BB4B7  Unknown   Unknown 
Unknown
libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown 
Unknown
a.out  00010D87  _MAIN__16 
test.f90
a.out  00010C9C  Unknown   Unknown 
Unknown
a.out  00010C34  Unknown   Unknown 
Unknown

What is wrong now?
-- 
  Hugo Gagnon


On Wed, 28 Jul 2010 07:56 -0400, "Jeff Squyres" 
wrote:
> On Jul 27, 2010, at 4:19 PM, Gus Correa wrote:
> 
> > Is there a simple way to check the number of bytes associated to each
> > MPI basic type of OpenMPI on a specific machine (or machine+compiler)?
> > 
> > Something that would come out easily, say, from ompi_info?
> 
> Not via ompi_info, but the MPI function MPI_GET_EXTENT will tell you the
> datatype's size.
> 
> -
> [4:54] svbu-mpi:~/mpi % cat extent.f90
>   program main
>   use mpi
>   implicit none
>   integer ierr, ext
>   
>   call MPI_INIT(ierr)
>   call MPI_TYPE_EXTENT(MPI_DOUBLE_PRECISION, ext, ierr)
>   print *, 'Type extent of DOUBLE_PREC is', ext
>   call MPI_FINALIZE(ierr)
>   
>   end
> [4:54] svbu-mpi:~/mpi % mpif90 extent.f90 -o extent -g
> [4:54] svbu-mpi:~/mpi % ./extent
>  Type extent of DOUBLE_PREC is   8
> [4:54] svbu-mpi:~/mpi % 
> -
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> 
-- 
  Hugo Gagnon



Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Jeff Squyres
I don't have the intel compilers on my Mac, but I'm unable to replicate this 
issue on Linux with the intel compilers v11.0.

Can you get a corefile to see a backtrace where it died in Open MPI's allreduce?

How exactly did you configure your Open MPI, and how exactly did you compile / 
run your sample application?


On Jul 27, 2010, at 10:35 PM, Hugo Gagnon wrote:

> I did and it runs now, but the result is wrong: outside is still 1.d0,
> 2.d0, 3.d0, 4.d0, 5.d0
> How can I make sure to compile OpenMPI so that datatypes such as
> mpi_double_precision behave as they "should"?
> Are there flags during the OpenMPI building process or something?
> Thanks,
> --
>   Hugo Gagnon
> 
> 
> On Tue, 27 Jul 2010 09:06 -0700, "David Zhang" 
> wrote:
> > Try mpi_real8 for the type in allreduce
> >
> > On 7/26/10, Hugo Gagnon  wrote:
> > > Hello,
> > >
> > > When I compile and run this code snippet:
> > >
> > >   1 program test
> > >   2
> > >   3 use mpi
> > >   4
> > >   5 implicit none
> > >   6
> > >   7 integer :: ierr, nproc, myrank
> > >   8 integer, parameter :: dp = kind(1.d0)
> > >   9 real(kind=dp) :: inside(5), outside(5)
> > >  10
> > >  11 call mpi_init(ierr)
> > >  12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> > >  13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> > >  14
> > >  15 inside = (/ 1, 2, 3, 4, 5 /)
> > >  16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> > >  mpi_sum, mpi_comm_world, ierr)
> > >  17
> > >  18 print*, myrank, inside
> > >  19 print*, outside
> > >  20
> > >  21 call mpi_finalize(ierr)
> > >  22
> > >  23 end program test
> > >
> > > I get the following error, with say 2 processors:
> > >
> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > Image  PCRoutineLine
> > > Source
> > > libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> > > Unknown
> > > libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> > > Unknown
> > > a.out  00010CE2  _MAIN__16
> > > test.f90
> > > a.out  00010BDC  Unknown   Unknown
> > > Unknown
> > > a.out  00010B74  Unknown   Unknown
> > > Unknown
> > > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > > Image  PCRoutineLine
> > > Source
> > > libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> > > Unknown
> > > libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> > > Unknown
> > > a.out  00010CE2  _MAIN__16
> > > test.f90
> > > a.out  00010BDC  Unknown   Unknown
> > > Unknown
> > > a.out  00010B74  Unknown   Unknown
> > > Unknown
> > >
> > > on my iMac having compiled OpenMPI with ifort according to:
> > > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> > >
> > > Note that the above code snippet runs fine on my school parallel cluster
> > > where ifort+intelmpi is installed.
> > > Is there something special about OpenMPI's MPI_Allreduce function call
> > > that I should be aware of?
> > >
> > > Thanks,
> > > --
> > >   Hugo Gagnon
> > >
> > >
> >
> > --
> > Sent from my mobile device
> >
> > David Zhang
> > University of California, San Diego
> >
> --
>   Hugo Gagnon
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Jeff Squyres
On Jul 27, 2010, at 11:21 AM, Hugo Gagnon wrote:

> I appreciate your replies but my question has to do with the function
> MPI_Allreduce of OpenMPI built on a Mac OSX 10.6 with ifort (intel
> fortran compiler).

The implication I was going for was that if you were using MPI_DOUBLE_PRECISION 
with a data buffer that wasn't actually double precision, Bad Things would 
happen inside the allreduce because OMPI would likely read/write beyond the end 
of your buffer.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Terry Frankcombe
On Tue, 2010-07-27 at 16:19 -0400, Gus Correa wrote:
> Hi Hugo, David, Jeff, Terry, Anton, list
> 
> I suppose maybe we're guessing that somehow on Hugo's iMac 
> MPI_DOUBLE_PRECISION may not have as many bytes as dp = kind(1.d0),
> hence the segmentation fault on MPI_Allreduce.
> 
> Question:
> 
> Is there a simple way to check the number of bytes associated to each
> MPI basic type of OpenMPI on a specific machine (or machine+compiler)?
> 
> Something that would come out easily, say, from ompi_info?

bit_size() will give you the integer size.  For reals, digits() will
give you a hint, but the Fortran real data model is designed to not tie
you to a particular representation (my interpretation), so there's no
language feature to give a simple word size.




Re: [OMPI users] MPI_Allreduce on local machine

2010-07-27 Thread Hugo Gagnon
I did and it runs now, but the result is wrong: outside is still 1.d0,
2.d0, 3.d0, 4.d0, 5.d0
How can I make sure to compile OpenMPI so that datatypes such as
mpi_double_precision behave as they "should"?
Are there flags during the OpenMPI building process or something?
Thanks,
-- 
  Hugo Gagnon


On Tue, 27 Jul 2010 09:06 -0700, "David Zhang" 
wrote:
> Try mpi_real8 for the type in allreduce
> 
> On 7/26/10, Hugo Gagnon  wrote:
> > Hello,
> >
> > When I compile and run this code snippet:
> >
> >   1 program test
> >   2
> >   3 use mpi
> >   4
> >   5 implicit none
> >   6
> >   7 integer :: ierr, nproc, myrank
> >   8 integer, parameter :: dp = kind(1.d0)
> >   9 real(kind=dp) :: inside(5), outside(5)
> >  10
> >  11 call mpi_init(ierr)
> >  12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
> >  13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
> >  14
> >  15 inside = (/ 1, 2, 3, 4, 5 /)
> >  16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
> >  mpi_sum, mpi_comm_world, ierr)
> >  17
> >  18 print*, myrank, inside
> >  19 print*, outside
> >  20
> >  21 call mpi_finalize(ierr)
> >  22
> >  23 end program test
> >
> > I get the following error, with say 2 processors:
> >
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image  PCRoutineLine
> > Source
> > libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> > Unknown
> > libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> > Unknown
> > a.out  00010CE2  _MAIN__16
> > test.f90
> > a.out  00010BDC  Unknown   Unknown
> > Unknown
> > a.out  00010B74  Unknown   Unknown
> > Unknown
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image  PCRoutineLine
> > Source
> > libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> > Unknown
> > libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> > Unknown
> > a.out  00010CE2  _MAIN__16
> > test.f90
> > a.out  00010BDC  Unknown   Unknown
> > Unknown
> > a.out  00010B74  Unknown   Unknown
> > Unknown
> >
> > on my iMac having compiled OpenMPI with ifort according to:
> > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> >
> > Note that the above code snippet runs fine on my school parallel cluster
> > where ifort+intelmpi is installed.
> > Is there something special about OpenMPI's MPI_Allreduce function call
> > that I should be aware of?
> >
> > Thanks,
> > --
> >   Hugo Gagnon
> >
> >
> 
> -- 
> Sent from my mobile device
> 
> David Zhang
> University of California, San Diego
> 
-- 
  Hugo Gagnon



Re: [OMPI users] MPI_Allreduce on local machine

2010-07-27 Thread Gus Correa

Hi Hugo, David, Jeff, Terry, Anton, list

I suppose maybe we're guessing that somehow on Hugo's iMac 
MPI_DOUBLE_PRECISION may not have as many bytes as dp = kind(1.d0),

hence the segmentation fault on MPI_Allreduce.

Question:

Is there a simple way to check the number of bytes associated to each
MPI basic type of OpenMPI on a specific machine (or machine+compiler)?

Something that would come out easily, say, from ompi_info?

The information I get is C-centered:  :(

$ ompi_info --all |grep -i double
   C double size: 8
  C double align: 8

If not possible yet, please consider it a feature request ...  :)
(Or is this perhaps against the "opacity" in the MPI standard?)

I poked around on the OpenMPI include directory to no avail.
MPI_DOUBLE_PRECISION is defined as a constant (it is 17 here)
which doesn't seem to be related to the actual size in bytes.

I found some stuff on my OpenMPI config.log, though:

$ grep -i double_precision config.log
... (tons of lines)
ompi_cv_f77_alignment_DOUBLE_PRECISION=8
ompi_cv_f77_have_DOUBLE_PRECISION=yes
ompi_cv_f77_sizeof_DOUBLE_PRECISION=8
ompi_cv_f90_have_DOUBLE_PRECISION=yes
ompi_cv_f90_sizeof_DOUBLE_PRECISION=8
ompi_cv_find_type_DOUBLE_PRECISION=double
OMPI_SIZEOF_F90_DOUBLE_PRECISION='8'
#define OMPI_HAVE_FORTRAN_DOUBLE_PRECISION 1
#define OMPI_SIZEOF_FORTRAN_DOUBLE_PRECISION 8
#define OMPI_ALIGNMENT_FORTRAN_DOUBLE_PRECISION 8
#define ompi_fortran_double_precision_t double
#define OMPI_HAVE_F90_DOUBLE_PRECISION 1

Thank you,
Gus Correa



David Zhang wrote:

Try mpi_real8 for the type in allreduce

On 7/26/10, Hugo Gagnon  wrote:

Hello,

When I compile and run this code snippet:

  1 program test
  2
  3 use mpi
  4
  5 implicit none
  6
  7 integer :: ierr, nproc, myrank
  8 integer, parameter :: dp = kind(1.d0)
  9 real(kind=dp) :: inside(5), outside(5)
 10
 11 call mpi_init(ierr)
 12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
 13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
 14
 15 inside = (/ 1, 2, 3, 4, 5 /)
 16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
 mpi_sum, mpi_comm_world, ierr)
 17
 18 print*, myrank, inside
 19 print*, outside
 20
 21 call mpi_finalize(ierr)
 22
 23 end program test

I get the following error, with say 2 processors:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image  PCRoutineLine
Source
libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
Unknown
libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
Unknown
a.out  00010CE2  _MAIN__16
test.f90
a.out  00010BDC  Unknown   Unknown
Unknown
a.out  00010B74  Unknown   Unknown
Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image  PCRoutineLine
Source
libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
Unknown
libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
Unknown
a.out  00010CE2  _MAIN__16
test.f90
a.out  00010BDC  Unknown   Unknown
Unknown
a.out  00010B74  Unknown   Unknown
Unknown

on my iMac having compiled OpenMPI with ifort according to:
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/

Note that the above code snippet runs fine on my school parallel cluster
where ifort+intelmpi is installed.
Is there something special about OpenMPI's MPI_Allreduce function call
that I should be aware of?

Thanks,
--
  Hugo Gagnon








Re: [OMPI users] MPI_Allreduce on local machine

2010-07-27 Thread David Zhang
Try mpi_real8 for the type in allreduce

On 7/26/10, Hugo Gagnon  wrote:
> Hello,
>
> When I compile and run this code snippet:
>
>   1 program test
>   2
>   3 use mpi
>   4
>   5 implicit none
>   6
>   7 integer :: ierr, nproc, myrank
>   8 integer, parameter :: dp = kind(1.d0)
>   9 real(kind=dp) :: inside(5), outside(5)
>  10
>  11 call mpi_init(ierr)
>  12 call mpi_comm_size(mpi_comm_world, nproc, ierr)
>  13 call mpi_comm_rank(mpi_comm_world, myrank, ierr)
>  14
>  15 inside = (/ 1, 2, 3, 4, 5 /)
>  16 call mpi_allreduce(inside, outside, 5, mpi_double_precision,
>  mpi_sum, mpi_comm_world, ierr)
>  17
>  18 print*, myrank, inside
>  19 print*, outside
>  20
>  21 call mpi_finalize(ierr)
>  22
>  23 end program test
>
> I get the following error, with say 2 processors:
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image  PCRoutineLine
> Source
> libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> Unknown
> libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> Unknown
> a.out  00010CE2  _MAIN__16
> test.f90
> a.out  00010BDC  Unknown   Unknown
> Unknown
> a.out  00010B74  Unknown   Unknown
> Unknown
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image  PCRoutineLine
> Source
> libmpi.0.dylib 0001001BB4B7  Unknown   Unknown
> Unknown
> libmpi_f77.0.dyli  0001000AF046  Unknown   Unknown
> Unknown
> a.out  00010CE2  _MAIN__16
> test.f90
> a.out  00010BDC  Unknown   Unknown
> Unknown
> a.out  00010B74  Unknown   Unknown
> Unknown
>
> on my iMac having compiled OpenMPI with ifort according to:
> http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
>
> Note that the above code snippet runs fine on my school parallel cluster
> where ifort+intelmpi is installed.
> Is there something special about OpenMPI's MPI_Allreduce function call
> that I should be aware of?
>
> Thanks,
> --
>   Hugo Gagnon
>
>

-- 
Sent from my mobile device

David Zhang
University of California, San Diego


Re: [OMPI users] MPI_Allreduce on local machine

2010-07-27 Thread Anton Shterenlikht
On Tue, Jul 27, 2010 at 08:11:39AM -0400, Jeff Squyres wrote:
> On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
> 
> >   8 integer, parameter :: dp = kind(1.d0)
> >   9 real(kind=dp) :: inside(5), outside(5)
> 
> I'm not a fortran expert -- is kind(1.d0) really double precision?  According 
> to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.html, kind(2) is 
> double precision (but that's for a different compiler, and I don't quite grok 
> the ".d0" notation).

*quote*
kind (x) has type default integer and value equal to the kind
type parameter value of x.
*end quote*

p. 161 from Metcalf et al (2007) Fortran 95/2003 explained.

% cat tmp.f90
program z

integer, parameter :: dp = kind(1.d0)
real(kind=dp) :: inside(5), outside(5)

write(*,*)dp

end program z
% g95 -L/usr/local/lib tmp.f90
% ./a.out
 8
% 

Kind 8 is (on most arch) 8-byte real, i.e. typically
double precision.

-- 
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423


Re: [OMPI users] MPI_Allreduce on local machine

2010-07-27 Thread Terry Frankcombe
On Tue, 2010-07-27 at 08:11 -0400, Jeff Squyres wrote:
> On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
> 
> >   8 integer, parameter :: dp = kind(1.d0)
> >   9 real(kind=dp) :: inside(5), outside(5)
> 
> I'm not a fortran expert -- is kind(1.d0) really double precision?  According 
> to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.html, kind(2) is 
> double precision (but that's for a different compiler, and I don't quite grok 
> the ".d0" notation).
> 

Urgh!  Thank heavens gcc have moved away from that stupid idea.

kind=8 is normally double precision (and is with gfortran).  kind(1.0d0)
is always double precision.

The d (as opposed to e) means DP.