[OMPI users] Fortran MPI module and gfortran

2014-03-30 Thread W Spector

Hi,

The mpi.mod file that is created from both the openmpi-1.7.4 and 
openmpi-1.8rc1 tarballs does not seem to be generating interface blocks 
for the Fortran API - whether the calls use choice buffers or not.


I initially tried the default gfortran on my system - 4.7.2.  The 
configure commands are:


export CC=gcc
export CXX=g++
export FC=gfortran
export F90=gfortran
./configure --prefix=/home/wws/openmpi_gfortran  \
--enable-mpi-fortran --enable-mpi-thread-multiple \
--enable-mpirun-prefix-by-default  \
2>&1 | tee config.gfortran.out

The relevant configure output reads:


checking if building Fortran mpif.h bindings... yes
checking for Fortran compiler module include flag... -I
checking Fortran compiler ignore TKR syntax... not cached; checking variants
checking for Fortran compiler support of TYPE(*), DIMENSION(*)... no
checking for Fortran compiler support of !DEC$ ATTRIBUTES NO_ARG_CHECK... no
checking for Fortran compiler support of !$PRAGMA IGNORE_TKR... no
checking for Fortran compiler support of !DIR$ IGNORE_TKR... no
checking for Fortran compiler support of !IBM* IGNORE_TKR... no
checking Fortran compiler ignore TKR syntax... 0:real:!
checking if building Fortran 'use mpi' bindings... yes
checking if building Fortran 'use mpi_f08' bindings... no


I have also tried a gfortran 4.9 that I built from a March 18th, 2014 
snapshot of the gcc trunk.  This compiler supports some of the TS 29113 
features.  (I selected it by putting the 4.9 compilers first in PATH, and 
by pointing the F90 and FC environment variables at the 4.9 gfortran.)


make clean
export PATH=/usr/local/gcc-trunk/bin:$PATH
export CC=gcc
export CXX=g++
export FC=/usr/local/gcc-trunk/bin/gfortran
export F90=/usr/local/gcc-trunk/bin/gfortran
./configure --prefix=/home/wws/openmpi_gfortran49  \
--enable-mpi-fortran --enable-mpi-thread-multiple \
--enable-mpirun-prefix-by-default  \
2>&1 | tee config.gfortran49.out

The configure output is identical to that from the 4.7 compiler.  Note 
that it did NOT recognize that gfortran now supports the !GCC$ ATTRIBUTES 
NO_ARG_CHECK directive, nor did it recognize that gfortran also accepts 
'TYPE(*), DIMENSION(*)'.
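
One way to check what the compiler itself accepts, independently of configure, is to compile a tiny probe by hand. The sketch below is not Open MPI's actual configure test; the function name probe_assumed_type and the probe source are made up for illustration. It only reports whether the given compiler command exits successfully on a subroutine with a TYPE(*), DIMENSION(*) dummy argument.

```shell
# Hypothetical helper (not part of Open MPI): print "yes" if the given
# Fortran compiler accepts an assumed-type, assumed-size dummy argument.
probe_assumed_type() {
  fc=${1:-gfortran}                       # compiler command to test
  dir=$(mktemp -d) || return 2            # scratch directory for the probe
  cat > "$dir/probe.f90" <<'EOF'
subroutine probe(buf)
  type(*), dimension(*) :: buf
end subroutine probe
EOF
  # Compile only; a zero exit status means the syntax was accepted.
  if "$fc" -c "$dir/probe.f90" -o "$dir/probe.o" >/dev/null 2>&1; then
    echo yes
  else
    echo no
  fi
  rm -rf "$dir"
}
```

For example, probe_assumed_type /usr/local/gcc-trunk/bin/gfortran would show whether the trunk compiler accepts the declaration even though configure reported "no".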


I have also verified with strace that the proper mpi.mod file is being 
accessed when I am trying to USE the mpi module.


I have not dug into the openmpi code yet.  Just wondering if this is a 
known problem before I start?  Or did I do something wrong during configure?


Walter Spector


Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-05-15 Thread W Spector

Hi Jeff and the list,

A year ago, we had the discussion appended below.  I just downloaded 
v1.8.1 and the F90 module is still very broken.  And once again I am 
having to modify my local version.  (+1 for open source!)  Will it be 
fixed in v1.8.2?


Configure is using the "use-mpi-tkr" version on my system.  I can see 
that the "use-mpi-f08" version is much better.


Walter

On 04/26/2013 03:14 PM, Jeff Squyres (jsquyres) wrote:

I committed that part; thanks.

On Apr 26, 2013, at 5:51 PM, W Spector  wrote:


Hi Jeff,

To take care of the ierr->ierror conversion, simply do the following:

  cd openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts
  ls -1 *.sh | xargs -i -t ex -c ":1,\$s?ierr?ierror?" -c ":wq" {}

Then go up a level to openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr and use:

  cd ..
  ls -1 fort*.in | xargs -i -t ex -c ":1,\$s?ierr?ierror?" -c ":wq" {}

Last, the use-mpi-ignore-tkr directory:

  cd ../use-mpi-ignore-tkr
  ls -1 mpi*.in | xargs -i -t ex -c ":1,\$s?ierr?ierror?" -c ":wq" {}

As you can tell from the below, I needed to use a few MPI_Type calls, so I 
fixed the few that I needed in the openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts 
directory.  I didn't exhaustively go through and verify every interface in the 
whole MPI library.

Walter

On 04/26/2013 11:53 AM, Jeff Squyres (jsquyres) wrote:

On Apr 25, 2013, at 10:52 PM, W Spector  wrote:
...

I went into the openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts directory 
and modified the files to use ierror instead of ierr.  (One well-crafted line 
of shell script.)  Did the same with a couple of .h.in files in the use-mpi-tkr 
and use-mpi-ignore-tkr directories, and 
use-mpi-tkr/attr_fn-f90-interfaces.h.in.  (One editor command each.)

With the above, the mpi module is in much better shape.  However there are 
still some scattered incorrect non-ierror argument names.  A few examples from 
the code I am working with:

  MPI_Type_create_struct: The 2nd argument should be "array_of_blocklengths", instead of 
"array_of_block_lengths"

  MPI_Type_commit: "datatype" instead of "type"

  MPI_Type_free: Again, "datatype" instead of "type"

There are more...


Cool.  Any chance you could send us a patch?


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





[OMPI users] False positive from valgrind in sec_basic.c

2014-05-21 Thread W Spector

Hi,

When running under valgrind, I get warnings from each MPI process at 
MPI_Init time.  The warnings come from the file sec_basic.c at lines 70 
and 71 (openmpi v1.8.1):


my_cred.credential = strdup("12345");
my_cred.size = strlen(my_cred.credential)+1;  // include the NULL


This is because strdup(3c) and strlen(3c) are apparently optimized to 
use 4-byte integer loads to speed up the copy and search operations, and 
so read past the end of the 6-byte malloced area.  (In fact, since malloc 
tends to pad allocations, it is safe.  But valgrind doesn't know that.)


Since the "12345" appears to be a dummy string, would it be OK to add a 
couple of characters to the string in the strdup call, like this:


my_cred.credential = strdup("1234567");

This gives an 8-byte string (counting the terminating NUL) and quiets valgrind down.

Walter


[OMPI users] Valgrind reports lots of memory leakage

2014-05-30 Thread W Spector

Hi,

I have been doing a lot of testing/fixing lately on our code, using 
valgrind to find problems.  Unfortunately, OpenMPI causes a lot of 
'false positives' in our testing due to memory leaks of its own.


It appears that MPI_Init allocates a lot of memory blocks that 
MPI_Finalize never bothers to clean up.  (Perhaps some should be cleaned 
up during the MPI_Init process itself?)  There are also a couple of 
blocks that are created during MPI_Finalize that are not freed.


Appended is a trivial 'hello world' program which demonstrates this 
using valgrind.  Rerunning with the valgrind --leak-check=full option 
shows a plethora of objects which are not deallocated.


In these runs, OpenMPI is at version 1.8.1, but older versions also have 
the problem.
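
Open MPI ships a Valgrind suppression file that may hide some of the library's own known reports. The sketch below shows how such a run might be assembled. The location share/openmpi/openmpi-valgrind.supp under the install prefix is an assumption to check against the local installation, and the function name valgrind_cmd is made up for illustration. The function only prints the command line, so it can be inspected before it is run, and whether the suppressions cover the particular blocks reported here is not something this sketch can promise.

```shell
# Hypothetical helper: print a valgrind command line that loads the
# suppression file from an Open MPI install prefix.
valgrind_cmd() {
  prefix=$1; shift
  # Assumed location of the suppression file; verify that it exists locally.
  supp="$prefix/share/openmpi/openmpi-valgrind.supp"
  echo valgrind --leak-check=full "--suppressions=$supp" "$@"
}
```

For example, valgrind_cmd /home/wws/openmpi_gfortran ./a.out prints the full command for the install prefix used at configure time.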


Walter

wws@w6ws-4:/tmp$ cat hellompi.f90
program hellompi
  use mpi
  implicit none

  integer :: mpierr

  call MPI_INIT (ierror=mpierr)
  print *, 'hello world!'
  call MPI_FINALIZE (ierror=mpierr)

end program
wws@w6ws-4:/tmp$ mpif90 --version hellompi.f90
GNU Fortran (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Copyright (C) 2013 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

wws@w6ws-4:/tmp$ mpif90 hellompi.f90
wws@w6ws-4:/tmp$ valgrind a.out
==6897== Memcheck, a memory error detector
==6897== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==6897== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==6897== Command: a.out
==6897==
 hello world!
==6897==
==6897== HEAP SUMMARY:
==6897==     in use at exit: 23,899 bytes in 110 blocks
==6897==   total heap usage: 15,436 allocs, 15,326 frees, 14,034,006 bytes allocated
==6897==
==6897== LEAK SUMMARY:
==6897==    definitely lost: 13,159 bytes in 26 blocks
==6897==    indirectly lost: 2,800 bytes in 13 blocks
==6897==      possibly lost: 0 bytes in 0 blocks
==6897==    still reachable: 7,940 bytes in 71 blocks
==6897==         suppressed: 0 bytes in 0 blocks
==6897== Rerun with --leak-check=full to see details of leaked memory
==6897==
==6897== For counts of detected and suppressed errors, rerun with: -v
==6897== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
wws@w6ws-4:/tmp$ valgrind --leak-check=full a.out
==6932== Memcheck, a memory error detector
==6932== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==6932== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==6932== Command: a.out
==6932==
 hello world!
==6932==
==6932== HEAP SUMMARY:
==6932==     in use at exit: 23,899 bytes in 110 blocks
==6932==   total heap usage: 15,438 allocs, 15,328 frees, 14,034,092 bytes allocated
==6932==
==6932== 1 bytes in 1 blocks are definitely lost in loss record 2 of 89
==6932==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6932==    by 0x917CAD0: ???
==6932==    by 0x5AAD346: opal_db_base_store (db_base_fns.c:49)
==6932==    by 0x57B12A2: ompi_modex_send_string (ompi_module_exchange.c:119)
==6932==    by 0x57AD42A: ompi_mpi_init (ompi_mpi_init.c:511)
==6932==    by 0x57CE572: PMPI_Init (pinit.c:84)
==6932==    by 0x4E777C4: MPI_INIT (pinit_f.c:82)
==6932==    by 0x400B33: MAIN__ (in /tmp/a.out)
==6932==    by 0x400BD5: main (in /tmp/a.out)
==6932==
==6932== 6 bytes in 1 blocks are definitely lost in loss record 3 of 89
==6932==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6932==    by 0x66BA951: orte_register_params (orte_mca_params.c:719)
==6932==    by 0x66B1042: orte_init (orte_init.c:107)
==6932==    by 0x57AD39C: ompi_mpi_init (ompi_mpi_init.c:464)
==6932==    by 0x57CE572: PMPI_Init (pinit.c:84)
==6932==    by 0x4E777C4: MPI_INIT (pinit_f.c:82)
==6932==    by 0x400B33: MAIN__ (in /tmp/a.out)
==6932==    by 0x400BD5: main (in /tmp/a.out)
==6932==
==6932== 8 bytes in 1 blocks are definitely lost in loss record 4 of 89
==6932==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6932==    by 0xAFB7C33: ???
==6932==    by 0x57AE12F: ompi_mpi_finalize (ompi_mpi_finalize.c:143)
==6932==    by 0x4E74878: mpi_finalize (pfinalize_f.c:69)
==6932==    by 0x400B9F: MAIN__ (in /tmp/a.out)
==6932==    by 0x400BD5: main (in /tmp/a.out)
==6932==
==6932== 17 (16 direct, 1 indirect) bytes in 1 blocks are definitely lost in loss record 8 of 89
==6932==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6932==    by 0x917D30D: ???
==6932==    by 0x5AAD58A: opal_db_base_fetch (db_base_fns.c:133)
==6932==    by 0x57FEDF0: ompi_rte_db_fetch (rte_orte_module.c:281)
==6932==    by 0x57B12DF: ompi_modex_recv_string (ompi_module_exchange.c:138)
==6932==    by 0x579CE2A: ompi_comm_cid_init (comm_cid.c:164)
==6932==    by 0x5

Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-06-03 Thread W Spector

Jeff Squyres wrote:
> Did you find any other places where we accidentally had ierr instead
> of ierror?


I will have to check the trunk and see.

The only place I know of where the Standard wants IERR instead of IERROR 
is with the user-defined subroutines for MPI_KEYVAL_CREATE - which is 
deprecated.  And that is only important if you create a proper interface 
spec for the COPY_FN and DELETE_FN arguments - rather than simply 
declaring them EXTERNAL.


The user-defined subroutines for the newer MPI_COMM_CREATE_KEYVAL call 
use IERROR.


I had found some other keyword issues last year besides IERROR.  I only 
reported the IERROR arguments because they were so pervasive across 
almost every interface definition.  Again, I will need to recheck the trunk.


Walter


[OMPI users] ierr vs ierror in F90 mpi module

2013-04-24 Thread W Spector

Hi,

The MPI Standard specifies 'ierror' as the name of the final argument in 
most Fortran MPI calls.  However, the Open MPI f90 module names it 
'ierr'.  This breaks calls that pass it with keyword=value syntax 
(for example, ierror=mpierr).


I just checked the latest 1.6.4 release and it is still broken.

Is this something that can be fixed?

Walter


Re: [OMPI users] ierr vs ierror in F90 mpi module

2013-04-25 Thread W Spector

Hi Jeff,

I just downloaded 1.7.1.  The new files in the use-mpi-f08 look great!

However the use-mpi-tkr and use-mpi-ignore-tkr directories don't fare so 
well.  Literally all the interfaces are still 'ierr'.
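
A quick way to see how widespread the old name is would be to count whole-word matches per file. The sketch below assumes a grep that supports -r and -w (GNU grep does). The function name count_ierr is made up for illustration, and the directory argument is whatever source subtree is being checked.

```shell
# Hypothetical helper: list files under a directory that still contain the
# whole word "ierr", with a per-file count of matching lines.
count_ierr() {
  # -r recurse, -w whole words only (so "ierror" does not match),
  # -c count matching lines per file; then drop files with a zero count.
  grep -rwc 'ierr' "${1:-.}" | grep -v ':0$'
}
```

For example, count_ierr openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr would list the affected files in that directory. The match is case-sensitive, so an upper-case IERR would need a separate pass.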


While I realize that both the F90 mpi module and interface checking 
were optional prior to MPI 3.0, the final argument has been called 
'ierror' since MPI 1!  This really should be fixed.


Walter

On 04/24/2013 06:08 PM, Jeff Squyres (jsquyres) wrote:

Can you try v1.7.1?

We did a major Fortran revamp in the 1.7.x series to bring it up to speed with 
MPI-3 Fortran stuff (at least mostly).  I mention MPI-3 because the name-based 
parameter passing stuff wasn't guaranteed until MPI-3.  I think 1.7.x should 
have gotten all the name-based parameter passing stuff correct (please let me 
know if you find any bugs!).

Just to be clear: it is unlikely that we'll be updating the Fortran support in 
the 1.6.x series.







Re: [OMPI users] ierr vs ierror in F90 mpi module

2013-04-25 Thread W Spector

Jeff,

I tried building 1.7.1 on my Ubuntu system.  The default gfortran is 
v4.6.3, so configure won't enable the mpi_f08 module build.  I also 
tried a three week old snapshot of the gfortran 4.9 trunk.  This has 
Tobias's new TYPE(*) in it, but not his latest !GCC$ attributes 
NO_ARG_CHECK stuff.  However configure still won't enable the mpi_f08 
module.


Is there a trick to getting a recent gfortran to compile the mpi_f08 module?

I went into the openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts 
directory and modified the files to use ierror instead of ierr.  (One 
well-crafted line of shell script.)  Did the same with a couple of .h.in 
files in the use-mpi-tkr and use-mpi-ignore-tkr directories, and 
use-mpi-tkr/attr_fn-f90-interfaces.h.in.  (One editor command each.)


With the above, the mpi module is in much better shape.  However there 
are still some scattered incorrect non-ierror argument names.  A few 
examples from the code I am working with:


  MPI_Type_create_struct: The 2nd argument should be 
"array_of_blocklengths", instead of "array_of_block_lengths"


  MPI_Type_commit: "datatype" instead of "type"

  MPI_Type_free: Again, "datatype" instead of "type"

There are more...

Walter




Re: [OMPI users] ierr vs ierror in F90 mpi module

2013-04-26 Thread W Spector

Hi Jeff,

To take care of the ierr->ierror conversion, simply do the following:

  cd openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts
  ls -1 *.sh | xargs -i -t ex -c ":1,\$s?ierr?ierror?" -c ":wq" {}

Then go up a level to openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr and use:

  cd ..
  ls -1 fort*.in | xargs -i -t ex -c ":1,\$s?ierr?ierror?" -c ":wq" {}

Last, the use-mpi-ignore-tkr directory:

  cd ../use-mpi-ignore-tkr
  ls -1 mpi*.in | xargs -i -t ex -c ":1,\$s?ierr?ierror?" -c ":wq" {}

As you can tell from the below, I needed to use a few MPI_Type calls, 
so I fixed the few that I needed in the 
openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts directory.  I didn't 
exhaustively go through and verify every interface in the whole MPI library.
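
The ex substitution above has no g flag and no word boundary, so it changes only the first ierr on each line, and running it a second time would turn ierror into ierroror. A more defensive variant is sketched below. It assumes GNU sed (for the \b word-boundary extension and for -i without a suffix argument), and the function name rename_ierr is made up for illustration.

```shell
# Hypothetical helper: rename the whole word "ierr" to "ierror" in place
# in every file named on the command line. Assumes GNU sed.
rename_ierr() {
  # \b matches a word boundary, so an existing "ierror" is left alone;
  # the g flag handles several occurrences on one line.
  sed -i 's/\bierr\b/ierror/g' "$@"
}
```

For example, rename_ierr *.sh in the scripts directory. Because only whole words are rewritten, running it again leaves the files unchanged.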


Walter

On 04/26/2013 11:53 AM, Jeff Squyres (jsquyres) wrote:

On Apr 25, 2013, at 10:52 PM, W Spector  wrote:
...

I went into the openmpi-1.7.1/ompi/mpi/fortran/use-mpi-tkr/scripts directory 
and modified the files to use ierror instead of ierr.  (One well-crafted line 
of shell script.)  Did the same with a couple of .h.in files in the use-mpi-tkr 
and use-mpi-ignore-tkr directories, and 
use-mpi-tkr/attr_fn-f90-interfaces.h.in.  (One editor command each.)

With the above, the mpi module is in much better shape.  However there are 
still some scattered incorrect non-ierror argument names.  A few examples from 
the code I am working with:

  MPI_Type_create_struct: The 2nd argument should be "array_of_blocklengths", instead of 
"array_of_block_lengths"

  MPI_Type_commit: "datatype" instead of "type"

  MPI_Type_free: Again, "datatype" instead of "type"

There are more...


Cool.  Any chance you could send us a patch?



[OMPI users] 1.7.1 Hang with MPI_THREAD_MULTIPLE set

2013-05-31 Thread W Spector

Dear OpenMPI group,

The following trivial program hangs on the mpi_barrier call with 1.7.1. 
 I am using gfortran/gcc 4.6.3 on Ubuntu linux.  OpenMPI was built with 
--enable-mpi-thread-multiple support and no other options (other than 
--prefix).


Are there additional options we should be telling configure about?  Or 
have we done something very silly?  Mpich2 works just fine...
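
Before digging further, it may be worth confirming that the installed library really was built with thread-multiple support. The sketch below assumes that ompi_info prints a "Thread support" line containing either "MPI_THREAD_MULTIPLE: yes" or "MPI_THREAD_MULTIPLE: no"; that output format is an assumption and may differ between versions. The function name has_thread_multiple is made up, and it reads the ompi_info output on standard input so that it can also be tried on saved output.

```shell
# Hypothetical helper: succeed if ompi_info output (on stdin) reports
# thread-multiple support. The matched text is an assumed output format.
has_thread_multiple() {
  grep -q 'MPI_THREAD_MULTIPLE: yes'
}
```

For example: ompi_info | has_thread_multiple && echo "thread-multiple build". This only checks how the library was configured; the level actually granted at run time is the value returned in the provided argument of mpi_init_thread.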


Walter Spector


program hang
  use mpi
  implicit none

  integer :: me, npes
  integer :: mpierr, provided
  logical :: iampe0

  call mpi_init_thread (  &
  MPI_THREAD_MULTIPLE,  &
  provided,  &
  mpierr)
  print *, 'hello, world!'

! Hangs here with MPI_THREAD_MULTIPLE set...
  call mpi_barrier (MPI_COMM_WORLD, mpierr)

  call mpi_comm_rank (MPI_COMM_WORLD, me, mpierr)
  iampe0 = me == 0
  call mpi_comm_size (MPI_COMM_WORLD, npes, mpierr)
  print *, 'pe:', me, ', total comm size:', npes
  ! MERGE requires both strings to have the same length, so 'PE 0' is padded.
  print *, 'I am ', trim (merge ('PE 0    ', 'not PE 0', iampe0))

  call mpi_finalize (mpierr)

end program


Re: [OMPI users] 1.7.1 Hang with MPI_THREAD_MULTIPLE set

2013-06-04 Thread W Spector

On 06/04/2013 03:23 AM, Jeff Squyres (jsquyres) wrote:

On Jun 3, 2013, at 5:06 AM, Paul Kapinos  wrote:


It is more or less well-known that MPI_THREAD_MULTIPLE disables the OpenFabrics / 
InfiniBand networking in Open MPI:

http://www.open-mpi.org/faq/?category=supported-systems#thread-support
http://www.open-mpi.org/community/lists/users/2010/03/12345.php


Yes, this is true -- MPI_THREAD_MULTIPLE support is fairly incomplete in Open 
MPI.


One would hope a simple MPI_Barrier call would work though...

My home linux system is nothing sophisticated.  Just a quad-core i5 on 
an Intel DP55WB motherboard and Ubuntu Linux.  No fancy interconnects.


Walter