On Sep 15, 2014, at 1:01 AM, Matthew Knepley <[email protected]> wrote:

> On Sun, Sep 14, 2014 at 5:56 PM, Christophe Prud'homme 
> <[email protected]> wrote:
> Hello
> 
> Pierre and I found this problem occurring with OpenMPI 1.8.1 and PETSc 3.5.1 
> (on my side this was built with Homebrew on OS X); we haven't checked MPICH. 
> However, it is disturbing that removing the petsc.h header actually solves 
> the problem.
> 
> From the MPI standard, if we understand correctly, we shouldn't have to 
> specify the data type for MPI_IN_PLACE scatter/gather operations. Yet to 
> avoid either a crash or a deadlock (scatter), or wrong results (gather), 
> with or without boost::mpi, we need to specify the data type.
> 
> In our code Feel++, we have added some tests to verify this behavior [1]. 
> Basically:
>  - MPI alone: OK
>  - boost::mpi alone: OK (boost::mpi is just used as an alternative way to 
> initialize MPI)
>  - MPI + petsc.h with MPI_DATATYPE_NULL: crash in scatter
>  - MPI + petsc.h + proper datatype: OK
>  - boost::mpi + petsc.h with MPI_DATATYPE_NULL: hangs in scatter
>  - boost::mpi + petsc.h + proper datatype: OK
> 
>  1. 
> https://github.com/feelpp/feelpp/blob/develop/testsuite/feelcore/test_gatherscatter.cpp
> 
> 1) I think this is an OpenMPI bug
> 
> 2) I think this because I cannot reproduce with MPICH, and it is not valgrind 
> clean
> 
> 3) OpenMPI has lots of bugs. If you guys can reproduce with MPICH, I will 
> track it
> down and fix it.

True, but they fixed the similar wrong behavior (which was inherited from 
libNBC) when I pinged them 
(http://www.open-mpi.org/community/lists/users/2013/11/23034.php).
BTW, I had the same problem with master and MPICH 3.1.

Anyways, thanks Barry for the quick fix.

Pierre
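
For reference, the MPI_IN_PLACE pattern under discussion can be sketched as
follows. This is a minimal standalone C sketch, not the actual in-place.cpp
from the report; the buffer sizes, root rank, and printed labels are
illustrative. Per the MPI standard's collective chapter, when the root passes
MPI_IN_PLACE as recvbuf to MPI_Scatter, the recvcount and recvtype arguments
at the root are ignored, so passing MPI_DATATYPE_NULL there should be
conforming; the bug in this thread is that some stacks still inspect that
ignored datatype:

```c
/* Sketch of an in-place scatter with MPI_DATATYPE_NULL as the ignored
 * recvtype at the root. Illustrative only; requires an MPI environment.
 * Build and run (assumed commands): mpicc sketch.c && mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    enum { N = 4, MAXP = 16 };   /* entries per rank; max ranks for this sketch */
    int buf[MAXP * N];
    if (size > MAXP) MPI_Abort(MPI_COMM_WORLD, 1);

    if (rank == 0) {
        for (int i = 0; i < size * N; ++i)
            buf[i] = i / N;      /* chunk j is filled with the value j */
        /* At the root, recvcount/recvtype are ignored when recvbuf is
         * MPI_IN_PLACE, so MPI_DATATYPE_NULL here should be legal. */
        MPI_Scatter(buf, N, MPI_INT, MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                    0, MPI_COMM_WORLD);
    } else {
        /* sendbuf/sendcount/sendtype are significant only at the root */
        MPI_Scatter(NULL, 0, MPI_DATATYPE_NULL, buf, N, MPI_INT,
                    0, MPI_COMM_WORLD);
    }

    /* Each rank should now hold its own chunk: rank r prints N copies of r */
    printf("rank %d:", rank);
    for (int i = 0; i < N; ++i) printf(" %d", buf[i]);
    printf("\n");

    MPI_Finalize();
    return 0;
}
```

The workaround from the test matrix above is simply to pass a real datatype
(e.g. MPI_INT) and count at the root even though they are ignored.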

>   Matt
>  
> 
> Best regards
> C.
> 
> On Mon, Sep 15, 2014 at 12:37 AM, Matthew Knepley <[email protected]> wrote:
> On Sun, Sep 14, 2014 at 4:16 PM, Pierre Jolivet <[email protected]> 
> wrote:
> Hello,
> Could you please explain to me why the following example is not working 
> properly when <petsc.h> (from master, with OpenMPI 1.8.1) is included ?
> 
> $ mpicxx in-place.cpp  -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include 
> -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc
> $ mpirun -np 2 ./a.out
> Done with the scatter !
> 0 0 0 0 (this line should be filled with 0)
> 1 1 1 1 (this line should be filled with 1)
> Done with the gather !
> 
> $ mpicxx in-place.cpp  -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include 
> -L$PETSC_DIR/$PETSC_ARCH/lib -lpetsc -DPETSC_BUG
> $ mpirun -np 2 ./a.out
> [:3367] *** An error occurred in MPI_Type_size
> [:3367] *** reported by process [4819779585,140733193388032]
> [:3367] *** on communicator MPI_COMM_WORLD
> [:3367] *** MPI_ERR_TYPE: invalid datatype
> [:3367] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
> abort,
> [:3367] ***    and potentially your MPI job)
> 
> I just built this with MPICH and it runs fine:
> 
> master:/PETSc3/petsc/petsc-pylith$ 
> /PETSc3/petsc/petsc-pylith/arch-pylith-cxx-debug/bin/mpiexec -host localhost 
> -n 2 
> /PETSc3/petsc/petsc-pylith/arch-pylith-cxx-debug/lib/in-place-obj/in-place 
> Done with the scatter !
> 0 0 0 0 (this line should be filled with 0)
> 1 1 1 1 (this line should be filled with 1)
> Done with the gather !
> 
> Will valgrind.
> 
>    Matt
>  
> Thank you for looking,
> Pierre
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
> 
> 
> 
> -- 
> Christophe Prud'homme
> Feel++ Project Manager
> Professor in Applied Mathematics 
> @ Université Joseph Fourier (Grenoble, France)
> @ Université de Strasbourg (France)
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
