Ok,

here are the two points answered:

#1) I got the valgrind output... here is the fatal free operation:

==107156== Invalid free() / delete / delete[] / realloc()
==107156== at 0x4C2A37C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==107156==    by 0x1E63CD5F: opal_free (malloc.c:184)
==107156==    by 0x27622627: mca_pml_ob1_recv_request_fini (pml_ob1_recvreq.h:133)
==107156==    by 0x27622C4F: mca_pml_ob1_recv_request_free (pml_ob1_recvreq.c:90)
==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)
==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
==107156==    by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219)
==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
==107156==    by 0x14A33809: VecDestroy (vector.c:432)
==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115)
==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292)
==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287)
==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281)
==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216)
==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)
==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol "ompi_mpi_double"
--107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to 0x4c2f330 (__GI_stpcpy)
==107156==
==107156== Process terminating with default action of signal 6 (SIGABRT): dumping core
==107156==    at 0x1DD520C7: raise (in /lib64/libc-2.19.so)
==107156==    by 0x1DD53534: abort (in /lib64/libc-2.19.so)
==107156==    by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so)
==107156==    by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so)
==107156==    by 0x27626D12: mca_pml_ob1_send_request_fini (pml_ob1_sendreq.h:221)
==107156==    by 0x276274C9: mca_pml_ob1_send_request_free (pml_ob1_sendreq.c:117)
==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)
==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
==107156==    by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225)
==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
==107156==    by 0x14A33809: VecDestroy (vector.c:432)
==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115)
==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292)
==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287)
==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281)
==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216)
==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)


#2) The run with -vecscatter_alltoall works...!

As an "end user", should I ever modify these VecScatterCreate options? How do they affect the code's performance on large problems?
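For reference, such options are passed on the command line at run time. This is only an illustrative sketch: the mpirun invocation and process count are made up here, and -vecscatter_alltoall comes from the VecScatterCreate manual page:

```shell
# Hypothetical invocation: the executable name is the one from this thread,
# the process count is arbitrary. The -vecscatter_alltoall option selects
# the MPI_Alltoallv-based scatter instead of posted ISends/IRecvs.
mpirun -np 4 ./Test.ProblemeGD.dev -vecscatter_alltoall
```

The VecScatterCreate manual page lists the other run-time variants; since they only change how the scatter communicates, switching between them should not change the numerical results, only performance.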

Thanks,

Eric

On 25/07/16 02:57 PM, Matthew Knepley wrote:
On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland
<eric.chamberl...@giref.ulaval.ca
<mailto:eric.chamberl...@giref.ulaval.ca>> wrote:

    Hi,

    Has someone tried OpenMPI 2.0 with PETSc 3.7.2?

    I am having some errors with PETSc; maybe someone has them too?

    Here are the configure logs for PETSc:

    http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log

    http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log

    And for OpenMPI:

    http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log

    (in fact, I am testing the ompi-release branch, OpenMPI's equivalent
    of a petsc-master branch, since I need commit 9ba6678156).

    For a set of parallel tests, 104 out of 124 pass.


It appears that the fault happens when freeing the VecScatter we build
for MatMult, which contains Request structures for the ISends and
IRecvs. These look like internal OpenMPI errors to me since the Request
should be opaque.
I would try at least two things:

1) Run under valgrind.

2) Switch the VecScatter implementation. All the options are here,

  http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate

but maybe use alltoall.

  Thanks,

     Matt


    And the typical error:
    *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer:
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]
    /lib64/libc.so.6(+0x78026)[0x7f80eb11c026]
    /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53]
    /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60]
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628]
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50]
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd]
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334]

    a similar one:
    *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': free(): invalid pointer: 0x00007f382a7c5bc0 ***
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f]
    /lib64/libc.so.6(+0x78026)[0x7f3829f22026]
    /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53]
    /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60]
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628]
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50]
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd]
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334]

    another one:

    *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': free(): invalid pointer: 0x00007f67b6d37bc0 ***
    ======= Backtrace: =========
    /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f]
    /lib64/libc.so.6(+0x78026)[0x7f67b6494026]
    /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53]
    /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60]
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae]
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca]
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd]
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7]
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da]

    I feel like I should wait until someone else from PETSc has tested
    it too...

    Thanks,

    Eric




--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
