Hi,

each night we are testing mpich/master with our petsc-based code.  I don't know if PETSc team is doing the same thing with mpich/master?   (Maybe it is a good idea?)


Everything was fine (except the issue https://github.com/pmodels/mpich/issues/2892) up to commit 7b8d64debd, but since commit mpich:a8a2b30fd21), I have a segfault on a any parallel nightly test.

For example, a 2 process test ends at almost different execution points:

rank 0:

#003: /lib64/libpthread.so.0(+0xf870) [0x7f25bf908870]
#004: 
/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/BIB/bin/BIBMEFGD.opt() 
[0x64a788]
#005: /lib64/libc.so.6(+0x35140) [0x7f25bca18140]
#006: /lib64/libc.so.6(__poll+0x2d) [0x7f25bcabfbfd]
#007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f25bd90ccc9]
#008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c) [0x7f25bd91255c]
#009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xba657) [0x7f25bd7e2657]
#010: /opt/mpich-3.x_debug/lib/libmpi.so.0(PMPI_Waitall+0xe3) [0x7f25bd7e3343]
#011: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscGatherMessageLengths+0x654)
 [0x7f25c4bb3193]
#012: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x859)
 [0x7f25c4e82d7f]
#013: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684)
 [0x7f25c4e4d055]
#014: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688)
 [0x7f25c4e01a39]
#015: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179)
 [0x7f25c4e020f6]

rank 1:

#002: 
/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/GIREF/lib/libgiref_opt_Util.so(traitementSignal+0x2bd0)
 [0x7f62df8e7310]
#003: /lib64/libc.so.6(+0x35140) [0x7f62d3bc9140]
#004: /lib64/libc.so.6(__poll+0x2d) [0x7f62d3c70bfd]
#005: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f62d4abdcc9]
#006: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c) [0x7f62d4ac355c]
#007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12c9c5) [0x7f62d4a059c5]
#008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12e102) [0x7f62d4a07102]
#009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf17a1) [0x7f62d49ca7a1]
#010: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3facf) [0x7f62d4918acf]
#011: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d) [0x7f62d4918c3d]
#012: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf18d8) [0x7f62d49ca8d8]
#013: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fb88) [0x7f62d4918b88]
#014: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d) [0x7f62d4918c3d]
#015: /opt/mpich-3.x_debug/lib/libmpi.so.0(MPI_Barrier+0x27b) [0x7f62d4918edb]
#016: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscCommGetNewTag+0x3ff)
 [0x7f62dbceb055]
#017: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscObjectGetNewTag+0x15d)
 [0x7f62dbceaadb]
#018: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreateCommon_PtoS+0x1ee)
 [0x7f62dc03625c]
#019: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x29c4)
 [0x7f62dc035eea]
#020: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684)
 [0x7f62dbffe055]
#021: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688)
 [0x7f62dbfb2a39]
#022: 
/opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179)
 [0x7f62dbfb30f6]

Have some other users (PETSc users?) reported problem?

Thanks,

Eric

ps: usual informations:

mpich logs:

http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.system
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpich_version.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_c.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_m.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mi.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_openmpa_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpl_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_tools_topo_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpiexec_info.txt

Petsc logs:
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_configure.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_make.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_default.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_RDict.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_CMakeLists.txt



Reply via email to