Hi Eric,

1 hour is normal for running the entire MPICH test suite. It looks like your failures are all in dynamic-process (e.g., spawn) or fault-tolerance tests, which should be unrelated to the PETSc issue.

Min
On 2018/03/27 21:09, Eric Chamberland wrote:

Hi Min and Matthew,

In fact, I just ran a "hello world" program on 2 processes and it works.  I do not have a more complicated example without PETSc, since my source code is PETSc-based...
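
In case it helps, here is a minimal sketch of a PETSc-free test that goes one step beyond "hello world" and mimics the communication pattern visible in the backtraces below (a nonblocking length exchange completed with MPI_Waitall, followed by MPI_Barrier). It only approximates what PetscGatherMessageLengths does; the payload values are arbitrary:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, i;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int *sendlen = malloc(size * sizeof(int));
    int *recvlen = malloc(size * sizeof(int));
    MPI_Request *reqs = malloc(2 * size * sizeof(MPI_Request));

    /* Post all receives, then all sends, then wait on everything:
       roughly the shape of a message-length exchange. */
    for (i = 0; i < size; i++) {
        sendlen[i] = 100 + rank; /* arbitrary payload */
        MPI_Irecv(&recvlen[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i]);
    }
    for (i = 0; i < size; i++)
        MPI_Isend(&sendlen[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[size + i]);
    MPI_Waitall(2 * size, reqs, MPI_STATUSES_IGNORE);

    MPI_Barrier(MPI_COMM_WORLD);
    printf("rank %d of %d done\n", rank, size);

    free(sendlen); free(recvlen); free(reqs);
    MPI_Finalize();
    return 0;
}

If this hangs when compiled with mpicc and run with "mpiexec -n 2 ./a.out" against mpich/master, that would confirm the problem without PETSc.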

However, I just tried to launch "make testing" in the mpich directory; it ended with some failed tests and took very long: about an hour.  Is that normal?

Please, see the attached file: summary.xml

Thanks,

Eric





On 27/03/18 05:04 PM, Matthew Knepley wrote:
On Tue, Mar 27, 2018 at 4:59 PM, Min Si <[email protected]> wrote:

    Hi Eric,

    It would be great if you could give us a simple MPI program (not
    using PETSc) that reproduces this issue. If the problem happens
    only when PETSc is involved, the PETSc team can give you better
    suggestions.


Hi Min,

It is really easy to run PETSc at ANL. I am sure one of us can help if you cannot reproduce this bug on your own.

  Thanks,

     Matt

    Thanks,
    Min


    On 2018/03/27 15:38, Eric Chamberland wrote:

        Hi,

        for more than 2 weeks now the master branch of mpich has been
        failing, and the failure can be reproduced with a simple "make
        test" after a fresh installation of PETSc...

        Is anyone testing it?

        Is it supposed to be working?

        Please just tell me if I should "follow" another mpich branch.

        Thanks,

        Eric


        On 14/03/18 03:35 AM, Eric Chamberland wrote:

            Hi,

            fwiw, the current mpich/master branch doesn't pass the
            PETSc "make test" after a fresh installation...  It hangs
            just after the 1 MPI process test, meaning it is stuck in
            the 2-process test:

            make PETSC_DIR=/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/mpich-3.x-debug/petsc-3.8.3-debug PETSC_ARCH=arch-linux2-c-debug test
            Running test examples to verify correct installation
            Using PETSC_DIR=/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/mpich-3.x-debug/petsc-3.8.3-debug and PETSC_ARCH=arch-linux2-c-debug
            C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process




            ^Cmakefile:151: recipe for target 'test' failed
            make: [test] Interrupt (ignored)

            thanks,

            Eric

            On 13/03/18 08:07 AM, Eric Chamberland wrote:


                Hi,

                every night we test mpich/master with our
                PETSc-based code. I don't know whether the PETSc team
                does the same thing with mpich/master?   (Maybe it
                would be a good idea?)

                Everything was fine (except for issue
                https://github.com/pmodels/mpich/issues/2892) up to
                commit 7b8d64debd, but since commit a8a2b30fd21 I
                get a segfault in any parallel nightly test.

                For example, in a 2-process test the two ranks end up
                at different execution points:

                rank 0:

                #003: /lib64/libpthread.so.0(+0xf870) [0x7f25bf908870]
                #004: /pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/BIB/bin/BIBMEFGD.opt() [0x64a788]
                #005: /lib64/libc.so.6(+0x35140) [0x7f25bca18140]
                #006: /lib64/libc.so.6(__poll+0x2d) [0x7f25bcabfbfd]
                #007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f25bd90ccc9]
                #008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c) [0x7f25bd91255c]
                #009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xba657) [0x7f25bd7e2657]
                #010: /opt/mpich-3.x_debug/lib/libmpi.so.0(PMPI_Waitall+0xe3) [0x7f25bd7e3343]
                #011: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscGatherMessageLengths+0x654) [0x7f25c4bb3193]
                #012: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x859) [0x7f25c4e82d7f]
                #013: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684) [0x7f25c4e4d055]
                #014: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688) [0x7f25c4e01a39]
                #015: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179) [0x7f25c4e020f6]

                rank 1:

                #002: /pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/GIREF/lib/libgiref_opt_Util.so(traitementSignal+0x2bd0) [0x7f62df8e7310]
                #003: /lib64/libc.so.6(+0x35140) [0x7f62d3bc9140]
                #004: /lib64/libc.so.6(__poll+0x2d) [0x7f62d3c70bfd]
                #005: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f62d4abdcc9]
                #006: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c) [0x7f62d4ac355c]
                #007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12c9c5) [0x7f62d4a059c5]
                #008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12e102) [0x7f62d4a07102]
                #009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf17a1) [0x7f62d49ca7a1]
                #010: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3facf) [0x7f62d4918acf]
                #011: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d) [0x7f62d4918c3d]
                #012: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf18d8) [0x7f62d49ca8d8]
                #013: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fb88) [0x7f62d4918b88]
                #014: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d) [0x7f62d4918c3d]
                #015: /opt/mpich-3.x_debug/lib/libmpi.so.0(MPI_Barrier+0x27b) [0x7f62d4918edb]
                #016: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscCommGetNewTag+0x3ff) [0x7f62dbceb055]
                #017: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscObjectGetNewTag+0x15d) [0x7f62dbceaadb]
                #018: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreateCommon_PtoS+0x1ee) [0x7f62dc03625c]
                #019: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x29c4) [0x7f62dc035eea]
                #020: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684) [0x7f62dbffe055]
                #021: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688) [0x7f62dbfb2a39]
                #022: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179) [0x7f62dbfb30f6]
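
                Reading the two stacks together: rank 0 is blocked in
                MPI_Waitall inside PetscGatherMessageLengths, while
                rank 1 has already moved on to an MPI_Barrier under
                PetscCommGetNewTag. A hypothetical sketch (not the
                PETSc code) of how such a split can deadlock if a
                message is never matched:

                #include <mpi.h>

                int main(int argc, char **argv)
                {
                    int rank, buf = 0;
                    MPI_Request req;
                    MPI_Init(&argc, &argv);
                    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
                    if (rank == 0) {
                        /* Rank 0 posts a receive that is never matched
                           (wrong tag), so it blocks in MPI_Waitall... */
                        MPI_Irecv(&buf, 1, MPI_INT, 1, 7, MPI_COMM_WORLD, &req);
                        MPI_Waitall(1, &req, MPI_STATUSES_IGNORE);
                    } else if (rank == 1) {
                        /* ...while rank 1's small send buffers locally,
                           letting it proceed into the barrier. */
                        MPI_Send(&buf, 1, MPI_INT, 0, 8, MPI_COMM_WORLD);
                    }
                    MPI_Barrier(MPI_COMM_WORLD); /* never completes */
                    MPI_Finalize();
                    return 0;
                }

                Since the same PETSc code worked up to commit
                7b8d64debd, a mismatch like this (if that is what
                happens) would point at a regression in MPICH rather
                than at PETSc.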

                Have any other users (PETSc users?) reported this problem?

                Thanks,

                Eric

                PS: the usual information:

                mpich logs:
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.system
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpich_version.txt
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_c.txt
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_m.txt
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mi.txt
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_openmpa_config.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpl_config.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_config.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_tools_topo_config.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpiexec_info.txt


                PETSc logs:
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_configure.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_make.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_default.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_RDict.log
                http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_CMakeLists.txt




        _______________________________________________
        discuss mailing list     [email protected]
        To manage subscription options or unsubscribe:
        https://lists.mpich.org/mailman/listinfo/discuss





--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

