> On 11 Mar 2021, at 8:46 AM, Pierre Jolivet <[email protected]> wrote:
>
>> On 11 Mar 2021, at 6:16 AM, Barry Smith <[email protected]> wrote:
>>
>> Eric,
>>
>> Sorry about not being more immediate. We still have this in our active
>> email so you don't need to submit individual issues. We'll try to get to
>> them as soon as we can.
>
> Indeed, I’m still trying to figure this out.
> I realized that some of my configure flags were different than yours, e.g.,
> no --with-memalign.
> I’ve also added SuperLU_DIST to my installation.
> Still, I can’t reproduce any issue.
> I will continue looking into this, it appears I’m seeing some valgrind
> errors, but I don’t know if this is some side effect of OpenMPI not being
> valgrind-clean (last time I checked, there was no error with MPICH).

It looks like Valgrind + OpenMPI (+ OpenMP?) is complaining about uninitialized memory in PetscSFGetMultiSF().
Could you please try out the following branch https://gitlab.com/petsc/petsc/-/commits/jolivet/fix-valgrind-openmpi ?
I’m not sure why there would be such a warning with OpenMPI and not with MPICH, and it is unlikely to fix anything, but for good measure, after compilation, could you please try:
$ make -f gmakefile test search='snes_tutorials-ex12_quad_hpddm_reuse_baij'

Thanks,
Pierre

> Thank you for your patience,
> Pierre
>
> /usr/bin/gmake -f gmakefile test test-fail=1
> Using MAKEFLAGS: test-fail=1
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_baij.counts
> ok snes_tutorials-ex12_quad_hpddm_reuse_baij
> ok diff-snes_tutorials-ex12_quad_hpddm_reuse_baij
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex33_superlu_dist_2.counts
> ok ksp_ksp_tests-ex33_superlu_dist_2
> ok diff-ksp_ksp_tests-ex33_superlu_dist_2
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex49_superlu_dist.counts
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-0
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-0_conv-1
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-0
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-1herm-1_conv-1
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-0
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-0_conv-1
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-0
> ok ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1
> ok diff-ksp_ksp_tests-ex49_superlu_dist+nsize-4herm-1_conv-1
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex50_tut_2.counts
> ok ksp_ksp_tutorials-ex50_tut_2
> ok diff-ksp_ksp_tutorials-ex50_tut_2
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tests-ex33_superlu_dist.counts
> ok ksp_ksp_tests-ex33_superlu_dist
> ok diff-ksp_ksp_tests-ex33_superlu_dist
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_hypre.counts
> ok snes_tutorials-ex56_hypre
> ok diff-snes_tutorials-ex56_hypre
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex56_2.counts
> ok ksp_ksp_tutorials-ex56_2
> ok diff-ksp_ksp_tutorials-ex56_2
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_elas.counts
> ok snes_tutorials-ex17_3d_q3_trig_elas
> ok diff-snes_tutorials-ex17_3d_q3_trig_elas
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij.counts
> ok snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij
> ok diff-snes_tutorials-ex12_quad_hpddm_reuse_threshold_baij
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_3.counts
> not ok ksp_ksp_tutorials-ex5_superlu_dist_3 # Error code: 1
> # srun: error: Unable to create step for job 1426755: More processors requested than permitted
> ok ksp_ksp_tutorials-ex5_superlu_dist_3 # SKIP Command failed so no diff
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist.counts
> ok ksp_ksp_tutorials-ex5f_superlu_dist # SKIP Fortran required for this test
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex12_tri_parmetis_hpddm_baij.counts
> ok snes_tutorials-ex12_tri_parmetis_hpddm_baij
> ok diff-snes_tutorials-ex12_tri_parmetis_hpddm_baij
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_tut_3.counts
> ok snes_tutorials-ex19_tut_3
> ok diff-snes_tutorials-ex19_tut_3
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex17_3d_q3_trig_vlap.counts
> ok snes_tutorials-ex17_3d_q3_trig_vlap
> ok diff-snes_tutorials-ex17_3d_q3_trig_vlap
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist_3.counts
> ok ksp_ksp_tutorials-ex5f_superlu_dist_3 # SKIP Fortran required for this test
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_superlu_dist.counts
> ok snes_tutorials-ex19_superlu_dist
> ok diff-snes_tutorials-ex19_superlu_dist
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre.counts
> ok snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre
> ok diff-snes_tutorials-ex56_attach_mat_nearnullspace-1_bddc_approx_hypre
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex49_hypre_nullspace.counts
> ok ksp_ksp_tutorials-ex49_hypre_nullspace
> ok diff-ksp_ksp_tutorials-ex49_hypre_nullspace
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex19_superlu_dist_2.counts
> ok snes_tutorials-ex19_superlu_dist_2
> ok diff-snes_tutorials-ex19_superlu_dist_2
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist_2.counts
> not ok ksp_ksp_tutorials-ex5_superlu_dist_2 # Error code: 1
> # srun: error: Unable to create step for job 1426755: More processors requested than permitted
> ok ksp_ksp_tutorials-ex5_superlu_dist_2 # SKIP Command failed so no diff
> TEST arch-linux2-c-opt-ompi/tests/counts/snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre.counts
> ok snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre
> ok diff-snes_tutorials-ex56_attach_mat_nearnullspace-0_bddc_approx_hypre
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex64_1.counts
> ok ksp_ksp_tutorials-ex64_1
> ok diff-ksp_ksp_tutorials-ex64_1
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5_superlu_dist.counts
> not ok ksp_ksp_tutorials-ex5_superlu_dist # Error code: 1
> # srun: error: Unable to create step for job 1426755: More processors requested than permitted
> ok ksp_ksp_tutorials-ex5_superlu_dist # SKIP Command failed so no diff
> TEST arch-linux2-c-opt-ompi/tests/counts/ksp_ksp_tutorials-ex5f_superlu_dist_2.counts
> ok ksp_ksp_tutorials-ex5f_superlu_dist_2 # SKIP Fortran required for this test
>
>> Barry
>>
>>
>>> On Mar 10, 2021, at 11:03 PM, Eric Chamberland <[email protected]> wrote:
>>>
>>> Barry,
>>>
>>> to get some follow-up on --with-openmp=1 failures, shall I open gitlab
>>> issues for:
>>>
>>> a) all hypre failures giving DIVERGED_INDEFINITE_PC
>>>
>>> b) all superlu_dist failures giving different results with initia and
>>> "Exceeded timeout limit of 60 s"
>>>
>>> c) hpddm failures "free(): invalid next size (fast)" and "Segmentation
>>> Violation"
>>>
>>> d) all tao's "Exceeded timeout limit of 60 s"
>>>
>>> I don't see how I could do all this debugging by myself...
>>>
>>> Thanks,
>>>
>>> Eric
>>>
>>>
>>
>
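
Note on reading the quoted harness output: its result lines follow a TAP-like pattern ("ok NAME", "not ok NAME # Error code: N", "ok NAME # SKIP reason"). A minimal sketch, assuming that pattern holds for every result line, of tallying passes, failures, and skips from such a log. The names RESULT and summarize are hypothetical helpers for illustration and are not part of PETSc.

```python
import re
from collections import Counter

# Matches result lines such as "ok NAME", "not ok NAME # Error code: 1",
# or "ok NAME # SKIP reason". Leading mail-quote markers ("> ") are tolerated.
# Assumption: test names contain no whitespace.
RESULT = re.compile(r"^[>\s]*(not ok|ok)\s+(\S+)(?:\s+#\s*(.*))?$")

def summarize(lines):
    """Return (counts, failed_names) for an iterable of log lines."""
    counts = Counter()
    failed = []
    for line in lines:
        m = RESULT.match(line.rstrip("\n"))
        if not m:
            continue  # TEST headers, srun diagnostics, blank lines, etc.
        status, name, comment = m.groups()
        if status == "not ok":
            counts["failed"] += 1
            failed.append(name)
        elif comment and comment.upper().startswith("SKIP"):
            counts["skipped"] += 1
        else:
            counts["passed"] += 1
    return counts, failed

if __name__ == "__main__":
    # A few lines copied from the quoted log above.
    sample = [
        "> ok snes_tutorials-ex12_quad_hpddm_reuse_baij",
        "> ok diff-snes_tutorials-ex12_quad_hpddm_reuse_baij",
        "> not ok ksp_ksp_tutorials-ex5_superlu_dist_3 # Error code: 1",
        "> # srun: error: Unable to create step for job 1426755: "
        "More processors requested than permitted",
        "> ok ksp_ksp_tutorials-ex5_superlu_dist_3 # SKIP Command failed so no diff",
    ]
    counts, failed = summarize(sample)
    print(dict(counts), failed)
```

With a tally like this, the three "not ok" entries in the quoted run stand out as srun allocation errors ("More processors requested than permitted") rather than solver failures, which is a different problem from the --with-openmp=1 failures discussed in the thread.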
