On Tue, Jul 6, 2021 at 9:31 AM Vijay S Kumar <[email protected]> wrote:
> Hello all,
>
> By way of background, we have a PETSc-based solver that we run on our
> in-house Cray system. We are carrying out performance analysis using
> profilers in the CrayPat suite that provide more fine-grained
> performance-related information than the PETSc -log_view summary.
>
> When instrumented using CrayPat perftools, it turns out that the MPI
> initialization (MPI_Init) internally invoked by PetscInitialize is not
> picked up by the profiler. That is, simply specifying the following:
>
>     ierr = PetscInitialize(&argc,&argv,(char*)0,NULL);if (ierr) return ierr;
>
> results in the following runtime error:
>
>     CrayPat/X:  Version 7.1.1 Revision 7c0ddd79b  08/19/19 16:58:46
>     Attempting to use an MPI routine before initializing MPICH
>
> To circumvent this, we had to explicitly call MPI_Init prior to
> PetscInitialize:
>
>     MPI_Init(&argc,&argv);
>     ierr = PetscInitialize(&argc,&argv,(char*)0,NULL);if (ierr) return ierr;
>
> However, the side effect of the above workaround seems to be several
> downstream runtime (assertion) errors with VecAssemblyBegin/End and
> MatAssemblyBegin/End statements:
>
>     CrayPat/X:  Version 7.1.1 Revision 7c0ddd79b  08/19/19 16:58:46
>     main.x: ../rtsum.c:5662: __pat_trsup_trace_waitsome_rtsum: Assertion
>     `recv_count != MPI_UNDEFINED' failed.
>
>     [email protected]:769
>     VecAssemblyEnd@0x2aaab951b3ba
>     VecAssemblyEnd_MPI_BTS@0x2aaab950b179
>     MPI_Waitsome@0x43a238
>     __pat_trsup_trace_waitsome_rtsum@0x5f1a17
>     __GI___assert_fail@0x2aaabc61e7d1
>     __assert_fail_base@0x2aaabc61e759
>     __GI_abort@0x2aaabc627740
>     __GI_raise@0x2aaabc626160
>
> Interestingly, we do not see such errors when there is no explicit
> MPI_Init and no instrumentation for performance.
>
> Looking for someone to help throw more light on why PETSc Mat/Vec
> AssemblyEnd statements lead to such MPI-level assertion errors in cases
> where MPI_Init is explicitly called.
> (Or alternatively, is there a way to call PetscInitialize in a manner
> that ensures that the MPI initialization is picked up by the profilers
> in question?)
>
> We would highly appreciate any help/pointers,

There is no problem calling MPI_Init() before PetscInitialize(), although
then you also have to call MPI_Finalize() explicitly at the end (a minimal
sketch is included below).

Both errors appear to arise from the Cray instrumentation, which is
evidently buggy. Did you try calling MPI_Init() yourself without
instrumentation? Also, what information are you getting from CrayPat that
we do not already log?

  Thanks,

     Matt

> Thanks!
> Vijay

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
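P.S. For reference, a minimal sketch of a C driver that initializes MPI
explicitly before PETSc; error handling is abbreviated and the solver body
is only a placeholder:

    /* Sketch only: user-managed MPI around PETSc; error checks abbreviated. */
    #include <petscsys.h>

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;

      MPI_Init(&argc, &argv);            /* MPI initialized by the application */
      ierr = PetscInitialize(&argc, &argv, (char*)0, NULL); if (ierr) return ierr;

      /* ... Vec/Mat assembly, solves, etc. ... */

      ierr = PetscFinalize(); if (ierr) return ierr;
      MPI_Finalize();                    /* required: PETSc will not finalize MPI
                                            it did not initialize itself */
      return 0;
    }

Because MPI was initialized by the caller, PetscFinalize() leaves MPI running,
so the explicit MPI_Finalize() at the end is needed.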
