Barry,

If I look at the symbols available to trace I find the following.

> nm xSYMMIC | grep " T MPI" | grep "attr"
<#> T MPIR_Call_attr_copy
<#> T MPIR_Call_attr_delete
<#> T MPIR_Comm_delete_attr_impl
<#> T MPIR_Comm_set_attr_impl
<#> T MPIU_nem_gni_smsg_mbox_attr_init
=> Are the two _Comm_ symbols the ones of interest?

> nm xSYMMIC | grep " T MPI" | grep "arrier"
<#> T MPIDI_CRAY_dmapp_barrier_join
<#> T MPIDI_Cray_shared_mem_coll_barrier
<#> T MPIDI_Cray_shared_mem_coll_barrier_gather
<#> T MPID_Sched_barrier
<#> T MPID_nem_barrier
<#> T MPID_nem_barrier_init
<#> T MPID_nem_barrier_vars_init
<#> T MPIR_Barrier
<#> T MPIR_Barrier_impl
<#> T MPIR_Barrier_inter
<#> T MPIR_Barrier_intra
<#> T MPIR_CRAY_Barrier
<#> T MPIR_Ibarrier_impl
<#> T MPIR_Ibarrier_inter
<#> T MPIR_Ibarrier_intra

=> Which of these barriers should I trace?

Finally, the current version of PETSc available here seems to be 3.7.2; I am not able to load 3.7.3.

Thanks,
Matt Overholt

-----Original Message-----
From: Barry Smith [mailto:[email protected]]
Sent: Thursday, October 13, 2016 11:46 PM
To: [email protected]
Cc: Jed Brown; PETSc
Subject: Re: [petsc-users] large PetscCommDuplicate overhead

Matthew,

   Thanks for the additional information. This is all very weird, since the number of calls made to PetscCommDuplicate() is the same regardless of geometry, and the time of the call shouldn't depend on the geometry.

   Would you be able to do another set of tests where you track the time in MPI_Get_attr() and MPI_Barrier() instead of PetscCommDuplicate()? It could be that Cray did something "funny" in their implementation of PETSc.

   You could also try using the module petsc/3.7.3 instead of the cray-petsc module.

   Thanks

   Barry

> On Oct 12, 2016, at 10:48 AM, Matthew Overholt <[email protected]> wrote:
>
> Jed,
>
> I realize that the PetscCommDuplicate (PCD) overhead I am seeing must
> be only indirectly related to the problem size, etc., and I wouldn't
> be surprised if it was an artifact of some sort related to my specific
> algorithm. So you may not want to pursue this much further. However,
> I did make three runs using the same Edison environment and code but
> different input geometry files. Earlier I found a strong dependence
> on the number of processes, so for this test I ran all of the tests on
> 1 node with 8 processes (N=1, n=8). What I found was that the amount
> of PCD overhead was geometry-dependent, not size-dependent. A
> moderately-sized simple geometry (with relatively few ghosted vertices
> at the simple planar interfaces) had no PCD overhead, whereas both
> small and large complex geometries (with relatively more ghosted
> vertices at the more complex interfaces) had 5-6% PCD overhead. The log files follow.
>
> Thanks,
> Matt Overholt
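
A minimal sketch of one way to collect the timings Barry asks for, assuming the public entry points of interest are MPI_Barrier and MPI_Comm_get_attr (if PETSc 3.7 instead goes through the older MPI_Attr_get name, it can be wrapped the same way). The standard PMPI profiling interface intercepts the MPI_ symbols themselves, so none of the internal MPIR_*/MPID_* barrier variants in the nm listings above needs to be traced individually. The file name pmpi_timers.c and the per-rank report in MPI_Finalize are illustrative choices, not code from this thread.

/* pmpi_timers.c -- sketch of timing MPI_Barrier and MPI_Comm_get_attr
 * via the PMPI profiling interface.  Compile with the MPI compiler
 * wrapper and link it into the application, e.g.  cc -c pmpi_timers.c
 */
#include <mpi.h>
#include <stdio.h>

static double barrier_time   = 0.0;  /* accumulated seconds in MPI_Barrier       */
static double get_attr_time  = 0.0;  /* accumulated seconds in MPI_Comm_get_attr */
static long   barrier_calls  = 0;
static long   get_attr_calls = 0;

int MPI_Barrier(MPI_Comm comm)
{
  double t0   = MPI_Wtime();
  int    ierr = PMPI_Barrier(comm);   /* forward to the real implementation */
  barrier_time += MPI_Wtime() - t0;
  barrier_calls++;
  return ierr;
}

int MPI_Comm_get_attr(MPI_Comm comm, int keyval, void *attribute_val, int *flag)
{
  double t0   = MPI_Wtime();
  int    ierr = PMPI_Comm_get_attr(comm, keyval, attribute_val, flag);
  get_attr_time += MPI_Wtime() - t0;
  get_attr_calls++;
  return ierr;
}

/* Report the totals from each rank just before MPI shuts down. */
int MPI_Finalize(void)
{
  int rank;
  PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
  printf("[rank %d] MPI_Barrier: %ld calls, %.6f s; MPI_Comm_get_attr: %ld calls, %.6f s\n",
         rank, barrier_calls, barrier_time, get_attr_calls, get_attr_time);
  return PMPI_Finalize();
}

Because the wrappers only shadow the public MPI_ names and forward through the PMPI_ names, linking this object file (or preloading it as a shared library) is enough; neither the application nor the Cray MPICH library has to be rebuilt.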
