If I look at the symbols available to trace, I find the following:
> nm xSYMMIC | grep " T MPI" | grep "attr"
<#> T MPIR_Call_attr_copy
<#> T MPIR_Call_attr_delete
<#> T MPIR_Comm_delete_attr_impl
<#> T MPIR_Comm_set_attr_impl
<#> T MPIU_nem_gni_smsg_mbox_attr_init

=> Are the two _Comm_ symbols the ones of interest?

> nm xSYMMIC | grep " T MPI" | grep "arrier"
<#> T MPIDI_CRAY_dmapp_barrier_join
<#> T MPIDI_Cray_shared_mem_coll_barrier
<#> T MPIDI_Cray_shared_mem_coll_barrier_gather
<#> T MPID_Sched_barrier
<#> T MPID_nem_barrier
<#> T MPID_nem_barrier_init
<#> T MPID_nem_barrier_vars_init
<#> T MPIR_Barrier
<#> T MPIR_Barrier_impl
<#> T MPIR_Barrier_inter
<#> T MPIR_Barrier_intra
<#> T MPIR_CRAY_Barrier
<#> T MPIR_Ibarrier_impl
<#> T MPIR_Ibarrier_inter
<#> T MPIR_Ibarrier_intra

=> Which of these barriers should I trace?
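
One possible way to narrow the list, offered as a sketch rather than a definitive answer: MPICH-derived libraries usually export the public MPI_* names as weak symbols (type W) aliased to PMPI_* text symbols, so a filter on " T MPI" would show only internal MPIR_/MPID_ routines. The helper name public_mpi_syms below is hypothetical, and the set of calls it matches (MPI_Barrier plus the attribute lookups MPI_Attr_get and MPI_Comm_get_attr) is my assumption about which entry points are of interest.

```shell
# Hypothetical helper: reads `nm` output on stdin and keeps only the public
# entry points (MPI_* or PMPI_*) for the barrier and attribute-lookup calls.
# Both text (T/t) and weak (W/w) symbol types are accepted, because the
# MPI_* names are assumed to be weak aliases of the PMPI_* names.
public_mpi_syms() {
    grep -E ' [TtWw] P?MPI_(Barrier|Comm_get_attr|Attr_get)$'
}

# Usage, assuming the executable is named xSYMMIC as above:
#   nm xSYMMIC | public_mpi_syms
```

If this prints MPI_Barrier or PMPI_Barrier entries, those would be the natural ones to trace, since they wrap whichever internal MPIR_* variant the library selects at run time.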

Finally, the current version of PETSc seems to be 3.7.2; I am not able to
load 3.7.3.

Matt Overholt

-----Original Message-----
From: Barry Smith [] 
Sent: Thursday, October 13, 2016 11:46 PM
Cc: Jed Brown; PETSc
Subject: Re: [petsc-users] large PetscCommDuplicate overhead


    Thanks for the additional information. This is all very weird since the
number of calls made to PetscCommDuplicate() is the same regardless
of geometry and the time of the call shouldn't depend on the geometry.

    Would you be able to do another set of tests where you track the time in
MPI_Get_attr() and MPI_Barrier() instead of PetscCommDuplicate()? It could
be Cray did something "funny" in their implementation of PETSc.

   You could also try using the module petsc/3.7.3 instead of the cray-petsc



> On Oct 12, 2016, at 10:48 AM, Matthew Overholt <>
> Jed,
> I realize that the PetscCommDuplicate (PCD) overhead I am seeing must 
> be only indirectly related to the problem size, etc., and I wouldn't 
> be surprised if it was an artifact of some sort related to my specific 
> algorithm.  So you may not want to pursue this much further.  However, 
> I did make three runs using the same Edison environment and code but 
> different input geometry files.  Earlier I found a strong dependence 
> on the number of processes, so for this test I ran all of the tests on 
> 1 node with 8 processes (N=1, n=8).  What I found was that the amount 
> of PCD overhead was geometry dependent, not size dependent.  A 
> moderately-sized simple geometry (with relatively few ghosted vertices 
> at the simple-planar interfaces) had no PCD overhead, whereas both 
> small and large complex geometries (with relatively more ghosted 
> vertices at the more-complex interfaces) had 5 - 6% PCD overhead.  The log
> files follow.
> Thanks,
> Matt Overholt
