Re: [petsc-users] PetscAllreduceBarrierCheck is valgrind clean?

2021-01-13 Thread Fande Kong
On Wed, Jan 13, 2021 at 11:49 AM Barry Smith  wrote:

>
>   Fande,
>
>  Look at
> https://scm.mvapich.cse.ohio-state.edu/svn/mpi/mvapich2/trunk/src/mpid/ch3/channels/common/src/detect/arch/mv2_arch_detect.c
>
>  cpubind_set = hwloc_bitmap_alloc();
>
>  but I don't find a corresponding hwloc_bitmap_free(cpubind_set); in get_socket_bound_info().
>

Thanks. I added hwloc_bitmap_free(cpubind_set) to the end of get_socket_bound_info(), and these valgrind messages disappeared.
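
For anyone who runs into the same reports: below is a minimal, self-contained sketch of the pattern (my own illustration, not the mvapich2 source; the helper name report_cpu_binding is made up). The point is just that every hwloc_bitmap_alloc() needs a matching hwloc_bitmap_free(), including on early-return error paths:

#include <hwloc.h>
#include <stdio.h>
#include <stdlib.h>

static int report_cpu_binding(hwloc_topology_t topology)
{
  /* Allocate a bitmap to hold the process's CPU binding. */
  hwloc_bitmap_t cpubind_set = hwloc_bitmap_alloc();
  if (!cpubind_set) return -1;

  /* Query which CPUs this process is bound to. */
  if (hwloc_get_cpubind(topology, cpubind_set, HWLOC_CPUBIND_PROCESS) != 0) {
    hwloc_bitmap_free(cpubind_set);   /* free on the error path too */
    return -1;
  }

  /* Print the binding as a cpuset string. */
  char *str = NULL;
  if (hwloc_bitmap_asprintf(&str, cpubind_set) >= 0) {
    printf("process is bound to cpuset %s\n", str);
    free(str);
  }

  /* This is the kind of free that was missing in get_socket_bound_info(). */
  hwloc_bitmap_free(cpubind_set);
  return 0;
}

int main(void)
{
  hwloc_topology_t topology;
  if (hwloc_topology_init(&topology) != 0) return 1;
  if (hwloc_topology_load(topology) != 0) {
    hwloc_topology_destroy(topology);
    return 1;
  }

  int err = report_cpu_binding(topology);

  hwloc_topology_destroy(topology);
  return err ? 1 : 0;
}

It should build with something like "cc leak_demo.c -lhwloc", assuming the hwloc headers and library are installed.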

I will ask the mvapich developers to fix this.

Thanks,

Fande,


>
>
>   Barry
>
>
> >
>
> > On Jan 13, 2021, at 12:32 PM, Fande Kong  wrote:
> >
> > Hi All,
> >
> > I ran valgrind with mvapich-2.3.5 for a MOOSE simulation. The motivation was
> > that we have a few non-deterministic parallel simulations in MOOSE, and I
> > wanted to check whether we have any memory issues. I got some complaints from
> > PetscAllreduceBarrierCheck.
> >
> > Thanks,
> >
> >
> > Fande
> >
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 31 of 54
> > ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> > ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> > ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> > ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> > ==98001==    by 0xD59A894: mv2_increment_shmem_coll_counter (ch3_shmem_coll.c:2183)
> > ==98001==    by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
> > ==98001==    by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
> > ==98001==    by 0x99F70BE: PetscSplitOwnership (psplit.c:84)
> > ==98001==    by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
> > ==98001==    by 0xA08C66B: MatMPIAdjSetPreallocation_MPIAdj (mpiadj.c:630)
> > ==98001==    by 0xA08EB9A: MatMPIAdjSetPreallocation (mpiadj.c:856)
> > ==98001==    by 0xA08F6D3: MatCreateMPIAdj (mpiadj.c:904)
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 32 of 54
> > ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> > ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> > ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> > ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> > ==98001==    by 0xD59A9A4: mv2_increment_allgather_coll_counter (ch3_shmem_coll.c:2218)
> > ==98001==    by 0xD4E4CE4: PMPI_Allreduce (allreduce.c:917)
> > ==98001==    by 0xCD9D74D: libparmetis__gkMPI_Allreduce (gkmpi.c:103)
> > ==98001==    by 0xCDBB663: libparmetis__ComputeParallelBalance (stat.c:87)
> > ==98001==    by 0xCDA4FE0: libparmetis__KWayFM (kwayrefine.c:352)
> > ==98001==    by 0xCDA21ED: libparmetis__Global_Partition (kmetis.c:222)
> > ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> > ==98001==    by 0xCDA2748: ParMETIS_V3_PartKway (kmetis.c:94)
> > ==98001==    by 0xA2D6B39: MatPartitioningApply_Parmetis_Private (pmetis.c:145)
> > ==98001==    by 0xA2D77D9: MatPartitioningApply_Parmetis (pmetis.c:219)
> > ==98001==    by 0xA2CD46A: MatPartitioningApply (partition.c:332)
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 54
> > ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> > ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> > ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> > ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> > ==98001==    by 0xD59A894: mv2_increment_shmem_coll_counter (ch3_shmem_coll.c:2183)
> > ==98001==    by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
> > ==98001==    by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
> > ==98001==    by 0x99F733E: PetscSplitOwnership (psplit.c:91)
> > ==98001==    by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
> > ==98001==    by 0x9C5DB0D: PetscLayoutCreateFromSizes (pmap.c:112)
> > ==98001==    by 0x9D9A018: ISGeneralSetIndices_General (general.c:568)
> > ==98001==    by 0x9D9AB44: ISGeneralSetIndices (general.c:554)
> > ==98001==    by 0x9D9ADC4: ISCreateGeneral (general.c:529)
> > ==98001==    by 0x9B431E6: VecCreateGhostWithArray (pbvec.c:692)
> > ==98001==    by 0x9B43A33: VecCreateGhost (pbvec.c:748)
> >
> >
> > ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 34 of 54
> > ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> > ==98001==    by 0xDAE1D5E: 

Re: [petsc-users] PetscAllreduceBarrierCheck is valgrind clean?

2021-01-13 Thread Barry Smith


  Fande,

 Look at 
https://scm.mvapich.cse.ohio-state.edu/svn/mpi/mvapich2/trunk/src/mpid/ch3/channels/common/src/detect/arch/mv2_arch_detect.c

 cpubind_set = hwloc_bitmap_alloc();

 but I don't find a corresponding hwloc_bitmap_free(cpubind_set); in get_socket_bound_info().


  Barry


> 

> On Jan 13, 2021, at 12:32 PM, Fande Kong  wrote:
> 
> Hi All,
> 
> I ran valgrind with mvapich-2.3.5 for a MOOSE simulation. The motivation was
> that we have a few non-deterministic parallel simulations in MOOSE, and I
> wanted to check whether we have any memory issues. I got some complaints from
> PetscAllreduceBarrierCheck.
> 
> Thanks,
> 
> 
> Fande
> 
> 
> 
> ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 31 of 54
> ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> ==98001==    by 0xD59A894: mv2_increment_shmem_coll_counter (ch3_shmem_coll.c:2183)
> ==98001==    by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
> ==98001==    by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
> ==98001==    by 0x99F70BE: PetscSplitOwnership (psplit.c:84)
> ==98001==    by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
> ==98001==    by 0xA08C66B: MatMPIAdjSetPreallocation_MPIAdj (mpiadj.c:630)
> ==98001==    by 0xA08EB9A: MatMPIAdjSetPreallocation (mpiadj.c:856)
> ==98001==    by 0xA08F6D3: MatCreateMPIAdj (mpiadj.c:904)
> 
> 
> ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 32 of 54
> ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> ==98001==    by 0xD59A9A4: mv2_increment_allgather_coll_counter (ch3_shmem_coll.c:2218)
> ==98001==    by 0xD4E4CE4: PMPI_Allreduce (allreduce.c:917)
> ==98001==    by 0xCD9D74D: libparmetis__gkMPI_Allreduce (gkmpi.c:103)
> ==98001==    by 0xCDBB663: libparmetis__ComputeParallelBalance (stat.c:87)
> ==98001==    by 0xCDA4FE0: libparmetis__KWayFM (kwayrefine.c:352)
> ==98001==    by 0xCDA21ED: libparmetis__Global_Partition (kmetis.c:222)
> ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> ==98001==    by 0xCDA20B2: libparmetis__Global_Partition (kmetis.c:191)
> ==98001==    by 0xCDA2748: ParMETIS_V3_PartKway (kmetis.c:94)
> ==98001==    by 0xA2D6B39: MatPartitioningApply_Parmetis_Private (pmetis.c:145)
> ==98001==    by 0xA2D77D9: MatPartitioningApply_Parmetis (pmetis.c:219)
> ==98001==    by 0xA2CD46A: MatPartitioningApply (partition.c:332)
> 
> 
> ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 54
> ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> ==98001==    by 0xD59A894: mv2_increment_shmem_coll_counter (ch3_shmem_coll.c:2183)
> ==98001==    by 0xD4E4CBB: PMPI_Allreduce (allreduce.c:912)
> ==98001==    by 0x99F1766: PetscAllreduceBarrierCheck (pbarrier.c:26)
> ==98001==    by 0x99F733E: PetscSplitOwnership (psplit.c:91)
> ==98001==    by 0x9C5C26B: PetscLayoutSetUp (pmap.c:262)
> ==98001==    by 0x9C5DB0D: PetscLayoutCreateFromSizes (pmap.c:112)
> ==98001==    by 0x9D9A018: ISGeneralSetIndices_General (general.c:568)
> ==98001==    by 0x9D9AB44: ISGeneralSetIndices (general.c:554)
> ==98001==    by 0x9D9ADC4: ISCreateGeneral (general.c:529)
> ==98001==    by 0x9B431E6: VecCreateGhostWithArray (pbvec.c:692)
> ==98001==    by 0x9B43A33: VecCreateGhost (pbvec.c:748)
> 
> 
> ==98001== 88 (24 direct, 64 indirect) bytes in 1 blocks are definitely lost in loss record 34 of 54
> ==98001==    at 0x4C29F73: malloc (vg_replace_malloc.c:307)
> ==98001==    by 0xDAE1D5E: hwloc_bitmap_alloc (bitmap.c:74)
> ==98001==    by 0xDA7523F: get_socket_bound_info (mv2_arch_detect.c:898)
> ==98001==    by 0xD93C87A: create_intra_sock_comm (create_2level_comm.c:593)
> ==98001==    by 0xD93BEBA: create_2level_comm (create_2level_comm.c:1762)
> ==98001==    by 0xD59A894: mv2_increment_shmem_coll_counter (ch3_shmem_coll.c:2183)
> ==98001==    by 0xD4E4CBB: PMPI_Allreduce