Yes, there are some PETSc objects or arrays that you are not freeing, so they
are printed at the end of the run. For small runs this is harmless, but if new
objects or memory are allocated at each iteration and not suitably freed, it
will eventually add up.
Run with -malloc_view on a small problem (say, 2 iterations); it will print
everything allocated and might be helpful.
Perhaps you are calling ISColoringGetIS() and not calling
ISColoringRestoreIS()?
It is also possible it is a leak in PETSc, but that is unlikely since we
test for them.
Are you using Fortran?
Barry
> On Aug 12, 2020, at 1:29 PM, Mark Lohry <[email protected]> wrote:
>
> Thanks Matt and Barry. At Matt's suggestion I ran a smaller representative
> case with valgrind and didn't see anything alarming (apart from a small leak
> in an older boost version I was using:
> https://github.com/boostorg/serialization/issues/104 although I don't
> think this was causing the issue).
>
> -malloc_debug dumps quite a lot; this is supposed to be empty, right? Output
> pasted below. It looks like the same sequence of calls is repeated 8 times,
> which is how many nonlinear solves occurred in this particular run. Thoughts?
>
>
>
> [ 0]1408 bytes PetscSplitReductionCreate() line 63 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
> [ 0]80 bytes PetscSplitReductionCreate() line 57 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/vec/utils/comb.c
> [ 0]16 bytes PetscCommBuildTwoSided_Allreduce() line 169 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/mpits.c
> [ 0]16 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]272 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]880 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]960 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]976 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]1024 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]1040 bytes ISGeneralSetIndices_General() line 578 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]16 bytes PetscLayoutSetUp() line 269 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]80 bytes PetscLayoutCreate() line 55 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/utils/pmap.c
> [ 0]16 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 255 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]32 bytes PetscStrallocpy() line 187 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/utils/str.c
> [ 0]32 bytes PetscFunctionListAdd_Private() line 222 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/dll/reg.c
> [ 0]16 bytes ISCreate_General() line 647 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/impls/general/general.c
> [ 0]896 bytes ISCreate() line 37 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/interface/isreg.c
> [ 0]64 bytes ISColoringGetIS() line 266 in
> /home/mlohry/dev/cmake-build/external/petsc/src/vec/is/is/utils/iscoloring.c
> [ 0]32 bytes PetscCommDuplicate() line 129 in
> /home/mlohry/dev/cmake-build/external/petsc/src/sys/objects/tagm.c
>
>
>
> On Wed, Aug 12, 2020 at 1:46 PM Barry Smith <[email protected]> wrote:
>
> Mark.
>
> When valgrind is not feasible (like on many centrally controlled batch
> systems) you can run PETSc with an extra flag to do some memory error checks
> -malloc_debug
>
> this
>
> 1) fills all malloced memory with NaN so if the code is using uninitialized
> memory it may be detected and
> 2) checks the beginning and end of each allocated memory region for
> out-of-bounds writes at each malloc and free.
>
> it will slow the code down a little bit but generally not a huge amount.
>
> It is nowhere near as good as valgrind or other memory corruption tools, but
> it has the advantage that you can run it anywhere on any size job.
>
>
> Barry
>
>
>
>
>
>> On Aug 12, 2020, at 7:46 AM, Matthew Knepley <[email protected]> wrote:
>>
>> On Wed, Aug 12, 2020 at 7:53 AM Mark Lohry <[email protected]> wrote:
>> I'm getting seemingly random failures of late:
>> Caught signal number 7 BUS: Bus Error, possibly illegal memory access
>>
>> The first thing I would do is run valgrind on as wide an array of tests as
>> you can. This will find problems on things that run completely fine.
>>
>> Thanks,
>>
>> Matt
>>
>> Symptoms:
>> 1) Seems to only happen (so far) on larger cases, 400-2000 cores
>> 2) It doesn't happen right away -- this was running happily for several
>> hours over several hundred time steps with no indication of bad health in
>> the numerics
>> 3) At least the total memory consumption seems to be within bounds, though
>> I'm not sure about individual processes. e.g. slurm here reported Memory
>> Efficiency: 75.23% of 1.76 TB (180.00 GB/node)
>> 4) running the same setup twice it fails at different points
>>
>> Any suggestions on what to look for? This is a bit painful to work on as I
>> can only reproduce it on large runs and then it's seemingly random.
>>
>>
>> Thanks,
>> Mark
>>
>>
>> --
>> What most experimenters take for granted before they begin their experiments
>> is infinitely more interesting than any results to which their experiments
>> lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>