> On Jul 22, 2015, at 11:33 AM, Florian Lindner <[email protected]> wrote:
> 
> Am Dienstag, 21. Juli 2015, 18:32:02 schrieben Sie:
>> 
>>  Try putting a breakpoint in KSPSetUp_GMRES and check the values of all the 
>> pointers immediately after the
>> ierr = 
>> PetscCalloc5(hh,&gmres->hh_origin,hes,&gmres->hes_origin,rs,&gmres->rs_origin,cc,&gmres->cc_origin,cc,&gmres->ss_origin);CHKERRQ(ierr);
>> 
>> then put your second breakpoint in KSPReset_GMRES and check all the 
>> pointers again just before the 
>>> ierr = 
>>> PetscFree5(gmres->hh_origin,gmres->hes_origin,gmres->rs_origin,gmres->cc_origin,gmres->ss_origin);CHKERRQ(ierr);
>> 
>> Of course the pointers should be the same; are they?
> 
> Num     Type           Disp Enb Address            What
> 3       breakpoint     keep y   0x00007ffff6ff6cb5 in KSPReset_GMRES at 
> /home/florian/software/petsc/src/ksp/ksp/impls/gmres/gmres.c:258
> 4       breakpoint     keep y   0x00007ffff6ff49a1 in KSPSetUp_GMRES at 
> /home/florian/software/petsc/src/ksp/ksp/impls/gmres/gmres.c:54
> 
> The pointer gmres is the same. Just one function call later, at mal.c:72, it 
> crashes. The pointer that is freed is gmres->hh_origin, which also hasn't 
> changed.
> 
> What confuses me is that:
> 
> Breakpoint 3, KSPReset_GMRES (ksp=0xe904b0) at 
> /home/florian/software/petsc/src/ksp/ksp/impls/gmres/gmres.c:258
> 258     ierr = 
> PetscFree5(gmres->hh_origin,gmres->hes_origin,gmres->rs_origin,gmres->cc_origin,gmres->ss_origin);CHKERRQ(ierr);
> (gdb) print gmres->hh_origin
> $24 = (PetscScalar *) 0xf10250
> 
> hh_origin is the first argument; I step into PetscFree5:
> 
> (gdb) s
> PetscFreeAlign (ptr=0xf15aa0, line=258, func=0x7ffff753c4c8 <__func__.20306> 
> "KSPReset_GMRES", file=0x7ffff753b8b0 
> "/home/florian/software/petsc/src/ksp/ksp/impls/gmres/gmres.c") at 
> /home/florian/software/petsc/src/sys/memory/mal.c:54
> 54      if (!ptr) return 0;
> (gdb) print ptr
> $25 = (void *) 0xf15aa0
> 
> Why has the value changed? I expect gmres->hh_origin == ptr.
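[Editor's note: PetscFree5 is a macro, so the pointer that reaches PetscFreeAlign first is not necessarily hh_origin. Printing all five members at the breakpoint would show whether 0xf15aa0 is one of them; illustrative gdb commands, addresses are from the session above:]

```
(gdb) print gmres->hh_origin
(gdb) print gmres->hes_origin
(gdb) print gmres->rs_origin
(gdb) print gmres->cc_origin
(gdb) print gmres->ss_origin
```

If 0xf15aa0 matches none of the five, something really has overwritten memory before the free.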

   Definitely a problem here.

> Could this be a sign of stack corruption at some earlier stage?

   Could be, but valgrind usually finds such things. 
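[Editor's note: a hardware watchpoint can also catch a stray write directly. A hypothetical gdb session, set just after the PetscCalloc5 call in KSPSetUp_GMRES; the watchpoint number is illustrative:]

```
(gdb) watch -l gmres->hh_origin
Hardware watchpoint 5: -location gmres->hh_origin
(gdb) continue
```

With `watch -l` (short for `-location`), gdb evaluates the address once and watches that memory, so it stops at the exact statement that modifies the stored pointer, whether in PETSc or elsewhere.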

   You can do the following: edit $PETSC_DIR/$PETSC_ARCH/include/petscconf.h 
and add the lines

#if !defined(PETSC_USE_MALLOC_COALESCED)
#define PETSC_USE_MALLOC_COALESCED
#endif

then run

   make gnumake

in the $PETSC_DIR directory. Then relink your program and try running it.

  Barry




> 
> I also tried to build petsc with clang in order to use its memory sanitizer, 
> but without success. Same for precice.
> 
> 
>> If so you can run in the debugger and check the values at some points 
>> between the creation and destruction to see where they get changed to bad 
>> values. Normally, of course, valgrind would be very helpful in finding 
>> exactly when things go bad.
> 
> What do you mean by changing to bad values? They are the same after 
> PetscCalloc5 and before PetscFree5.
> 
> Best Regards,
> Florian
> 
>>  I'm afraid I'm going to have to give up on building this stuff myself; too 
>> painful.
> 
> Sorry about that. 
> 
>> 
>>  Barry
>> 
>> 
>>> On Jul 21, 2015, at 8:54 AM, Florian Lindner <[email protected]> wrote:
>>> 
>>> Hey Barry,
>>> 
>>> were you able to reproduce the error?
>>> 
>>> I tried to set a breakpoint at
>>> 
>>> PetscErrorCode KSPReset_GMRES(KSP ksp)
>>> {
>>> KSP_GMRES      *gmres = (KSP_GMRES*)ksp->data;
>>> PetscErrorCode ierr;
>>> PetscInt       i;
>>> 
>>> PetscFunctionBegin;
>>> /* Free the Hessenberg matrices */
>>> ierr = 
>>> PetscFree5(gmres->hh_origin,gmres->hes_origin,gmres->rs_origin,gmres->cc_origin,gmres->ss_origin);CHKERRQ(ierr);
>>> 
>>> in gmres.c, the last line produces the error...
>>> 
>>> Interestingly, this piece of code is traversed only once, so at least the 
>>> code that frees the pointer is not being called twice...
>>> 
>>> Best Regards,
>>> Florian
>>> 
>>> 
>>> Am Donnerstag, 16. Juli 2015, 17:59:15 schrieben Sie:
>>>> 
>>>> I am on a Mac; no idea what the 'lo' localhost loopback device should be
>>>> 
>>>> $  ./pmpi B
>>>> MPI rank 0 of 1
>>>> [PRECICE] Run in coupling mode
>>>> Mesh = [[1.19999999999999995559e-01, 0.00000000000000000000e+00], 
>>>> [3.20000000000000006661e-01, 0.00000000000000000000e+00], 
>>>> [5.20000000000000017764e-01, 0.00000000000000000000e+00], 
>>>> [7.20000000000000084377e-01, 0.00000000000000000000e+00], 
>>>> [9.20000000000000039968e-01, 0.00000000000000000000e+00]]
>>>> Setting up master communication to coupling partner/s 
>>>> (0)  [PRECICE] ERROR: Network "lo" not found for socket connection!
>>>> Run finished at Thu Jul 16 17:50:39 2015
>>>> Global runtime = 41ms / 0s
>>>> 
>>>> Event                Count    Total[ms]     Max[ms]     Min[ms]     
>>>> Avg[ms]   T%
>>>> --------------------------------------------------------------------------------
>>>> Properties from all Events, accumulated
>>>> ---------------------------------------
>>>> 
>>>> Abort trap: 6
>>>> ~/Src/prempi (master *=) arch-debug
>>>> $  ./pmpi B
>>>> MPI rank 0 of 1
>>>> [PRECICE] Run in coupling mode
>>>> Mesh = [[1.19999999999999995559e-01, 0.00000000000000000000e+00], 
>>>> [3.20000000000000006661e-01, 0.00000000000000000000e+00], 
>>>> [5.20000000000000017764e-01, 0.00000000000000000000e+00], 
>>>> [7.20000000000000084377e-01, 0.00000000000000000000e+00], 
>>>> [9.20000000000000039968e-01, 0.00000000000000000000e+00]]
>>>> Setting up master communication to coupling partner/s 
>>>> (0)  [PRECICE] ERROR: Network "localhost" not found for socket connection!
>>>> Run finished at Thu Jul 16 17:50:52 2015
>>>> Global runtime = 40ms / 0s
>>>> 
>>>> Event                Count    Total[ms]     Max[ms]     Min[ms]     
>>>> Avg[ms]   T%
>>>> --------------------------------------------------------------------------------
>>>> Properties from all Events, accumulated
>>>> ---------------------------------------
>>>> 
>>>> Abort trap: 6
>>>> ~/Src/prempi (master *=) arch-debug
>>>> $ hostname
>>>> Barrys-MacBook-Pro.local
>>>> ~/Src/prempi (master *=) arch-debug
>>>> $  ./pmpi B
>>>> MPI rank 0 of 1
>>>> [PRECICE] Run in coupling mode
>>>> Mesh = [[1.19999999999999995559e-01, 0.00000000000000000000e+00], 
>>>> [3.20000000000000006661e-01, 0.00000000000000000000e+00], 
>>>> [5.20000000000000017764e-01, 0.00000000000000000000e+00], 
>>>> [7.20000000000000084377e-01, 0.00000000000000000000e+00], 
>>>> [9.20000000000000039968e-01, 0.00000000000000000000e+00]]
>>>> Setting up master communication to coupling partner/s 
>>>> (0)  [PRECICE] ERROR: Network "Barrys-MacBook-Pro.local" not found for 
>>>> socket connection!
>>>> Run finished at Thu Jul 16 17:51:12 2015
>>>> Global runtime = 39ms / 0s
>>>> 
>>>> Event                Count    Total[ms]     Max[ms]     Min[ms]     
>>>> Avg[ms]   T%
>>>> --------------------------------------------------------------------------------
>>>> Properties from all Events, accumulated
>>>> ---------------------------------------
>>>> 
>>>> Abort trap: 6
>>>> ~/Src/prempi (master *=) arch-debug
>>>> $  ./pmpi B
>>>> MPI rank 0 of 1
>>>> [PRECICE] Run in coupling mode
>>>> Mesh = [[1.19999999999999995559e-01, 0.00000000000000000000e+00], 
>>>> [3.20000000000000006661e-01, 0.00000000000000000000e+00], 
>>>> [5.20000000000000017764e-01, 0.00000000000000000000e+00], 
>>>> [7.20000000000000084377e-01, 0.00000000000000000000e+00], 
>>>> [9.20000000000000039968e-01, 0.00000000000000000000e+00]]
>>>> Setting up master communication to coupling partner/s 
>>>> (0)  [PRECICE] ERROR: Network "10.0.1.2" not found for socket connection!
>>>> Run finished at Thu Jul 16 17:53:02 2015
>>>> Global runtime = 42ms / 0s
>>>> 
>>>> Event                Count    Total[ms]     Max[ms]     Min[ms]     
>>>> Avg[ms]   T%
>>>> --------------------------------------------------------------------------------
>>>> Properties from all Events, accumulated
>>>> ---------------------------------------
>>>> 
>>>> Abort trap: 6
>>>> ~/Src/prempi (master *=) arch-debug
>>>> 
>>>>> On Jul 15, 2015, at 1:53 AM, Florian Lindner <[email protected]> wrote:
>>>>> 
>>>>> Hey
>>>>> 
>>>>> Am Dienstag, 14. Juli 2015, 13:20:33 schrieben Sie:
>>>>>> 
>>>>>> How to install Eigen? I tried brew install eigen but it didn't help.
>>>>> 
>>>>> You may need to set the CPLUS_INCLUDE_PATH to something like 
>>>>> "/usr/include/eigen3"
>>>>> The easiest way, however, is probably to download Eigen from 
>>>>> http://bitbucket.org/eigen/eigen/get/3.2.5.tar.bz2 and move the Eigen 
>>>>> folder from that archive to precice/src. 
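[Editor's note: the environment-variable route mentioned above might look like this; /usr/include/eigen3 is the path suggested in the mail and may differ on other systems:]

```shell
# Point the compiler at the directory that contains the Eigen/ folder.
# Assumption: Eigen headers were installed under /usr/include/eigen3.
export CPLUS_INCLUDE_PATH=/usr/include/eigen3${CPLUS_INCLUDE_PATH:+:$CPLUS_INCLUDE_PATH}
```

g++ and clang++ both honor CPLUS_INCLUDE_PATH, so no build-script change is needed.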
>>>>> 
>>>>>> Also, what about the PRECICE_MPI_ stuff? It sure doesn't point to 
>>>>>> anything valid.
>>>>> 
>>>>> You probably don't need to set it if you use an mpic++ or mpicxx compiler 
>>>>> wrapper that takes care of that.
>>>>> 
>>>>> Thx,
>>>>> Florian
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Barry
>>>>>> 
>>>>>> $ MPI_CXX="clang++" scons -j 4 boost_inst=on python=off petsc=on mpi=on 
>>>>>> compiler=/Users/barrysmith/Src/petsc/arch-debug/bin/mpic++ build=debug
>>>>>> scons: Reading SConscript files ...
>>>>>> 
>>>>>> Build options ...
>>>>>> (default)  builddir                  = build      Directory holding 
>>>>>> build files. ( /path/to/builddir )
>>>>>> (default)  build                     = debug      Build type, either 
>>>>>> release or debug (release|debug)
>>>>>> (modified) compiler                  = 
>>>>>> /Users/barrysmith/Src/petsc/arch-debug/bin/mpic++   Compiler to use.
>>>>>> (modified) mpi                       = True       Enables MPI-based 
>>>>>> communication and running coupling tests. (yes|no)
>>>>>> (default)  sockets                   = True       Enables Socket-based 
>>>>>> communication. (yes|no)
>>>>>> (modified) boost_inst                = True       Enable if Boost is 
>>>>>> available compiled and installed. (yes|no)
>>>>>> (default)  spirit2                   = True       Used for parsing VRML 
>>>>>> file geometries and checkpointing. (yes|no)
>>>>>> (modified) petsc                     = True       Enable use of the 
>>>>>> Petsc linear algebra library. (yes|no)
>>>>>> (modified) python                    = False      Used for Python 
>>>>>> scripted solver actions. (yes|no)
>>>>>> (default)  gprof                     = False      Used in detailed 
>>>>>> performance analysis. (yes|no)
>>>>>> ... done
>>>>>> 
>>>>>> Environment variables used for this build ...
>>>>>> (have to be defined by the user to configure build)
>>>>>> (modified) PETSC_DIR                 = /Users/barrysmith/Src/PETSc
>>>>>> (modified) PETSC_ARCH                = arch-debug
>>>>>> (default)  PRECICE_BOOST_SYSTEM_LIB  = boost_system
>>>>>> (default)  PRECICE_BOOST_FILESYSTEM_LIB = boost_filesystem
>>>>>> (default)  PRECICE_MPI_LIB_PATH      = /usr/lib/
>>>>>> (default)  PRECICE_MPI_LIB           = mpich   
>>>>>> (default)  PRECICE_MPI_INC_PATH      = /usr/include/mpich2
>>>>>> (default)  PRECICE_PTHREAD_LIB_PATH  = /usr/lib
>>>>>> (default)  PRECICE_PTHREAD_LIB       = pthread 
>>>>>> (default)  PRECICE_PTHREAD_INC_PATH  = /usr/include
>>>>>> ... done
>>>>>> 
>>>>>> Configuring build variables ...
>>>>>> Checking whether the C++ compiler works... yes
>>>>>> Checking for C library petsc... yes
>>>>>> Checking for C++ header file Eigen/Dense... no
>>>>>> ERROR: Header 'Eigen/Dense' (needed for Eigen) not found or does not 
>>>>>> compile!
>>>>>> $ brew install eigen
>>>>>> ==> Downloading 
>>>>>> https://downloads.sf.net/project/machomebrew/Bottles/eigen-3.2.3.yosemite.bottle.tar.gz
>>>>>> ######################################################################## 
>>>>>> 100.0%
>>>>>> ==> Pouring eigen-3.2.3.yosemite.bottle.tar.gz
>>>>>> 🍺  /usr/local/Cellar/eigen/3.2.3: 361 files, 4.1M
>>>>>> ~/Src/precice (develop=) arch-debug
>>>>>> $ MPI_CXX="clang++" scons -j 4 boost_inst=on python=off petsc=on mpi=on 
>>>>>> compiler=/Users/barrysmith/Src/petsc/arch-debug/bin/mpic++ build=debug
>>>>>> scons: Reading SConscript files ...
>>>>>> 
>>>>>> Build options ...
>>>>>> (default)  builddir                  = build      Directory holding 
>>>>>> build files. ( /path/to/builddir )
>>>>>> (default)  build                     = debug      Build type, either 
>>>>>> release or debug (release|debug)
>>>>>> (modified) compiler                  = 
>>>>>> /Users/barrysmith/Src/petsc/arch-debug/bin/mpic++   Compiler to use.
>>>>>> (modified) mpi                       = True       Enables MPI-based 
>>>>>> communication and running coupling tests. (yes|no)
>>>>>> (default)  sockets                   = True       Enables Socket-based 
>>>>>> communication. (yes|no)
>>>>>> (modified) boost_inst                = True       Enable if Boost is 
>>>>>> available compiled and installed. (yes|no)
>>>>>> (default)  spirit2                   = True       Used for parsing VRML 
>>>>>> file geometries and checkpointing. (yes|no)
>>>>>> (modified) petsc                     = True       Enable use of the 
>>>>>> Petsc linear algebra library. (yes|no)
>>>>>> (modified) python                    = False      Used for Python 
>>>>>> scripted solver actions. (yes|no)
>>>>>> (default)  gprof                     = False      Used in detailed 
>>>>>> performance analysis. (yes|no)
>>>>>> ... done
>>>>>> 
>>>>>> Environment variables used for this build ...
>>>>>> (have to be defined by the user to configure build)
>>>>>> (modified) PETSC_DIR                 = /Users/barrysmith/Src/PETSc
>>>>>> (modified) PETSC_ARCH                = arch-debug
>>>>>> (default)  PRECICE_BOOST_SYSTEM_LIB  = boost_system
>>>>>> (default)  PRECICE_BOOST_FILESYSTEM_LIB = boost_filesystem
>>>>>> (default)  PRECICE_MPI_LIB_PATH      = /usr/lib/
>>>>>> (default)  PRECICE_MPI_LIB           = mpich   
>>>>>> (default)  PRECICE_MPI_INC_PATH      = /usr/include/mpich2
>>>>>> (default)  PRECICE_PTHREAD_LIB_PATH  = /usr/lib
>>>>>> (default)  PRECICE_PTHREAD_LIB       = pthread 
>>>>>> (default)  PRECICE_PTHREAD_INC_PATH  = /usr/include
>>>>>> ... done
>>>>>> 
>>>>>> Configuring build variables ...
>>>>>> Checking whether the C++ compiler works... yes
>>>>>> Checking for C library petsc... yes
>>>>>> Checking for C++ header file Eigen/Dense... no
>>>>>> ERROR: Header 'Eigen/Dense' (needed for Eigen) not found or does not 
>>>>>> compile!
>>>>>> ~/Src/precice (develop=) arch-debug
>>>>>> 
>>>>>> 
>>>>>>> On Jul 14, 2015, at 2:14 AM, Florian Lindner <[email protected]> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> Am Montag, 13. Juli 2015, 12:26:21 schrieb Barry Smith:
>>>>>>>> 
>>>>>>>> Run under valgrind first, see if it gives any more details about the 
>>>>>>>> memory issue 
>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>>>>> 
>>>>>>> I tried running it like that:
>>>>>>> 
>>>>>>> valgrind --tool=memcheck ./pmpi A -malloc off 
>>>>>>> 
>>>>>>> (pmpi is my application, no mpirun)
>>>>>>> 
>>>>>>> but it reported no errors at all.
>>>>>>> 
>>>>>>>> Can you send the code that produces this problem?
>>>>>>> 
>>>>>>> I was not able to isolate the problem, but you can of course have a look 
>>>>>>> at our application:
>>>>>>> 
>>>>>>> git clone [email protected]:precice/precice.git
>>>>>>> MPI_CXX="clang++" scons -j 4 boost_inst=on python=off petsc=on mpi=on 
>>>>>>> compiler=mpic++ build=debug
>>>>>>> 
>>>>>>> The test client:
>>>>>>> git clone [email protected]:floli/prempi.git
>>>>>>> you need to adapt line 5 in SConstruct: preciceRoot
>>>>>>> scons
>>>>>>> 
>>>>>>> In one terminal run ./pmpi A, in another run ./pmpi B
>>>>>>> 
>>>>>>> Thanks for taking a look! Mail me if any problems occur with the build.
>>>>>>> 
>>>>>>> Florian
>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jul 13, 2015, at 10:56 AM, Florian Lindner <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Hello,
>>>>>>>>> 
>>>>>>>>> our petsc application suffers from a memory error (double free or 
>>>>>>>>> corruption).
>>>>>>>>> 
>>>>>>>>> The situation is like this:
>>>>>>>>> 
>>>>>>>>> A KSP is a private member of a C++ class. In its constructor I call 
>>>>>>>>> KSPCreate. In between it may happen that I call KSPReset. In the class's 
>>>>>>>>> destructor I call KSPDestroy. That's where the memory error appears:
>>>>>>>>> 
>>>>>>>>> gdb backtrace:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> #4  0x00007ffff490b8db in _int_free () from /usr/lib/libc.so.6
>>>>>>>>> #5  0x00007ffff6188c9c in PetscFreeAlign (ptr=0xfcd990, line=258, 
>>>>>>>>> func=0x7ffff753c4c8 <__func__.20304> "KSPReset_GMRES", 
>>>>>>>>> file=0x7ffff753b8b0 
>>>>>>>>> "/home/florian/software/petsc/src/ksp/ksp/impls/gmres/gmres.c")
>>>>>>>>> at /home/florian/software/petsc/src/sys/memory/mal.c:72
>>>>>>>>> #6  0x00007ffff6ff6cdc in KSPReset_GMRES (ksp=0xf48470) at 
>>>>>>>>> /home/florian/software/petsc/src/ksp/ksp/impls/gmres/gmres.c:258
>>>>>>>>> #7  0x00007ffff70ad804 in KSPReset (ksp=0xf48470) at 
>>>>>>>>> /home/florian/software/petsc/src/ksp/ksp/interface/itfunc.c:885
>>>>>>>>> #8  0x00007ffff70ae2e8 in KSPDestroy (ksp=0xeb89d8) at 
>>>>>>>>> /home/florian/software/petsc/src/ksp/ksp/interface/itfunc.c:933
>>>>>>>>> 
>>>>>>>>> #9  0x0000000000599b24 in 
>>>>>>>>> precice::mapping::PetRadialBasisFctMapping<precice::mapping::Gaussian>::~PetRadialBasisFctMapping
>>>>>>>>>  (this=0xeb8960) at src/mapping/PetRadialBasisFctMapping.hpp:148
>>>>>>>>> #10 0x0000000000599bc9 in 
>>>>>>>>> precice::mapping::PetRadialBasisFctMapping<precice::mapping::Gaussian>::~PetRadialBasisFctMapping
>>>>>>>>>  (this=0xeb8960) at src/mapping/PetRadialBasisFctMapping.hpp:146
>>>>>>>>> 
>>>>>>>>> Complete backtrace at http://pastebin.com/ASjibeNF
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Could it be a problem if objects set by KSPSetOperators are destroyed 
>>>>>>>>> afterwards? I don't think so, since KSPReset is called before.
>>>>>>>>> 
>>>>>>>>> I've wrapped a class (just a bunch of helper functions, no 
>>>>>>>>> encapsulating wrapper) around Mat and Vec objects. Nothing fancy: the 
>>>>>>>>> ctor calls MatCreate, the dtor MatDestroy. You can have a look at 
>>>>>>>>> https://github.com/precice/precice/blob/develop/src/mapping/petnum.cpp
>>>>>>>>>  / .hpp.
>>>>>>>>> 
>>>>>>>>> These objects are also members of the same class as the KSP, so their 
>>>>>>>>> dtors are called after KSPDestroy.
>>>>>>>>> 
>>>>>>>>> What could cause the memory corruption here?
>>>>>>>>> 
>>>>>>>>> Thanks a lot,
>>>>>>>>> Florian
