On Wed, Nov 11, 2015 at 6:14 PM, Barry Smith <[email protected]> wrote:
>
> Hmm, you absolutely must be using an options file, otherwise it would never be doing all the stuff it is doing inside PetscOptionsInsertFile()!
>

Yes, here it is:

-log_summary
#-help
-options_left false
-damping 1.15
-fp_trap
#-on_error_attach_debugger /usr/local/bin/gdb
#-on_error_attach_debugger /Users/markadams/homebrew/bin/gdb
#-start_in_debugger /Users/markadams/homebrew/bin/gdb
-debugger_nodes 1
#-malloc_debug
#-malloc_dump

> Please send me the options file.
>
> Barry
>
> Most of the reports are due to vendor crimes, but it is possible that the PetscTokenFind() code has a memory issue, though I don't see how.
>
> Seriously, the NERSC people should be pressuring Cray to have valgrind-clean code; this is disgraceful.
>
>
> Conditional jump or move depends on uninitialised value(s)
> ==2948==    at 0x542EC7: PetscTokenFind (str.c:965)
> ==2948==    by 0x4F00B9: PetscOptionsInsertString (options.c:390)
> ==2948==    by 0x4F2F7B: PetscOptionsInsertFile (options.c:590)
> ==2948==    by 0x4F4ED7: PetscOptionsInsert (options.c:721)
> ==2948==    by 0x51A629: PetscInitialize (pinit.c:859)
> ==2948==    by 0x47B98D: main (in /global/u2/m/madams/hpsr/src/hpsr.arch-xc30-dbg-intel.ex)
> ==2948==
> ==2948== Use of uninitialised value of size 8
> ==2948==    at 0x542ECD: PetscTokenFind (str.c:965)
> ==2948==    by 0x4F00B9: PetscOptionsInsertString (options.c:390)
> ==2948==    by 0x4F2F7B: PetscOptionsInsertFile (options.c:590)
> ==2948==    by 0x4F4ED7: PetscOptionsInsert (options.c:721)
> ==2948==    by 0x51A629: PetscInitialize (pinit.c:859)
> ==2948==    by 0x47B98D: main (in /global/u2/m/madams/hpsr/src/hpsr.arch-xc30-dbg-intel.ex)
> ==2948==
> ==2948== Conditional jump or move depends on uninitialised value(s)
> ==2948==    at 0x542F04: PetscTokenFind (str.c:966)
> ==2948==    by 0x4F00B9: PetscOptionsInsertString (options.c:390)
> ==2948==    by 0x4F2F7B: PetscOptionsInsertFile (options.c:590)
> ==2948==    by 0x4F4ED7: PetscOptionsInsert (options.c:721)
> ==2948==    by 0x51A629: PetscInitialize (pinit.c:859)
> ==2948==    by 0x47B98D: main (in /global/u2/m/madams/hpsr/src/hpsr.arch-xc30-dbg-intel.ex)
> ==2948==
> ==2948== Use of uninitialised value of size 8
> ==2948==    at 0x542F0E: PetscTokenFind (str.c:967)
> ==2948==    by 0x4F00B9: PetscOptionsInsertString (options.c:390)
> ==2948==    by 0x4F2F7B: PetscOptionsInsertFile (options.c:590)
> ==2948==    by 0x4F4ED7: PetscOptionsInsert (options.c:721)
> ==2948==    by 0x51A629: PetscInitialize (pinit.c:859)
> ==2948==    by 0x47B98D: main (in /global/u2/m/madams/hpsr/src/hpsr.arch-xc30-dbg-intel.ex)
> ==2948==
> ==2948== Use of uninitialised value of size 8
> ==2948==    at 0x542F77: PetscTokenFind (str.c:973)
> ==2948==    by 0x4F00B9: PetscOptionsInsertString (options.c:390)
> ==2948==    by 0x4F2F7B: PetscOptionsInsertFile (options.c:590)
> ==2948==    by 0x4F4ED7: PetscOptionsInsert (options.c:721)
> ==2948==    by 0x51A629: PetscInitialize (pinit.c:859)
> ==2948==    by 0x47B98D: main (in /global/u2/m/madams/hpsr/src/hpsr.arch-xc30-dbg-intel.ex)
> ==2948==
> ==2948== Use of uninitialised value of size 8
> ==2948==    at 0x542F2D: PetscTokenFind (str.c:968)
> ==2948==    by 0x4F00B9: PetscOptionsInsertString (options.c:390)
> ==2948==    by 0x4F2F7B: PetscOptionsInsertFile (options.c:590)
> ==2948==    by 0x4F4ED7: PetscOptionsInsert (options.c:721)
> ==2948==    by 0x51A629: PetscInitialize (pinit.c:859)
> ==2948==    by 0x47B98D: main (in /global/u2/m/madams/hpsr/src/hpsr.arch-xc30-dbg-intel.ex)
>
> On Nov 11, 2015, at 3:38 PM, Mark Adams <[email protected]> wrote:
> >
> > These are the only PETSc params that I used:
> >
> > -log_summary
> > -options_left false
> > -fp_trap
> >
> > I last updated about 3 weeks ago and I am on a branch. I can redo this with a current master. My repo seems to have been polluted:
> >
> > 13:35 edison12 master> ~/petsc$ git status
> > # On branch master
> > # Your branch is ahead of 'origin/master' by 262 commits.
> > #
> > nothing to commit (working directory clean)
> >
> > I trust this is OK, but let me know if you would like me to clone a fresh repo.
> >
> > Mark
> >
> > On Wed, Nov 11, 2015 at 11:21 AM, Barry Smith <[email protected]> wrote:
> >
> > Thanks
> >
> > Do you use a petscrc file or any file with PETSc options in it for the run?
> >
> > Thanks, please send me the exact PETSc commit you are built off so I can see the line numbers in our source when things go bad.
> >
> > Barry
> >
> > > On Nov 11, 2015, at 7:36 AM, Mark Adams <[email protected]> wrote:
> > >
> > > On Tue, Nov 10, 2015 at 11:15 AM, Barry Smith <[email protected]> wrote:
> > >
> > > Please send me the full output. This is nuts and, once we understand it better, should be reported to NERSC as something to be fixed. When I pay $60 million in taxes to a computing center I expect something that works fine for free on my laptop to work there also.
> > >
> > > Barry
> > >
> > > > On Nov 10, 2015, at 7:51 AM, Mark Adams <[email protected]> wrote:
> > > >
> > > > I ran an 8 processor job on Edison of a small code for a short run (just a linear solve) and got 37 MB of output!
> > > >
> > > > Here is a 'Petsc' grep.
> > > >
> > > > Perhaps we should build an ignore file for things that we believe are false positives.
> > > >
> > > > On Tue, Nov 3, 2015 at 11:55 AM, Barry Smith <[email protected]> wrote:
> > > >
> > > > I am more optimistic about valgrind than Mark. I first try valgrind and if that fails to be helpful then use the debugger. valgrind has the advantage that it finds the FIRST place that something is wrong, while in the debugger it is kind of late at the crash.
> > > >
> > > > Valgrind should not be noisy; if it is, then the applications/libraries should be cleaned up so that they are valgrind clean, and then valgrind is useful.
> > > >
> > > > Barry
> > > >
> > > > > On Nov 3, 2015, at 7:47 AM, Mark Adams <[email protected]> wrote:
> > > > >
> > > > > BTW, I think that our advice for segv is to use a debugger. DDT or Totalview, and gdb if need be, will get you right to the source code and will get 90% of bugs diagnosed. Valgrind is noisy and cumbersome to use but can diagnose 90% of the other 10%.
> > > > >
> > > > > On Tue, Nov 3, 2015 at 7:32 AM, Denis Davydov <[email protected]> wrote:
> > > > > Hi Jose,
> > > > >
> > > > > > On 3 Nov 2015, at 12:20, Jose E. Roman <[email protected]> wrote:
> > > > > >
> > > > > > I am answering the SLEPc-related questions:
> > > > > > - Having different number of iterations when changing the number of processes is normal.
> > > > > The change in iterations I mentioned is for different preconditioners, but the same number of MPI processes.
> > > > >
> > > > > > - Yes, if you do not destroy the EPS solver, then the preconditioner would be reused.
> > > > > >
> > > > > > Regarding the segmentation fault, I have no clue. Not sure if this is related to GAMG or not. Maybe running under valgrind could provide more information.
> > > > > will try that.
> > > > >
> > > > > Denis.
> > > > >
> > > >
> > > > <petsc_val.gz>
> > >
> > > <outval.gz>
> >
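
A note on the 'Petsc' grep mentioned in the thread: a plain grep over a large valgrind log returns isolated lines, not whole reports. One possible way to keep complete reports is to split the log on the empty `==PID==` separator lines and keep only the reports whose stack contains a PETSc frame. The following is a minimal sketch, not something posted in the thread; the function names `split_reports` and `reports_matching` are made up for illustration, and it assumes every valgrind line carries the usual `==PID==` prefix (lines without it are treated as program output and skipped).

```python
import re
import sys

# Every line written by valgrind starts with "==<pid>== ".
PREFIX = re.compile(r"^==\d+==\s?")


def split_reports(lines):
    """Yield valgrind reports as lists of lines with the ==PID== prefix removed.

    Reports are assumed to be separated by an empty "==PID==" line.
    """
    report = []
    for raw in lines:
        if not PREFIX.match(raw):
            continue  # not valgrind output (assumed to be the program's own output)
        text = PREFIX.sub("", raw.rstrip("\n"))
        if text.strip():
            report.append(text)
        elif report:
            yield report
            report = []
    if report:
        yield report


def reports_matching(lines, pattern=r"Petsc"):
    """Return the reports with at least one line after the header matching pattern."""
    wanted = re.compile(pattern)
    return [r for r in split_reports(lines)
            if any(wanted.search(frame) for frame in r[1:])]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python filter_valgrind.py outval.txt
    with open(sys.argv[1]) as log:
        for rep in reports_matching(log):
            print("\n".join(rep))
            print()
```

Matching on `Petsc` only catches frames whose function name contains that string; a different pattern could be passed to look for, say, source file names instead.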
