Are you sure bfort got rebuilt with the change? I can't get this valgrind output.
>>>>>>.
asterix:/home/balay/tmp/spetsc/src/mat/utils>valgrind --tool=memcheck bfort -dir `pwd`/ftn-auto -mnative -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE -f90mod_skip_header matio.c convert.c gcreate.c freespace.c getcolv.c ptap.c compressedrow.c matstash.c multequal.c axpy.c freespace.h zerodiag.c matstashspace.c
==2289== Memcheck, a memory error detector
==2289== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==2289== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==2289== Command: bfort -dir /home/balay/tmp/spetsc/src/mat/utils/ftn-auto -mnative -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE -f90mod_skip_header matio.c convert.c gcreate.c freespace.c getcolv.c ptap.c compressedrow.c matstash.c multequal.c axpy.c freespace.h zerodiag.c matstashspace.c
==2289==
==2289==
==2289== HEAP SUMMARY:
==2289==     in use at exit: 5,093 bytes in 50 blocks
==2289==   total heap usage: 71 allocs, 21 frees, 17,021 bytes allocated
==2289==
==2289== LEAK SUMMARY:
==2289==    definitely lost: 5,093 bytes in 50 blocks
==2289==    indirectly lost: 0 bytes in 0 blocks
==2289==      possibly lost: 0 bytes in 0 blocks
==2289==    still reachable: 0 bytes in 0 blocks
==2289==         suppressed: 0 bytes in 0 blocks
==2289== Rerun with --leak-check=full to see details of leaked memory
==2289==
==2289== For counts of detected and suppressed errors, rerun with: -v
==2289== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 6)
asterix:/home/balay/tmp/spetsc/src/mat/utils>
<<<<<<<<<<

I'll send in the bug report to Bill later today - as it gets toggled by the following change:

http://petsc.cs.iit.edu/petsc/externalpackages/sowing-1.1.11/rev/e591c037e500

But regarding your machine - something changed on it a couple of days back, and that's what is triggering this issue. You haven't mentioned how I can reproduce it [the 'stack smashing detected' message etc.].

Satish

On Fri, 25 Dec 2009, Matthew Knepley wrote:

> Here is the valgrind for your +100 fix:
>
> knepley at khan:/PETSc3/petsc/petsc-dev/src/mat/utils$ valgrind /PETSc3/petsc/petsc-dev/linux-gnu-cxx-debug/bin/bfort -dir /PETSc3/petsc/petsc-dev/src/mat/utils/ftn-auto -mnative -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE -f90mod_skip_header matio.c convert.c gcreate.c freespace.c getcolv.c ptap.c compressedrow.c matstash.c multequal.c axpy.c freespace.h zerodiag.c matstashspace.c
> ==20868== Memcheck, a memory error detector.
> ==20868== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
> ==20868== Using LibVEX rev 1804, a library for dynamic binary translation.
> ==20868== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
> ==20868== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
> ==20868== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
> ==20868== For more details, rerun with: -v
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C2BB: PrintBody (bfort.c:1362)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C293: PrintBody (bfort.c:1363)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C5E7: PrintBody (bfort.c:1384)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C396: PrintBody (bfort.c:1385)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C3CE: PrintBody (bfort.c:1387)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C3E9: PrintBody (bfort.c:1387)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Conditional jump or move depends on uninitialised value(s)
> ==20868==    at 0x804C589: PrintBody (bfort.c:1406)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Use of uninitialised value of size 4
> ==20868==    at 0x40239D8: strlen (mc_replace_strmem.c:242)
> ==20868==    by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)
> ==20868==    by 0x804C5BE: PrintBody (bfort.c:1408)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== Invalid read of size 1
> ==20868==    at 0x40239D8: strlen (mc_replace_strmem.c:242)
> ==20868==    by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)
> ==20868==    by 0x804C5BE: PrintBody (bfort.c:1408)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==20868==
> ==20868== Process terminating with default action of signal 11 (SIGSEGV)
> ==20868==  Access not within mapped region at address 0x0
> ==20868==    at 0x40239D8: strlen (mc_replace_strmem.c:242)
> ==20868==    by 0x4198127: fputs (in /lib/tls/i686/cmov/libc-2.7.so)
> ==20868==    by 0x804C5BE: PrintBody (bfort.c:1408)
> ==20868==    by 0x804A622: OutputRoutine (bfort.c:575)
> ==20868==    by 0x804A0B2: main (bfort.c:475)
> ==20868==
> ==20868== ERROR SUMMARY: 72 errors from 9 contexts (suppressed: 17 from 1)
> ==20868== malloc/free: in use at exit: 1,056 bytes in 3 blocks.
> ==20868== malloc/free: 6 allocs, 3 frees, 2,112 bytes allocated.
> ==20868== For counts of detected errors, rerun with: -v
> ==20868== searching for pointers to 3 not-freed blocks.
> ==20868== checked 243,864 bytes.
> ==20868==
> ==20868== LEAK SUMMARY:
> ==20868==    definitely lost: 0 bytes in 0 blocks.
> ==20868==      possibly lost: 0 bytes in 0 blocks.
> ==20868==    still reachable: 1,056 bytes in 3 blocks.
> ==20868==         suppressed: 0 bytes in 0 blocks.
> ==20868== Rerun with --leak-check=full to see details of leaked memory.
> Segmentation fault
>
> The problem is that argument lists are just not parsed correctly for
> gcreate.c. You can send that to Bill.
>
> Matt
>
> On Fri, Dec 25, 2009 at 10:55 AM, Satish Balay <balay at mcs.anl.gov> wrote:
>
> > One more thing. If I remove this patch from sowing - the valgrind log
> > is clean.
> >
> >
> > http://petsc.cs.iit.edu/petsc/externalpackages/sowing-1.1.11/rev/e591c037e500
> >
> > Perhaps you can find the bug in this change.
> > If not - I'll send a bug report to Bill.
> >
> > Satish
> >
> > On Fri, 25 Dec 2009, Satish Balay wrote:
> >
> > > Can you send me the valgrind.log - with the patch applied to the
> > > unmodified sowing-1.1.11-a.tar.gz?
> > >
> > > Also the command you are using to generate this log?
> > >
> > >
> > > I've used the following:
> > > valgrind --tool=memcheck -q --log-file=valgrind.log bfort -dir `pwd`/ftn-auto -ansi -nomsgs -noprofile -anyname -mapptr -mpi -mpi2 -ferr -ptrprefix Petsc -ptr64 PETSC_USE_POINTER_CONVERSION -fcaps PETSC_HAVE_FORTRAN_CAPS -fuscore PETSC_HAVE_FORTRAN_UNDERSCORE matrix.c
> > >
> > > Satish
> > >
> > > On Fri, 25 Dec 2009, Matthew Knepley wrote:
> > >
> > > > Valgrind is not clean for me with the change.
> > > >
> > > > Matt
> > > >
> > > > On Fri, Dec 25, 2009 at 10:28 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > >
> > > > > Well - normally the first step with detecting the bugs is to report
> > > > > them to the author - and ask for a fix..
> > > > >
> > > > > Satish
> > > > >
> > > > > On Fri, 25 Dec 2009, Matthew Knepley wrote:
> > > > >
> > > > > > I can try, but I still think replacement is the only real alternative.
> > > > > > This is not able to be debugged, or you would not recommend sticking in
> > > > > > random numbers in malloc() and I would be able to see where an SEGV
> > > > > > occurs with gdb.
> > > > > >
> > > > > > Matt
> > > > > >
> > > > > > On Fri, Dec 25, 2009 at 10:16 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > > > >
> > > > > > > BTW: What linux are you using? ubuntu version? i686 or x86_64? etc...
> > > > > > >
> > > > > > > also try:
> > > > > > >
> > > > > > > arg->name = (char *)MALLOC( strlen(p) + 100 );
> > > > > > >
> > > > > > > satish
> > > > > > >
> > > > > > >
> > > > > > > On Fri, 25 Dec 2009, Satish Balay wrote:
> > > > > > >
> > > > > > > > Did my suggested change not work for you?
> > > > > > > >
> > > > > > > > Satish
> > > > > > > >
> > > > > > > > On Thu, 24 Dec 2009, Matthew Knepley wrote:
> > > > > > > >
> > > > > > > > > I spent a bunch of time on this today. This shit is hopelessly broken.
> > > > > > > > > It sucks completely.
> > > > > > > > > I cannot get it to run, nor see why it is causing stack overruns and
> > > > > > > > > SEGVs. If anyone does not think it is hopeless, speak up now. This is
> > > > > > > > > a complete fucking embarrassment.
> > > > > > > > >
> > > > > > > > > Matt
> > > > > > > > >
> > > > > > > > > On Mon, Dec 21, 2009 at 4:42 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > This does not make any sense to me because it would be a heap violation,
> > > > > > > > > > not a stack smash.
> > > > > > > > > >
> > > > > > > > > > Matt
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Dec 21, 2009 at 4:30 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > > > > > > > >
> > > > > > > > > >> [I don't know the correct fix for this - but ] The following change is
> > > > > > > > > >> getting rid of valgrind messages for me. Maybe you can use this, build
> > > > > > > > > >> sowing separately - and continue..
> > > > > > > > > >>
> > > > > > > > > >> Satish
> > > > > > > > > >>
> > > > > > > > > >> ----------
> > > > > > > > > >>
> > > > > > > > > >> diff -r dbe25084c0e4 src/bfort/bfort.c
> > > > > > > > > >> --- a/src/bfort/bfort.c Mon Dec 15 22:20:58 2008 -0600
> > > > > > > > > >> +++ b/src/bfort/bfort.c Mon Dec 21 16:29:09 2009 -0600
> > > > > > > > > >> @@ -2157,7 +2157,7 @@
> > > > > > > > > >>
> > > > > > > > > >>      /* Current token is name */
> > > > > > > > > >>      arg->has_star = (nstar > 0);
> > > > > > > > > >> -    arg->name = (char *)MALLOC( strlen(p) + 1 );
> > > > > > > > > >> +    arg->name = (char *)MALLOC( strlen(p) + 10 );
> > > > > > > > > >>      strcpy( arg->name, p );
> > > > > > > > > >>
> > > > > > > > > >>      /* We can't output the name just yet, because if it is
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >>
> > > > > > > > > >> On Mon, 21 Dec 2009, Matthew Knepley wrote:
> > > > > > > > > >>
> > > > > > > > > >> > The problem appears to be in OutputRoutine() in bfort.c, but that code is
> > > > > > > > > >> > impossible to debug. I can't see where something is getting overwritten,
> > > > > > > > > >> > and it looks like the check only happens when the routine returns.
> > > > > > > > > >> > bfort is such crap.
> > > > > > > > > >> >
> > > > > > > > > >> > Matt
> > > > > > > > > >> >
> > > > > > > > > >> > On Mon, Dec 21, 2009 at 3:25 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > On Mon, Dec 21, 2009 at 3:21 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> > > > > > > > > >> > >
> > > > > > > > > >> > >> On Mon, 21 Dec 2009, Lisandro Dalcín wrote:
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> > On Mon, Dec 21, 2009 at 5:37 PM, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > It says there is a stack smash and no other info. This is completely
> > > > > > > > > >> > >> > > fucking my development right now.
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > Any chance bfort was built with -fstack-protector flag? This failure
> > > > > > > > > >> > >> > could could be signaling an actual old bug in bfort... I would
> > > > > > > > > >> > >> > re-build bfort with debug and re-run under valgrind...
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> That must be it.
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> I just ran my build [which is without -fstack-protector] - and
> > > > > > > > > >> > >> valgrind does flag a bunch of things with bfort.
> > > > > > > > > >> > >>
> > > > > > > > > >> > >
> > > > > > > > > >> > > 1) That flag is nowhere in my build.
> > > > > > > > > >> > >
> > > > > > > > > >> > > 2) Something changed
> > > > > > > > > >> > >
> > > > > > > > > >> > > Matt
> > > > > > > > > >> > >
> > > > > > > > > >> > >
> > > > > > > > > >> > >> I normally install sowing separately and have it in my PATH - so that
> > > > > > > > > >> > >> it doesn't have to be rebuilt each time I build petsc.
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> I guess we should sync up [our patches] with latest sowing and make
> > > > > > > > > >> > >> sure its valgrind clean aswell.
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> Satish
> > > > > > > > > >> > >
> > > > > > > > > >> > > --
> > > > > > > > > >> > > What most experimenters take for granted before they begin their
> > > > > > > > > >> > > experiments is infinitely more interesting than any results to which their
> > > > > > > > > >> > > experiments lead.
> > > > > > > > > >> > > -- Norbert Wiener
