I've been a long-time user of valgrind, but I am having serious problems with 
recent versions.  I don't know whether the cause is the switch from ia32 to 
x86_64, the switch from lam 7 to openmpi, or the switch from valgrind 2 to 
valgrind 3, but here is my problem:

I have a small sample program that has definite, obvious errors in it. 
When I build it on my ia32 system with lam 7, valgrind 2.2 correctly 
reports the errors, whether it is compiled with MPI or without MPI.

When I build the program WITHOUT MPI at all, on my x86_64 system with Intel 
Fortran and GCC, valgrind 3.4 also correctly reports errors.

However, if I build the program with openmpi (or hp-mpi) on my x86_64 
system, valgrind 3.4 reports no errors at all.  This is a serious problem 
for me: in the past few weeks I've run into a few bugs that cause crashes 
with openmpi/x86_64, but I can't debug them with valgrind.  When I move the 
code to the old IA32 system, run valgrind there, and find and fix the 
errors, the resulting code runs fine on the openmpi/x86_64 system.  This 
tells me that the errors detected on the IA32 system are in fact causing 
the problems on the x86_64 system (usually resulting in an error in free() 
or malloc() because the memory structures are corrupt).  But valgrind isn't 
seeing them at all on x86_64....

I'm also getting TONS of "uninitialized value" errors with HP-MPI that I 
never got before (some of which I have carefully tracked down and found to 
be bogus; the values are clearly initialized), but that is another issue....

Any suggestions or info would be greatly appreciated.

(Note: for compiling/testing on an ia32 machine, change integer*8 to 
integer*4 and "long long" to "long", since pointers are 4 bytes long there.)

Here are my sample programs/makefile:

*******************     makefile   ****************************

all: tst_mpi tst_nompi
tst_nompi: tst_nompi.o mtst_nompi.o
         ifort -g -o tst_nompi tst_nompi.o mtst_nompi.o
tst_nompi.o: tst.F
         ifort -g -c tst.F -o tst_nompi.o
mtst_nompi.o: mtst.c
         cc -g -c mtst.c -o mtst_nompi.o
tst_mpi: tst_mpi.o mtst_mpi.o
         mpif77 -g -DUSEMPI -o tst_mpi tst_mpi.o mtst_mpi.o
tst_mpi.o: tst.F
         mpif77 -g -DUSEMPI -c tst.F -o tst_mpi.o
mtst_mpi.o: mtst.c
         mpicc -g -c mtst.c -o mtst_mpi.o


********************* tst.F ***************************************
       program test
       common /mem/ mp
       integer ia(1)
       pointer (mp,ia)
       integer*8 lmalloc
       external lmalloc
c
#ifdef USEMPI
       include 'mpif.h'
       call mpi_init(ierr)
       call mpi_comm_rank(mpi_comm_world,iam,ierr)
       call mpi_comm_size(mpi_comm_world,numproc,ierr)
#endif
c
       nwords = 100000000
       mp=lmalloc(nwords)
       call subtest(ia,nwords)
#ifdef USEMPI
       call mpi_finalize(ierr)
#endif
       end
c
       subroutine subtest(iw,nwords)
       integer iw(*)
c
       iw(10)=10
c write to word BEFORE BEGINNING of allocated memory
       iw(0)=10
       iw(nwords)=10
c write to word AFTER END of allocated memory
       iw(nwords+1)=10
c
       return
       end

********************** mtst.c *******************************
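#include <stdio.h>
#include <stdlib.h>
/* Called from Fortran as lmalloc(nwords); returns the address as an integer */
long long lmalloc_(int *nwords)
{
   printf("Sanity check: 8 = %d\n",(int) sizeof(long long));
   return (long long) (void *) calloc(*nwords,sizeof(int));
}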
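
To make explicit which of the four writes in subtest I expect valgrind to flag, here is a rough sketch of the index arithmetic. It assumes 1-based Fortran indexing and 4-byte default integers (matching sizeof(int) in the calloc call above); the helper names fortran_byte_offset and classify_write are made up for illustration and are not part of the test program.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of Fortran element iw(idx) from the start of the buffer,
   assuming 1-based indexing and elem_size-byte elements (an assumption). */
long fortran_byte_offset(long idx, size_t elem_size)
{
    assert(elem_size > 0);
    return (idx - 1) * (long) elem_size;
}

/* Classify a write to iw(idx) for a buffer of nwords elements:
   -1 = before the start of the allocation,
    0 = inside the allocation,
    1 = past the end of the allocation. */
int classify_write(long idx, long nwords)
{
    if (idx < 1)
        return -1;
    if (idx > nwords)
        return 1;
    return 0;
}
```

With nwords = 100000000, iw(10) and iw(nwords) classify as inside the block, iw(0) lands 4 bytes before it, and iw(nwords+1) lands at byte offset 400000000, which is the first byte past the end. Those last two are the invalid writes I expect to see reported.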








_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
