When you configured MPICH, did you use the flag --enable-g=meminit so that it would not generate its own valgrind errors?
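In case it helps, here is a rough sketch of such a configure run. The install prefix is a guess based on the paths in your valgrind trace, and the extra dbg option (which builds MPICH itself with debugging symbols) is my own assumption rather than something you need for the question above; adjust both as needed.

```shell
#!/bin/sh
# Sketch only: reconfigure MPICH so that it pre-initialises its own internal
# structures, which should quiet valgrind reports that originate inside libmpi.

# Assumption: install prefix guessed from the paths in the valgrind trace.
MPICH_PREFIX="$HOME/mpich-install"

# meminit is the option asked about above; dbg is an optional extra
# (an assumption here) that builds MPICH with debugging symbols.
CONFIGURE_ARGS="--prefix=$MPICH_PREFIX --enable-g=dbg,meminit"

# This must be run from the top of the MPICH source tree.
if [ -x ./configure ]; then
    ./configure $CONFIGURE_ARGS && make && make install
else
    echo "no ./configure here; run this from the MPICH source directory" >&2
fi
```

After reinstalling, rebuild SuperLU_DIST against the new MPICH and rerun valgrind; whatever reports remain should then be more likely to come from your own code.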
  Barry

> On Mar 2, 2016, at 4:11 PM, Xiaoye S. Li wrote:
>
> I checked that file; it also shows "not stripped". Not sure why it doesn't work.
> Now I am using a static library build to run valgrind, which works fine.
>
> Now on to the valgrind output: I see quite a few warnings that are unnecessary.
> For example,
>
> ==13292== Conditional jump or move depends on uninitialised value(s)
> ==13292==    at 0x5452D86: MPIC_Waitall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> ==13292==    by 0x53AB23F: MPIR_Alltoall_intra (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> ==13292==    by 0x53ABFD4: MPIR_Alltoall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> ==13292==    by 0x53AC08D: MPIR_Alltoall_impl (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> ==13292==    by 0x53AC896: PMPI_Alltoall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> ==13292==    by 0x418161: dReDistribute_A (pddistribute.c:108)
> ==13292==    by 0x41950B: pddistribute (pddistribute.c:450)
> ==13292==    by 0x407D6A: pdgssvx (pdgssvx.c:1080)
> ==13292==    by 0x4027E5: main (pddrive.c:171)
>
> The line at pddistribute.c:108 is this:
>
>     MPI_Alltoall( nnzToSend, 1, mpi_int_t, nnzToRecv, 1, mpi_int_t, grid->comm);
>
> For both buffers, nnzToSend and nnzToRecv, I use the "calloc" version to allocate
> memory, i.e., malloc first, followed by zeroing the buffer.
> mpi_int_t is defined as MPI_INT.
> Why does it complain about uninitialized values?
>
> Sherry
>
> On Tue, Mar 1, 2016 at 8:27 PM, Satish Balay wrote:
> sometimes 'cmake' does a 'strip' during install of the library [which
> can delete the debug symbols]. We had to track this down for one of
> the cmake packages. I don't remember what we did to work around it.
>
> >>
> petsc@es:/scratch/petsc/petsc/arch-linux-pkgs-valgrind/lib$ file libsuperlu_dist.so.5.0.0
> libsuperlu_dist.so.5.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
> <<
>
> looks like superlu_dist installed by petsc is not stripped. Perhaps
> you can try:
>
> file /home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/lib/libsuperlu_dist.so.5.0.0
>
> Satish
>
> On Tue, 1 Mar 2016, Barry Smith wrote:
>
> >
> > Satish will know far better than me. I only use Linux when my Mac OS
> > fails me :-(
> >
> > > On Mar 1, 2016, at 8:41 PM, Xiaoye S. Li wrote:
> > >
> > > This is on Linux (Ubuntu). I did compile with -g, but only the example
> > > driver (which is outside the library) shows line numbers; the routines in
> > > the *.so do not show line numbers, see this:
> > >
> > > ==31609== Conditional jump or move depends on uninitialised value(s)
> > > ==31609==    at 0x51EED86: MPIC_Waitall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > ==31609==    by 0x5148F99: MPIR_Alltoallv_intra (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > ==31609==    by 0x5149916: MPIR_Alltoallv (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > ==31609==    by 0x51499F6: MPIR_Alltoallv_impl (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > ==31609==    by 0x514A0C7: PMPI_Alltoallv (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > ==31609==    by 0x4E7C56A: pdCompRow_loc_to_CompCol_global (in /home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/lib/libsuperlu_dist.so.5.0.0)
> > > ==31609==    by 0x4E71761: pdgssvx (in /home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/lib/libsuperlu_dist.so.5.0.0)
> > > ==31609==    by 0x401400: main (pddrive.c:171)
> > >
> > > Here are the flags:
> > >
> > > C_FLAGS = -DUSE_VENDOR_BLAS -DAdd_ -DDEBUGlevel=0 -DPRNTlevel=0 -std=c99 -g -fPIC -I/home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/SRC -I/home/xiaoye/lib/parmetis-4.0.3/include -I/home/xiaoye/lib/parmetis-4.0.3/metis/include -I/home/xiaoye/mpich-install/include
> > >
> > > Any idea?
> > > Sherry
> > >
> > > On Tue, Mar 1, 2016 at 6:00 PM, Barry Smith wrote:
> > >
> > > > On Mar 1, 2016, at 7:41 PM, Xiaoye S. Li wrote:
> > > >
> > > > Barry,
> > > >
> > > > I am cleaning up the valgrind errors. I did a build with the shared library
> > > > option, but valgrind doesn't give me the source code line number. Is
> > > > it true that I need to build as a static library?
> > >
> > > No, but if you are running on an Apple you may need the additional
> > > valgrind option --dsymutil=yes (yes, it is totally goofy that it doesn't
> > > just do this automatically). Also, of course, the source code needs to be
> > > compiled with the -g option.
> > >
> > > Barry
> > >
> > > > Sherry
