I'd like to add my concern to the thread at
http://www.open-mpi.org/community/lists/users/2009/03/8661.php that the
latest 1.3 series produces far too much memory-checker noise.
We use Valgrind extensively during debugging, and although I'm running the
latest snapshot (1.3.2a1r20901) with the latest Valgrind, and configured
--with-valgrind to suppress the PLPA-related errors, I still get far too
many reports from the following simple test:
#include <iostream>
#include <mpi.h>

int
main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    int myRank;
    /* MPI_Comm_rank returns MPI_SUCCESS (0) on success */
    if (!MPI_Comm_rank(MPI_COMM_WORLD, &myRank)) {
        std::cout << "Hello World from " << myRank << std::endl;
    }
    MPI_Finalize();
    return 0;
}
Running this via "mpirun -np 2 valgrind hello_mpi" gives:
==16829== Memcheck, a memory error detector.
==16829== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==16829== Using LibVEX rev 1884, a library for dynamic binary translation.
==16829== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==16829== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
==16829== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==16829== For more details, rerun with: -v
==16829==
==16830== Memcheck, a memory error detector.
==16830== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==16830== Using LibVEX rev 1884, a library for dynamic binary translation.
==16830== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==16830== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
==16830== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==16830== For more details, rerun with: -v
==16830==
==16830== Syscall param writev(vector[...]) points to uninitialised byte(s)
==16830== at 0x34DE2C9F0C: writev (in /lib64/libc-2.6.so)
==16830== by 0x5CD213: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==16830== by 0x5C5B6A: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==16830== by 0x5CB958: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==16830== by 0x5DB136: orte_rml_oob_send (rml_oob_send.c:137)
==16830== by 0x5DBBBB: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==16830== by 0x5AFF7E: allgather (grpcomm_bad_module.c:369)
==16830== by 0x5B0805: modex (grpcomm_bad_module.c:497)
==16830== by 0x453518: ompi_mpi_init (ompi_mpi_init.c:626)
==16830== by 0x476CF8: PMPI_Init (pinit.c:80)
==16830== by 0x423DE0: main (helloMPI.cpp:8)
==16830== Address 0x4e9e383 is 107 bytes inside a block of size 128 alloc'd
==16830== at 0x4A05FBB: malloc (vg_replace_malloc.c:207)
==16830== by 0x61684E: opal_dss_buffer_extend (dss_internal_functions.c:68)
==16830== by 0x5F36CE: opal_dss_pack_byte (dss_pack.c:198)
==16830== by 0x616974: opal_dss_store_data_type (dss_internal_functions.c:117)
==16830== by 0x5F31FF: opal_dss_pack (dss_pack.c:37)
==16830== by 0x5AFD65: allgather (grpcomm_bad_module.c:351)
==16830== by 0x5B0805: modex (grpcomm_bad_module.c:497)
==16830== by 0x453518: ompi_mpi_init (ompi_mpi_init.c:626)
==16830== by 0x476CF8: PMPI_Init (pinit.c:80)
==16830== by 0x423DE0: main (helloMPI.cpp:8)
==16829== Syscall param writev(vector[...]) points to uninitialised byte(s)
==16829== at 0x34DE2C9F0C: writev (in /lib64/libc-2.6.so)
==16829== by 0x5CD213: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==16829== by 0x5C5B6A: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==16829== by 0x5CB958: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==16829== by 0x5DB136: orte_rml_oob_send (rml_oob_send.c:137)
==16829== by 0x5DBBBB: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==16829== by 0x5AFF7E: allgather (grpcomm_bad_module.c:369)
==16829== by 0x5B0805: modex (grpcomm_bad_module.c:497)
==16829== by 0x453518: ompi_mpi_init (ompi_mpi_init.c:626)
==16829== by 0x476CF8: PMPI_Init (pinit.c:80)
==16829== by 0x423DE0: main (helloMPI.cpp:8)
==16829== Address 0x4e9e63b is 107 bytes inside a block of size 256 alloc'd
==16829== at 0x4A06092: realloc (vg_replace_malloc.c:429)
==16829== by 0x61681C: opal_dss_buffer_extend (dss_internal_functions.c:63)
==16829== by 0x6181D2: opal_dss_copy_payload (dss_load_unload.c:164)
==16829== by 0x5AFEC9: allgather (grpcomm_bad_module.c:363)
==16829== by 0x5B0805: modex (grpcomm_bad_module.c:497)
==16829== by 0x453518: ompi_mpi_init (ompi_mpi_init.c:626)
==16829== by 0x476CF8: PMPI_Init (pinit.c:80)
==16829== by 0x423DE0: main (helloMPI.cpp:8)
==16829==
==16829== Conditional jump or move depends on uninitialised value(s)
==16829== at 0x4A5F4C: mca_mpool_sm_alloc (mpool_sm_module.c:79)
==16829== by 0x4F3585: mpool_calloc (btl_sm.c:108)
==16829== by 0x4F3E3B: sm_btl_first_time_init (btl_sm.c:307)
==16829== by 0x4F436F: mca_btl_sm_add_procs (btl_sm.c:484)
==16829== by 0x54ECFB: mca_bml_r2_add_procs (bml_r2.c:206)
==16829== by 0x4C2DC4: mca_pml_ob1_add_procs (pml_ob1.c:308)
==16829== by 0x45362A: ompi_mpi_init (ompi_mpi_init.c:667)
==16829== by 0x476CF8: PMPI_Init (pinit.c:80)
==16829== by 0x423DE0: main (helloMPI.cpp:8)
==16829==
==16829== Conditional jump or move depends on uninitialised value(s)
==16829== at 0x4A5F4C: mca_mpool_sm_alloc (mpool_sm_module.c:79)
==16829== by 0x4D81E2: ompi_free_list_grow (ompi_free_list.c:198)
==16829== by 0x4D8015: ompi_free_list_init_ex_new (ompi_free_list.c:163)
==16829== by 0x4F40C3: ompi_free_list_init_new (ompi_free_list.h:169)
==16829== by 0x4F3F57: sm_btl_first_time_init (btl_sm.c:333)
==16829== by 0x4F436F: mca_btl_sm_add_procs (btl_sm.c:484)
==16829== by 0x54ECFB: mca_bml_r2_add_procs (bml_r2.c:206)
==16829== by 0x4C2DC4: mca_pml_ob1_add_procs (pml_ob1.c:308)
==16829== by 0x45362A: ompi_mpi_init (ompi_mpi_init.c:667)
==16829== by 0x476CF8: PMPI_Init (pinit.c:80)
==16829== by 0x423DE0: main (helloMPI.cpp:8)
==16830==
==16830== Conditional jump or move depends on uninitialised value(s)
==16830== at 0x4A5F4C: mca_mpool_sm_alloc (mpool_sm_module.c:79)
==16830== by 0x4F3585: mpool_calloc (btl_sm.c:108)
==16830== by 0x4F3E3B: sm_btl_first_time_init (btl_sm.c:307)
==16830== by 0x4F436F: mca_btl_sm_add_procs (btl_sm.c:484)
==16830== by 0x54ECFB: mca_bml_r2_add_procs (bml_r2.c:206)
==16830== by 0x4C2DC4: mca_pml_ob1_add_procs (pml_ob1.c:308)
==16830== by 0x45362A: ompi_mpi_init (ompi_mpi_init.c:667)
==16830== by 0x476CF8: PMPI_Init (pinit.c:80)
==16830== by 0x423DE0: main (helloMPI.cpp:8)
==16829==
==16829== Conditional jump or move depends on uninitialised value(s)
==16829== at 0x4A5F4C: mca_mpool_sm_alloc (mpool_sm_module.c:79)
==16829== by 0x4F4619: sm_fifo_init (btl_sm.h:213)
==16829== by 0x4F4459: mca_btl_sm_add_procs (btl_sm.c:510)
==16829== by 0x54ECFB: mca_bml_r2_add_procs (bml_r2.c:206)
==16829== by 0x4C2DC4: mca_pml_ob1_add_procs (pml_ob1.c:308)
==16829== by 0x45362A: ompi_mpi_init (ompi_mpi_init.c:667)
==16829== by 0x476CF8: PMPI_Init (pinit.c:80)
==16829== by 0x423DE0: main (helloMPI.cpp:8)
==16830==
==16830== Conditional jump or move depends on uninitialised value(s)
==16830== at 0x4A5F4C: mca_mpool_sm_alloc (mpool_sm_module.c:79)
==16830== by 0x4D81E2: ompi_free_list_grow (ompi_free_list.c:198)
==16830== by 0x4D8015: ompi_free_list_init_ex_new (ompi_free_list.c:163)
==16830== by 0x4F40C3: ompi_free_list_init_new (ompi_free_list.h:169)
==16830== by 0x4F3F57: sm_btl_first_time_init (btl_sm.c:333)
==16830== by 0x4F436F: mca_btl_sm_add_procs (btl_sm.c:484)
==16830== by 0x54ECFB: mca_bml_r2_add_procs (bml_r2.c:206)
==16830== by 0x4C2DC4: mca_pml_ob1_add_procs (pml_ob1.c:308)
==16830== by 0x45362A: ompi_mpi_init (ompi_mpi_init.c:667)
==16830== by 0x476CF8: PMPI_Init (pinit.c:80)
==16830== by 0x423DE0: main (helloMPI.cpp:8)
==16830==
==16830== Conditional jump or move depends on uninitialised value(s)
==16830== at 0x4A5F4C: mca_mpool_sm_alloc (mpool_sm_module.c:79)
==16830== by 0x4F4619: sm_fifo_init (btl_sm.h:213)
==16830== by 0x4F4459: mca_btl_sm_add_procs (btl_sm.c:510)
==16830== by 0x54ECFB: mca_bml_r2_add_procs (bml_r2.c:206)
==16830== by 0x4C2DC4: mca_pml_ob1_add_procs (pml_ob1.c:308)
==16830== by 0x45362A: ompi_mpi_init (ompi_mpi_init.c:667)
==16830== by 0x476CF8: PMPI_Init (pinit.c:80)
==16830== by 0x423DE0: main (helloMPI.cpp:8)
Hello World from 1
Hello World from 0
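For now I'm hiding the worst of these with a Valgrind suppression file along
these lines (a sketch only — the suppression names are my own, and the frame
names are taken from the traces above; Valgrind 3.4 doesn't support the "..."
frame wildcard, so the fun: lines must match the actual top frames):

```
{
   openmpi-oob-writev-uninit
   Memcheck:Param
   writev(vector[...])
   fun:writev
   fun:mca_oob_tcp_msg_send_handler
}
{
   openmpi-sm-mpool-cond
   Memcheck:Cond
   fun:mca_mpool_sm_alloc
}
```

Run with "mpirun -np 2 valgrind --suppressions=openmpi.supp hello_mpi". But
this is obviously papering over the problem rather than fixing it, and it
risks hiding genuine errors in our own code that happen to pass through the
same frames.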