Hi Allen,
Sorry for the confusion. Your application doesn't use plain non-blocking
communications, so the receive buffer is still valid after you call
MPI_Recv_init; that's why the first two printfs didn't complain. MPI_Wait,
however, still checks the buffer and marks it invalid after packing the
message, because blocking and non-blocking communications share some
common code and memchecker currently can't distinguish between them. So
for your case I suggest disabling memchecker; meanwhile I'll look for a
better way of handling memchecker for both cases.
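In case it helps: memchecker is a configure-time option, so the quickest
way to turn it off is to rebuild Open MPI without it. A rough sketch,
assuming you built 1.3.3 from source (the prefix is only an example):
----------------------------------------------
./configure --prefix=$HOME/openmpi-1.3.3-nomemcheck ...
    # same options as your current build, but without --enable-memchecker
make all install
----------------------------------------------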
Thanks,
Shiqing
Allen Barnett wrote:
Hi Shiqing:
That is very clever to invalidate the buffer memory until the comm
completes! However, I guess I'm still confused by my results. Lines 30
and 31 identified by valgrind are the lines after the Wait, and if I
comment out the prints before the Wait, I still get the valgrind errors
on the "After wait" prints.
If I add prints after the Request_free calls, then I no longer receive
the valgrind errors when accessing "buffer_in" from that point on. So,
it appears that the buffer is marked invalid until the request is freed.
Perhaps I don't understand the sequence of events in MPI. I thought the
buffer was OK to use after the Wait, and that requests could be safely
recycled, e.g. in a loop like the one sketched below.
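For reference, the recycling pattern I have in mind is the usual
persistent-request loop; a minimal sketch (not my actual code; it assumes
rank 0 of a two-process run with a matching send loop on rank 1, and
<stdio.h> and "mpi.h" included):
----------------------------------------------
char        buf[100];
MPI_Request req;
MPI_Status  status;
int         i;

/* Create the persistent receive request once... */
MPI_Recv_init( buf, 100, MPI_CHAR, 1, 123, MPI_COMM_WORLD, &req );

for ( i = 0; i < 10; i++ ) {
  MPI_Start( &req );           /* ...start it on each iteration... */
  MPI_Wait( &req, &status );   /* ...buffer should be usable from here on */
  printf( "iteration %d: %d\n", i, buf[0] );
}

/* ...then free the request once at the end. */
MPI_Request_free( &req );
----------------------------------------------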
Alternatively, maybe valgrind is pointing to the wrong lines; however, the
addresses it reports as invalid are exactly those in the buffer that are
being accessed in the post-Wait prints. Here is a snippet of a more
instrumented example program, with line numbers.
----------------------------------------------
25 MPI_Recv_init( buffer_in, 100, MPI_CHAR, 1, 123, MPI_COMM_WORLD,
&req_in );
26 printf( "Before start: %p: %d\n", &buffer_in[0], buffer_in[0] );
27 printf( "Before start: %p: %d\n", &buffer_in[1], buffer_in[1] );
28 MPI_Start( &req_in );
29 printf( "Before wait: %p: %d\n", &buffer_in[2], buffer_in[2] );
30 printf( "Before wait: %p: %d\n", &buffer_in[3], buffer_in[3] );
31 MPI_Wait( &req_in, &status );
32 printf( "After wait: %p: %d\n", &buffer_in[4], buffer_in[4] );
33 printf( "After wait: %p: %d\n", &buffer_in[5], buffer_in[5] );
34 MPI_Request_free( &req_in );
35 printf( "After free: %p: %d\n", &buffer_in[6], buffer_in[6] );
36 printf( "After free: %p: %d\n", &buffer_in[7], buffer_in[7] );
--------------------------------------------------
And the valgrind output:
Before start: 0x7ff0003c0: 1
Before start: 0x7ff0003c1: 1
Before wait: 0x7ff0003c2: 1
Before wait: 0x7ff0003c3: 1
==17395==
==17395== Invalid read of size 1
==17395== at 0x400CB7: main (waittest.c:32)
==17395== Address 0x7ff0003c4 is on thread 1's stack
After wait: 0x7ff0003c4: 2
==17395==
==17395== Invalid read of size 1
==17395== at 0x400CDB: main (waittest.c:33)
==17395== Address 0x7ff0003c5 is on thread 1's stack
After wait: 0x7ff0003c5: 2
After free: 0x7ff0003c6: 2
After free: 0x7ff0003c7: 2
Here valgrind is complaining about the prints on lines 32 and 33, and the
memory addresses are consistent with buffer_in[4] and buffer_in[5]. So I'm
still puzzled.
Thanks,
Allen
On Wed, 2009-08-12 at 10:31 +0200, Shiqing Fan wrote:
Hi Allen,
The invalid reads come from lines 30 and 31 of your code; I guess they are
the two printfs before MPI_Wait.
In Open MPI, when memchecker is enabled, OMPI internally marks the receive
buffer as invalid immediately after the receive starts, as part of its MPI
semantics checking. In this case it is simply warning you that you are
accessing the receive buffer before the receive has finished, which is not
allowed by the MPI standard.
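To illustrate the mechanism: such marking can be done with Valgrind's
memcheck client requests. This is only a rough sketch; the buffer name and
count are taken from your example, and the exact calls OMPI uses
internally may differ:
----------------------------------------------
#include <valgrind/memcheck.h>

/* When the receive starts: tell memcheck that the user buffer must not
   be touched until the receive completes. */
VALGRIND_MAKE_MEM_NOACCESS( buffer_in, 100 );

/* ... the receive progresses internally ... */

/* When the receive completes (e.g. in MPI_Wait): the buffer holds valid,
   defined data again. */
VALGRIND_MAKE_MEM_DEFINED( buffer_in, 100 );
----------------------------------------------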
For a non-blocking receive, the communication only completes after
MPI_Wait is called. After that point the user buffer is declared valid
again; that's why the printfs after MPI_Wait don't cause any warnings
from Valgrind. Hope this helps. :-)
Regards,
Shiqing
Allen Barnett wrote:
Hi:
I'm trying to use the memchecker/valgrind capability of Open MPI 1.3.3 to
help debug my MPI application. I noticed a rather odd thing: after waiting
on a receive request, valgrind declares my receive buffer to be invalid
memory. Is this just a fluke of valgrind, or is OMPI doing something
internally?
This is on a 64-bit RHEL 5 system using GCC 4.3.2 and Valgrind 3.4.1.
Here is an example:
----------------------------------------------------------
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
  int rank, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if ( size != 2 ) {
    if ( rank == 0 )
      printf("Please run with 2 processes.\n");
    MPI_Finalize();
    return 1;
  }

  if (rank == 0) {
    char buffer_in[100];
    MPI_Request req_in;
    MPI_Status status;

    memset( buffer_in, 1, sizeof(buffer_in) );
    MPI_Recv_init( buffer_in, 100, MPI_CHAR, 1, 123, MPI_COMM_WORLD,
                   &req_in );
    MPI_Start( &req_in );
    printf( "Before wait: %p: %d\n", buffer_in, buffer_in[3] );
    printf( "Before wait: %p: %d\n", buffer_in, buffer_in[4] );
    MPI_Wait( &req_in, &status );
    printf( "After wait: %p: %d\n", buffer_in, buffer_in[3] );
    printf( "After wait: %p: %d\n", buffer_in, buffer_in[4] );
    MPI_Request_free( &req_in );
  }
  else {
    char buffer_out[100];

    memset( buffer_out, 2, sizeof(buffer_out) );
    MPI_Send( buffer_out, 100, MPI_CHAR, 0, 123, MPI_COMM_WORLD );
  }

  MPI_Finalize();
  return 0;
}
----------------------------------------------------------
Doing "mpirun -np 2 -mca btl ^sm valgrind ./a.out" yields:
Before wait: 0x7ff0003b0: 1
Before wait: 0x7ff0003b0: 1
==15487==
==15487== Invalid read of size 1
==15487== at 0x400C6B: main (waittest.c:30)
==15487== Address 0x7ff0003b3 is on thread 1's stack
After wait: 0x7ff0003b0: 2
==15487==
==15487== Invalid read of size 1
==15487== at 0x400C8B: main (waittest.c:31)
==15487== Address 0x7ff0003b4 is on thread 1's stack
After wait: 0x7ff0003b0: 2
Also, if I run this program with the shared memory BTL active, valgrind
reports several "conditional jump or move depends on uninitialized
value" warnings in the SM BTL, and about 24k lost bytes at the end (mostly
from allocations in MPI_Init).
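For reference, if those reports turn out to be benign, this is how I'd
collect them into a valgrind suppressions file and feed it back in on
later runs (standard valgrind options; the file name is just an example):
----------------------------------------------
# let valgrind print a ready-made suppression block after each report
mpirun -np 2 valgrind --gen-suppressions=all ./a.out
# paste those blocks into a file, e.g. ompi-sm.supp, then:
mpirun -np 2 valgrind --suppressions=ompi-sm.supp ./a.out
----------------------------------------------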
Thanks,
Allen
--
--------------------------------------------------------------
Shiqing Fan http://www.hlrs.de/people/fan
High Performance Computing Tel.: +49 711 685 87234
Center Stuttgart (HLRS) Fax.: +49 711 685 65832
Address: Allmandring 30 email: f...@hlrs.de
70569 Stuttgart