Hello,

I have observed what seem to be false positives when running under Valgrind
with Open MPI built with --enable-memchecker
(at least with versions 1.10.4 and 2.0.1).

Attached is a simple test case (extracted from a larger code) that sends one
int to rank r+1 and receives one from rank r-1
(using MPI_PROC_NULL to handle ranks below 0 or beyond the communicator size).

Using:

~/opt/openmpi-2.0/bin/mpicc -DVARIANT_1 vg_mpi.c
~/opt/openmpi-2.0/bin/mpiexec -output-filename vg_log -n 2 valgrind ./a.out

I get the following Valgrind error for rank 1:

==8382== Invalid read of size 4
==8382==    at 0x400A00: main (in /home/yvan/test/a.out)
==8382==  Address 0xffefffe70 is on thread 1's stack
==8382==  in frame #0, created by main (???:)


Using:

~/opt/openmpi-2.0/bin/mpicc -DVARIANT_2 vg_mpi.c
~/opt/openmpi-2.0/bin/mpiexec -output-filename vg_log -n 2 valgrind ./a.out

I get the following Valgrind error for rank 1:

==8322== Invalid read of size 4
==8322==    at 0x400A6C: main (in /home/yvan/test/a.out)
==8322==  Address 0xcb6f9a0 is 0 bytes inside a block of size 4 alloc'd
==8322==    at 0x4C29BBE: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8322==    by 0x400998: main (in /home/yvan/test/a.out)

I get no error for the default variant (no -DVARIANT_...) with either
Open MPI 2.0.1 or 1.10.4, but I do get an error similar to variant 1 with
the parent code from which the example given below was extracted.
Running under Valgrind's gdb server, for the parent code (which corresponds
to variant 1), it even seems the value received on rank 1 is uninitialized,
and Valgrind then complains with the message shown above.

The code fails to work as intended when run under Valgrind with Open MPI
built with --enable-memchecker,
while it works fine with the same build when not run under Valgrind,
or when run under Valgrind with Open MPI built without memchecker.

I'm running under Arch Linux (whose packaged Open MPI 1.10.4 is built with
memchecker enabled,
rendering it unusable under Valgrind).

Did anybody else encounter this type of issue, or does my code contain an
obvious mistake that I am missing?
I initially thought of possible alignment issues, but saw nothing in the
standard that requires alignment,
and the "malloc"-based variant exhibits the same behavior, while I assume
64-bit alignment is the default for allocated arrays.
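(As a side note, a quick way to check the actual alignment of the buffers,
as a minimal standalone sketch separate from the attached test case,
could be something like:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int stackbuf[1];
  int *heapbuf = malloc(sizeof(int));

  /* print the addresses modulo 8 to check for 64-bit alignment */
  printf("stackbuf %% 8 = %d, heapbuf %% 8 = %d\n",
         (int)((uintptr_t)stackbuf % 8),
         (int)((uintptr_t)heapbuf % 8));

  free(heapbuf);
  return 0;
}

which, as far as I understand, should print 0 for the malloc'd buffer with
glibc's malloc on 64-bit systems.)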

Best regards,

  Yvan Fournier
#include <stdio.h>
#include <stdlib.h>

#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Status status;

  int l = 5, l_prev = 0;
  int rank_next = MPI_PROC_NULL, rank_prev = MPI_PROC_NULL;
  int rank_id = 0, n_ranks = 1, tag = 1;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank_id);
  MPI_Comm_size(MPI_COMM_WORLD, &n_ranks);
  /* determine neighbor ranks; out-of-range neighbors stay MPI_PROC_NULL */
  if (rank_id > 0)
    rank_prev = rank_id - 1;
  if (rank_id + 1 < n_ranks)
    rank_next = rank_id + 1;

#if defined(VARIANT_1)

  /* variant 1: exchange through stack-allocated buffers */
  int sendbuf[1] = {l};
  int recvbuf[1] = {0};

  /* even ranks send first, odd ranks receive first, to avoid deadlock */
  if (rank_id % 2 == 0) {
    MPI_Send(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
  }
  else {
    MPI_Recv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
    MPI_Send(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
  }

  l_prev = recvbuf[0];

#elif defined(VARIANT_2)

  /* variant 2: same exchange, but with heap-allocated buffers */
  int *sendbuf = malloc(sizeof(int));
  int *recvbuf = malloc(sizeof(int));

  sendbuf[0] = l;

  if (rank_id % 2 == 0) {
    MPI_Send(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
    MPI_Recv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
  }
  else {
    MPI_Recv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
    MPI_Send(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
  }

  l_prev = recvbuf[0];

  free(sendbuf);
  free(recvbuf);

#else

  /* default variant: exchange the scalar variables directly */
  if (rank_id % 2 == 0) {
    MPI_Send(&l, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
    MPI_Recv(&l_prev, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
  }
  else {
    MPI_Recv(&l_prev, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &status);
    MPI_Send(&l, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD);
  }

#endif

  printf("r%d, l_prev=%d\n", rank_id, l_prev);

  MPI_Finalize();
  exit(0);
}
