Hi,
We have been experiencing strange crashes in our application, which
operates mostly on memory allocated through MPI_Win_allocate and
MPI_Win_allocate_shared. We eventually realized that the application
crashes if it is compiled with -O3 or -Ofast and run with an odd number
of processes on our x86_64 machines.
After some debugging we found that the memory returned by
MPI_Win_allocate is sometimes aligned to only 4 bytes, which is fine for
32-bit data types but causes problems with 64-bit data types (such as
size_t) and automatic loop vectorization (tested with GCC 5.3.0). In
these cases the compiler assumes natural alignment, which should be at
least 8 bytes on x86_64 and is guaranteed by malloc and new.
Interestingly, the alignment of the returned memory depends on the
number of processes running. I am attaching a small reproducer that
prints the base address modulo 64 for memory returned by
MPI_Win_allocate, MPI_Win_allocate_shared, and MPI_Alloc_mem (the latter
seems to be fine).
Example for 2 processes (correct alignment):
[MPI_Alloc_mem] Alignment of baseptr=0x260ac60: 32
[MPI_Win_allocate] Alignment of baseptr=0x7f94d7aa30a8: 40
[MPI_Win_allocate_shared] Alignment of baseptr=0x7f94d7aa30a8: 40
Example for 3 processes (only 4-byte alignment, even with an 8-byte
displacement unit):
[MPI_Alloc_mem] Alignment of baseptr=0x115e970: 48
[MPI_Win_allocate] Alignment of baseptr=0x7f685f50f0c4: 4
[MPI_Win_allocate_shared] Alignment of baseptr=0x7fec618bc0c4: 4
Is this a known issue? I would expect users to assume that the basic
alignment guarantees made by malloc/new also hold for any function
providing malloc-like behavior, even more so because a hint about the
alignment requirements is passed to MPI_Win_allocate in the form of the
disp_unit argument.
I was able to reproduce this issue with both Open MPI 1.10.5 and 2.0.2.
I also tested with MPICH, which provides correct alignment.
Cheers,
Joseph
--
Dipl.-Inf. Joseph Schuchart
High Performance Computing Center Stuttgart (HLRS)
Nobelstr. 19
D-70569 Stuttgart
Tel.: +49(0)711-68565890
Fax: +49(0)711-6856832
E-Mail: schuch...@hlrs.de
#include <mpi.h>
#include <stdio.h>
#include <stdint.h>
/* Print the address (mod 64) of memory returned by MPI_Alloc_mem. */
static void
test_allocmem()
{
    char *baseptr;
    MPI_Info win_info;
    MPI_Info_create(&win_info);
    MPI_Info_set(win_info, "alloc_shared_noncontig", "true");

    MPI_Alloc_mem(sizeof(uint64_t), win_info, &baseptr);

    /* %p expects a void pointer; the remainder is cast to match %lu. */
    printf("[MPI_Alloc_mem] Alignment of baseptr=%p: %lu\n",
           (void *)baseptr, (unsigned long)((uintptr_t)baseptr % 64));

    MPI_Info_free(&win_info);
    MPI_Free_mem(baseptr);
}
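/* Print the address (mod 64) of memory returned by MPI_Win_allocate. */
static void
test_allocate()
{
    char *baseptr;
    MPI_Win win;
    MPI_Info win_info;
    MPI_Info_create(&win_info);
    MPI_Info_set(win_info, "alloc_shared_noncontig", "true");

    MPI_Win_allocate(
        sizeof(uint64_t),   /* window size in bytes */
        sizeof(uint64_t),   /* displacement unit: 8 bytes */
        win_info,
        MPI_COMM_WORLD,
        &baseptr,
        &win);

    /* %p expects a void pointer; the remainder is cast to match %lu. */
    printf("[MPI_Win_allocate] Alignment of baseptr=%p: %lu\n",
           (void *)baseptr, (unsigned long)((uintptr_t)baseptr % 64));

    MPI_Win_free(&win);
    MPI_Info_free(&win_info);
}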
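/* Print the address (mod 64) of memory returned by MPI_Win_allocate_shared. */
static void
test_allocate_shared()
{
    char *baseptr;
    MPI_Win win;
    MPI_Comm sharedmem_comm;

    /* Shared windows require a communicator of processes sharing memory. */
    MPI_Comm_split_type(
        MPI_COMM_WORLD,
        MPI_COMM_TYPE_SHARED,
        1,
        MPI_INFO_NULL,
        &sharedmem_comm);

    MPI_Info win_info;
    MPI_Info_create(&win_info);
    MPI_Info_set(win_info, "alloc_shared_noncontig", "true");

    MPI_Win_allocate_shared(
        sizeof(uint64_t),   /* window size in bytes */
        sizeof(uint64_t),   /* displacement unit: 8 bytes */
        win_info,
        sharedmem_comm,
        &baseptr,
        &win);

    /* %p expects a void pointer; the remainder is cast to match %lu. */
    printf("[MPI_Win_allocate_shared] Alignment of baseptr=%p: %lu\n",
           (void *)baseptr, (unsigned long)((uintptr_t)baseptr % 64));

    MPI_Win_free(&win);
    MPI_Info_free(&win_info);
    MPI_Comm_free(&sharedmem_comm);
}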
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    test_allocmem();
    MPI_Barrier(MPI_COMM_WORLD);
    test_allocate();
    MPI_Barrier(MPI_COMM_WORLD);
    test_allocate_shared();

    MPI_Finalize();
    return 0;
}