Hi again,
Using MPI_Type_get_true_extent(), I changed the way I report the type size and extent to:

  int typesize;
  MPI_Aint typeextent, typelb;   // MPI_Aint rather than long, to match the MPI prototypes
  MPI_Type_size(this->datatype, &typesize);
  MPI_Type_get_true_extent(this->datatype, &typelb, &typeextent);
  //MPI_Type_lb(this->datatype, &typelb);
  //MPI_Type_extent(this->datatype, &typeextent);
  printf("\ntype size for process rank (%d,%d) is %d doubles, "
         "type extent is %d doubles (up to %d), range is [%d, %d].\n",
         pr, pc, typesize/(int)sizeof(double),
         (int)(typeextent/sizeof(double)), nx*ny,
         (int)(typelb/sizeof(double)),
         (int)((typelb+typeextent)/sizeof(double)));
This now gives me the correct answers in both situations, since the true extent ignores the artificial lower/upper bounds and only reports the span actually covered by data. For the first case (the one that works):
type size for process rank (1,0) is 20 doubles, type extent is 60 doubles (up to 91), range is [28, 88].
type size for process rank (0,0) is 32 doubles, type extent is 81 doubles (up to 91), range is [0, 81].
type size for process rank (0,1) is 24 doubles, type extent is 80 doubles (up to 91), range is [4, 84].
type size for process rank (1,1) is 15 doubles, type extent is 59 doubles (up to 91), range is [32, 91].
For the second case (just before it hits the same double-free error in MPI_File_set_view()):
type size for process rank (1,0) is 20 doubles, type extent is 48 doubles (up to 91), range is [4, 52].
type size for process rank (0,0) is 32 doubles, type extent is 51 doubles (up to 91), range is [0, 51].
type size for process rank (0,1) is 24 doubles, type extent is 38 doubles (up to 91), range is [52, 90].
type size for process rank (1,1) is 15 doubles, type extent is 35 doubles (up to 91), range is [56, 91].
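Incidentally, since what differs across ranks is the extent rather than the size, one experiment that comes to mind (just a sketch on my side, untested, and it assumes the filetype is supposed to tile the file once per whole nx*ny matrix) would be to pin the extent with MPI_Type_create_resized before committing:

  MPI_Datatype resized;
  // force lb = 0 and extent = the whole matrix; the type map itself is unchanged
  MPI_Type_create_resized(this->datatype, 0,
                          (MPI_Aint)(nx*ny*sizeof(double)), &resized);
  MPI_Type_commit(&resized);
  // ...and then pass 'resized' instead of this->datatype to MPI_File_set_view().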
Can anybody give me a hint here? Is there a bug in MPI_Type_create_darray that I should be aware of?
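In case anyone wants to reproduce these numbers outside of my class code, here is a minimal standalone sketch. The concrete values (nx = 13, ny = 7, BLOCK_SIZE = 4, a 2x2 grid) are inferred from the sizes printed above, so treat them as assumptions rather than my exact setup:

  #include <mpi.h>
  #include <cstdio>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);
      int rank, nprocs;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

      // assumed problem sizes, inferred from the output above
      const int nx = 13, ny = 7, BLOCK_SIZE = 4, Pr = 2, Pc = 2;
      int gsizes[2] = { nx, ny };
      int distrs[2] = { MPI_DISTRIBUTE_CYCLIC, MPI_DISTRIBUTE_CYCLIC };
      int dargs[2]  = { BLOCK_SIZE, BLOCK_SIZE };
      int psizes[2] = { Pr, Pc };

      if (nprocs >= Pr*Pc && rank < Pr*Pc) {   // run with mpirun -np 4
          MPI_Datatype dt;
          MPI_Type_create_darray(Pr*Pc, rank, 2, gsizes, distrs, dargs,
                                 psizes, MPI_ORDER_C, MPI_DOUBLE, &dt);
          MPI_Type_commit(&dt);

          int size;
          MPI_Aint lb, extent, tlb, textent;
          MPI_Type_size(dt, &size);
          MPI_Type_get_extent(dt, &lb, &extent);         // extent incl. resized bounds
          MPI_Type_get_true_extent(dt, &tlb, &textent);  // span actually covered by data
          printf("rank %d: size %d, extent %ld, true lb %ld, true extent %ld (doubles)\n",
                 rank, size/(int)sizeof(double), (long)(extent/sizeof(double)),
                 (long)(tlb/sizeof(double)), (long)(textent/sizeof(double)));

          MPI_Type_free(&dt);
      }
      MPI_Finalize();
      return 0;
  }

On four processes it prints one line per rank with the size, extent, and true extent of the darray type.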
Best,
A
On Oct 30, 2008, at 5:21 PM, Antonio Molins wrote:
Hi all,
I am having some trouble with MPI_Type_create_darray(). I want to map data onto a 2x2 block-cyclic process grid in C, using the following code:
  MPI_Barrier(blacs_comm);
  // size of each matrix
  int *array_of_gsizes = new int[2];
  array_of_gsizes[0] = this->nx;
  array_of_gsizes[1] = this->ny;
  // block-cyclic distribution used by ScaLAPACK
  int *array_of_distrs = new int[2];
  array_of_distrs[0] = MPI_DISTRIBUTE_CYCLIC;
  array_of_distrs[1] = MPI_DISTRIBUTE_CYCLIC;
  int *array_of_dargs = new int[2];
  array_of_dargs[0] = BLOCK_SIZE;
  array_of_dargs[1] = BLOCK_SIZE;
  // Pr x Pc process grid; (pr,pc) are this process's grid coordinates
  int *array_of_psizes = new int[2];
  array_of_psizes[0] = Pr;
  array_of_psizes[1] = Pc;
  int rank = pc + pr*Pc;   // row-major rank within the grid
  MPI_Type_create_darray(Pr*Pc, rank, 2,
                         array_of_gsizes, array_of_distrs, array_of_dargs,
                         array_of_psizes, MPI_ORDER_C, MPI_DOUBLE, &this->datatype);
  MPI_Type_commit(&this->datatype);
  int typesize;
  MPI_Aint typeextent;   // MPI_Aint rather than long, to match MPI_Type_extent()
  MPI_Type_size(this->datatype, &typesize);
  MPI_Type_extent(this->datatype, &typeextent);
  printf("type size for process rank (%d,%d) is %d doubles, "
         "type extent is %d doubles (up to %d).",
         pr, pc, typesize/(int)sizeof(double),
         (int)(typeextent/sizeof(double)), nx*ny);
  MPI_File_open(blacs_comm, (char*)filename, MPI_MODE_RDWR,
                MPI_INFO_NULL, &this->fid);
  MPI_File_set_view(this->fid, this->offset + i*nx*ny*sizeof(double),
                    MPI_DOUBLE, this->datatype, "native", MPI_INFO_NULL);
This works well when used like this. The problem is that the matrix itself is written to disk in column-major order, so I want to read it as if it were transposed, that is:
  MPI_Barrier(blacs_comm);
  // size of each matrix, with the two dimensions swapped
  int *array_of_gsizes = new int[2];
  array_of_gsizes[0] = this->ny;
  array_of_gsizes[1] = this->nx;
  // block-cyclic distribution used by ScaLAPACK
  int *array_of_distrs = new int[2];
  array_of_distrs[0] = MPI_DISTRIBUTE_CYCLIC;
  array_of_distrs[1] = MPI_DISTRIBUTE_CYCLIC;
  int *array_of_dargs = new int[2];
  array_of_dargs[0] = BLOCK_SIZE;
  array_of_dargs[1] = BLOCK_SIZE;
  int *array_of_psizes = new int[2];
  array_of_psizes[0] = Pr;
  array_of_psizes[1] = Pc;
  int rank = pr + pc*Pr;   // column-major rank within the grid this time
  MPI_Type_create_darray(Pr*Pc, rank, 2,
                         array_of_gsizes, array_of_distrs, array_of_dargs,
                         array_of_psizes, MPI_ORDER_C, MPI_DOUBLE, &this->datatype);
  MPI_Type_commit(&this->datatype);
  MPI_Type_size(this->datatype, &typesize);
  MPI_Type_extent(this->datatype, &typeextent);
  printf("type size for process rank (%d,%d) is %d doubles, "
         "type extent is %d doubles (up to %d).",
         pr, pc, typesize/(int)sizeof(double),
         (int)(typeextent/sizeof(double)), nx*ny);
  MPI_File_open(blacs_comm, (char*)filename, MPI_MODE_RDWR,
                MPI_INFO_NULL, &this->fid);
  MPI_File_set_view(this->fid, this->offset + i*nx*ny*sizeof(double),
                    MPI_DOUBLE, this->datatype, "native", MPI_INFO_NULL);
To my surprise, this code crashes inside MPI_File_set_view()! And before you ask: I did try switching MPI_ORDER_C to MPI_ORDER_FORTRAN, and I got the same results I am reporting here.
Also, I am quite intrigued by the text output of the two versions. The first one reports:
type size for process rank (0,0) is 32 doubles, type extent is 91 doubles (up to 91).
type size for process rank (1,0) is 20 doubles, type extent is 119 doubles (up to 91).
type size for process rank (0,1) is 24 doubles, type extent is 95 doubles (up to 91).
type size for process rank (1,1) is 15 doubles, type extent is 123 doubles (up to 91).
Does anybody know why the extents are not all equal?
Even weirder, the second one reports:
type size for process rank (0,0) is 32 doubles, type extent is 91 doubles (up to 91).
type size for process rank (1,0) is 20 doubles, type extent is 95 doubles (up to 91).
type size for process rank (0,1) is 24 doubles, type extent is 143 doubles (up to 91).
type size for process rank (1,1) is 15 doubles, type extent is 147 doubles (up to 91).
The extents changed! I think this is somehow related to the subsequent crash in MPI_File_set_view(), but that is as far as my understanding goes...
Any clue about what is happening? I attach the trace below.
Best,
A
--------------------------------------------------------------------------------
Antonio Molins, PhD Candidate
Medical Engineering and Medical Physics
Harvard - MIT Division of Health Sciences and Technology
--
"When a traveler reaches a fork in the road,
the ℓ1 -norm tells him to take either one way or the other,
but the ℓ2 -norm instructs him to head off into the bushes. "
John F. Claerbout and Francis Muir, 1973
--------------------------------------------------------------------------------
*** glibc detected *** double free or corruption (!prev): 0x0000000000cf4130 ***
[login4:26709] *** Process received signal ***
[login4:26708] *** Process received signal ***
[login4:26708] Signal: Aborted (6)
[login4:26708] Signal code: (-6)
[login4:26709] Signal: Segmentation fault (11)
[login4:26709] Signal code: Address not mapped (1)
[login4:26709] Failing at address: 0x18
[login4:26708] [ 0] /lib64/tls/libpthread.so.0 [0x36ff10c5b0]
[login4:26708] [ 1] /lib64/tls/libc.so.6(gsignal+0x3d) [0x36fe62e26d]
[login4:26708] [ 2] /lib64/tls/libc.so.6(abort+0xfe) [0x36fe62fa6e]
[login4:26708] [ 3] /lib64/tls/libc.so.6 [0x36fe6635f1]
[login4:26708] [ 4] /lib64/tls/libc.so.6 [0x36fe6691fe]
[login4:26708] [ 5] /lib64/tls/libc.so.6(__libc_free+0x76) [0x36fe669596]
[login4:26708] [ 6] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0 [0x2a962cc4ae]
[login4:26708] [ 7] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(ompi_ddt_destroy+0x65) [0x2a962cd31d]
[login4:26708] [ 8] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(MPI_Type_free+0x5b) [0x2a962f654f]
[login4:26708] [ 9] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIOI_Flatten+0x1804) [0x2aa4603612]
[login4:26708] [10] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIOI_Flatten_datatype+0xe7) [0x2aa46017fd]
[login4:26708] [11] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIO_Set_view+0x14f) [0x2aa45ecb57]
[login4:26708] [12] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_set_view+0x1dd) [0x2aa46088a9]
[login4:26708] [13] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so [0x2aa45ec288]
[login4:26708] [14] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(MPI_File_set_view+0x53) [0x2a963002ff]
[login4:26708] [15] ./bin/test2(_ZN14pMatCollection3getEiP7pMatrix+0xc3) [0x42a50b]
[login4:26708] [16] ./bin/test2(main+0xc2e) [0x43014a]
[login4:26708] [17] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x36fe61c40b]
[login4:26708] [18] ./bin/test2(_ZNSt8ios_base4InitD1Ev+0x42) [0x41563a]
[login4:26708] *** End of error message ***
[login4:26709] [ 0] /lib64/tls/libpthread.so.0 [0x36ff10c5b0]
[login4:26709] [ 1] /lib64/tls/libc.so.6 [0x36fe66882b]
[login4:26709] [ 2] /lib64/tls/libc.so.6 [0x36fe668f8d]
[login4:26709] [ 3] /lib64/tls/libc.so.6(__libc_free+0x76) [0x36fe669596]
[login4:26709] [ 4] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0 [0x2a962cc4ae]
[login4:26709] [ 5] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(ompi_ddt_release_args+0x93) [0x2a962d5641]
[login4:26709] [ 6] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0 [0x2a962cc514]
[login4:26709] [ 7] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(ompi_ddt_release_args+0x93) [0x2a962d5641]
[login4:26709] [ 8] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0 [0x2a962cc514]
[login4:26709] [ 9] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(ompi_ddt_destroy+0x65) [0x2a962cd31d]
[login4:26709] [10] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(MPI_Type_free+0x5b) [0x2a962f654f]
[login4:26709] [11] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIOI_Flatten+0x147) [0x2aa4601f55]
[login4:26709] [12] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIOI_Flatten+0x1569) [0x2aa4603377]
[login4:26709] [13] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIOI_Flatten_datatype+0xe7) [0x2aa46017fd]
[login4:26709] [14] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(ADIO_Set_view+0x14f) [0x2aa45ecb57]
[login4:26709] [15] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_set_view+0x1dd) [0x2aa46088a9]
[login4:26709] [16] /opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_io_romio.so [0x2aa45ec288]
[login4:26709] [17] /opt/apps/intel10_1/openmpi/1.3/lib/libmpi.so.0(MPI_File_set_view+0x53) [0x2a963002ff]
[login4:26709] [18] ./bin/test2(_ZN14pMatCollection3getEiP7pMatrix+0xc3) [0x42a50b]
[login4:26709] [19] ./bin/test2(main+0xc2e) [0x43014a]
[login4:26709] [20] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x36fe61c40b]
[login4:26709] [21] ./bin/test2(_ZNSt8ios_base4InitD1Ev+0x42) [0x41563a]
[login4:26709] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 26708 on node login4.ranger.tacc.utexas.edu exited on signal 6 (Aborted).
--------------------------------------------------------------------------
--------------------------------------------------------------------------------
Antonio Molins, PhD Candidate
Medical Engineering and Medical Physics
Harvard - MIT Division of Health Sciences and Technology
--
"Y así del poco dormir y del mucho leer,
se le secó el cerebro de manera que vino
a perder el juicio".
Miguel de Cervantes
--------------------------------------------------------------------------------