George, Thanks. My colleague has verified your commit. This commit will make datatype code a bit simpler...
Regards, Takahiro Kawashima, MPI development team, Fujitsu > Takahiro, > > I used your second patch the one that remove the copy of the description in > the OMPI level (r28553). Thanks for your help and your patience in > investigating this issues. > > George. > > > On May 22, 2013, at 02:05 , "Kawashima, Takahiro" > <t-kawash...@jp.fujitsu.com> wrote: > > > George, > > > > Thanks for your quick response. > > Your fix seemed good to me last week, but this week my colleague > > found it's not sufficient. There are two issues. > > > > (A) We should update opt_desc too. > > > > In current ompi_datatype_init, we copy OPAL desc to OMPI desc. > > But opt_desc still points to OPAL desc. We should update > > opt_desc to point copied OMPI desc. > > > > (B) Fortran desc is not properly set. > > > > See the attached result-before.txt. It is the output of the > > attached show_ompi_datatype.c. Fortran basic datatypes, > > like MPI_CHARACTER, MPI_REAL, MPI_DOUBLE_PRECISION, have > > wrong desc_t. > > > > It is because these datatypes are statically initialized with > > OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE_FORTRAN macro and > > desc and opt_desc point to one element of > > ompi_datatype_predefined_elem_desc array with an OPAL index. > > For example, desc of ompi_mpi_character points to > > ompi_datatype_predefined_elem_desc[OPAL_DATATYPE_INT1]. > > If we use ompi_datatype_predefined_elem_desc, we should use > > an OMPI datatype index (OMPI_DATATYPE_MPI_INT8_T etc.) and not > > an OPAL datatype index (OPAL_DATATYPE_INT1 etc.). > > > > Therefore the condition (pDesc != datatype->super.desc.desc) > > in ompi_datatype_init becomes true and we copy desc from the > > wrong part currently. > > i.e. copy from ompi_datatype_predefined_elem_desc[OPAL_DATATYPE_INT1] > > to > > ompi_datatype_predefined_elem_desc[OMPI_DATATYPE_MPI_CHARACTER]. > > > > The initialization part of ompi_mpi_character in > > ompi_datatype_internal.h and ompi_datatype_module.c: > > ---------------------------------------------------------------- > > ompi_predefined_datatype_t ompi_mpi_character = > > OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE_FORTRAN (INT, CHARACTER, 1, > > OPAL_ALIGNMENT_CHAR, 0 ); > > > > #define OMPI_DATATYPE_INITIALIZER_FORTRAN( TYPE, NAME, SIZE, ALIGN, FLAGS ) > > \ > > { > > \ > > OPAL_OBJ_STATIC_INIT(opal_datatype_t), > > \ > > OPAL_DATATYPE_FLAG_BASIC | > > \ > > OMPI_DATATYPE_FLAG_PREDEFINED | > > \ > > OMPI_DATATYPE_FLAG_DATA_FORTRAN | (FLAGS) /*flag*/, > > \ > > OPAL_DATATYPE_ ## TYPE ## SIZE /*id*/, > > \ > > (((uint32_t)1)<<(OPAL_DATATYPE_ ## TYPE ## SIZE)) /*bdt_used*/, > > \ > > SIZE /*size*/, > > \ > > 0 /*true_lb*/, SIZE /*true_ub*/, 0 /*lb*/, SIZE /*ub*/, > > \ > > (ALIGN) /*align*/, > > \ > > 1 /*nbElems*/, > > \ > > OPAL_DATATYPE_INIT_NAME(TYPE ## SIZE) /*name*/, > > \ > > OMPI_DATATYPE_INIT_DESC_PREDEFINED(TYPE, SIZE) /*desc*/, > > \ > > OMPI_DATATYPE_INIT_DESC_PREDEFINED(TYPE, SIZE) /*opt_desc*/, > > \ > > OPAL_DATATYPE_INIT_BTYPES_ARRAY_ ## TYPE ## SIZE /*btypes*/ > > \ > > > > #define OMPI_DATATYPE_INIT_DESC_PREDEFINED(TYPE, SIZE) > > \ > > { > > \ > > 1 /*length*/, 1 /*used*/, > > \ > > &(ompi_datatype_predefined_elem_desc[2 * OPAL_DATATYPE_ ## TYPE ## > > SIZE]) /*desc*/ \ > > } > > > > int32_t ompi_datatype_init( void ) > > { > > int32_t i; > > > > for( i = 0; i < OMPI_DATATYPE_MPI_MAX_PREDEFINED; i++ ) { > > ompi_datatype_t* datatype = > > (ompi_datatype_t*)ompi_datatype_basicDatatypes[i]; > > dt_elem_desc_t* pDesc; > > > > if( 0 == datatype->super.size ) continue; > > > > /** > > * Most of the OMPI datatypes have been initialized with the basic > > desc of the > > * OPAL datatypes. Thus don't modify the desc, instead rebase the > > desc back into > > * the OMPI predefined_elem_desc and update the fields there. > > */ > > pDesc = &ompi_datatype_predefined_elem_desc[2 * i]; > > if( pDesc != datatype->super.desc.desc ) { > > memcpy(pDesc, datatype->super.desc.desc, 2 * > > sizeof(dt_elem_desc_t)); > > datatype->super.desc.desc = pDesc; > > } else { > > datatype->super.desc.desc[0].elem.common.flags = > > OPAL_DATATYPE_FLAG_PREDEFINED | > > > > OPAL_DATATYPE_FLAG_DATA | > > > > OPAL_DATATYPE_FLAG_CONTIGUOUS | > > > > OPAL_DATATYPE_FLAG_NO_GAPS; > > datatype->super.desc.desc[0].elem.common.type = i; > > .... > > ---------------------------------------------------------------- > > > > Do you intend to it goes to the else-block in ompi_datatype_init > > for Fortran datatypes? If so, we should use an OMPI index in > > OMPI_DATATYPE_INIT_DESC_PREDEFINED. > > > > But in the else-block, desc[0].elem.common.type is set to an OMPI > > datatype index. And it seems that this 'type' is treated as an > > OPAL datatype index in other parts. > > # OMPI_DATATYPE_MPI_CHARACTER and OPAL_DATATYPE_COMPLEX8 has > > # same value (0x13) but how to distinguish them? > > > > I wonder whether Fortran datatypes really need separate desc. > > Though OPAL does not have *identical* datatypes, it always has > > *corresponding* datatypes. It is obvious because we currently > > translate an OMPI Fortran datatype to a corresponding OPAL > > datatype index in OMPI_DATATYPE_INIT_DESC_PREDEFINED as > > "OPAL_DATATYPE_ ## TYPE ## SIZE". > > > > If Fortran datatypes don't need separate desc, this issue > > may be fixed by my attached datatype-init-1.patch. > > It also fixes the opt_desc issue described first. > > > > Furthermore, do we need to copy desc for OMPI datatypes? > > If not, use my attached datatype-init-2.patch instead. > > It don't copy desc and OMPI desc points OPAL desc. > > I'm not sure this is a correct solution. > > > > The attached result-after.txt is the output of the attached > > show_ompi_datatype.c with my patch. I think this output is > > correct. > > > > Regards, > > Takahiro Kawashima, > > MPI development team, > > Fujitsu > > > >> Takahiro, > >> > >> Nice catch, I really wonder how this one survived for soo long. I pushed a > >> patch in r28535 addressing this issue. It is not the best solution, but it > >> provide an easy way to address the issue. > >> > >> A little bit of history. A datatype is composed by (let's keep it short) 2 > >> component, a high-level description containing among others the size and > >> the name of the datatype and a low level description (the desc_t part) > >> containing the basic predefined elements in the datatype. As most of the > >> predefined datatypes defined in the MPI layer are synonyms to some basic > >> predefined datatypes (such as the equivalent POSIX types MPI_INT32_T), the > >> design of the datatype allowed for the sharing of the desc_t part between > >> datatypes. This approach allows us to have similar datatypes (MPI_INT and > >> MPI_INT32_T) with different names but with the same backend internal > >> description. However, when we split the datatype engine in two, we > >> duplicate this common description (in OPAL and OMPI). The OMPI desc_t was > >> pointing to OPAL desc_t for almost everything … except the datatypes that > >> were not defined by OPAL such as the Fortran one. This turned the > >> management of the common desc_t into a ni! > ghtmare … with the effect you noticed few days ago. Too bad for the > optimization part. I now duplicate the desc_t between the two layers, and all > OMPI datatypes have now their own desc_t. > >> > >> Thanks for finding and analyzing so deeply this issue. > >> George. > >> > >> > >> > >> > >> On May 16, 2013, at 12:04 , KAWASHIMA Takahiro <rivis.kawash...@nifty.com> > >> wrote: > >> > >>> Hi, > >>> > >>> I'm reading the datatype code in Open MPI trunk and have a question. > >>> A bit long. > >>> > >>> See the following program. > >>> > >>> ---------------------------------------------------------------- > >>> #include <stdio.h> > >>> #include <mpi.h> > >>> > >>> struct opal_datatype_t; > >>> extern int opal_init(int *pargc, char ***pargv); > >>> extern int opal_finalize(void); > >>> extern void opal_datatype_dump(struct opal_datatype_t *type); > >>> extern struct opal_datatype_t opal_datatype_int8; > >>> > >>> int main(int argc, char **argv) > >>> { > >>> opal_init(NULL, NULL); > >>> opal_datatype_dump(&opal_datatype_int8); > >>> MPI_Init(NULL, NULL); > >>> opal_datatype_dump(&opal_datatype_int8); > >>> MPI_Finalize(); > >>> opal_finalize(); > >>> return 0; > >>> } > >>> ---------------------------------------------------------------- > >>> > >>> All variables/functions declared as 'extern' are defined in OPAL. > >>> opal_datatype_dump() function outputs internal data of a datatype. > >>> I expect the same output on two opal_datatype_dump() calls. > >>> But when I run it on an x86_64 machine, I get the following output. > >>> > >>> ---------------------------------------------------------------- > >>> ompi-trunk/opal-datatype-dump && ompiexec -n 1 > >>> ompi-trunk/opal-datatype-dump > >>> [ppc.rivis.jp:27886] Datatype 0x600c60[OPAL_INT8] size 8 align 8 id 7 > >>> length 1 used 1 > >>> true_lb 0 true_ub 8 (true_extent 8) lb 0 ub 8 (extent 8) > >>> nbElems 1 loops 0 flags 136 (commited contiguous )-cC---P-DB-[---][---] > >>> contain OPAL_INT8 > >>> --C---P-D--[---][---] OPAL_INT8 count 1 disp 0x0 (0) extent 8 (size > >>> 8) > >>> No optimized description > >>> > >>> [ppc.rivis.jp:27886] Datatype 0x600c60[OPAL_INT8] size 8 align 8 id 7 > >>> length 1 used 1 > >>> true_lb 0 true_ub 8 (true_extent 8) lb 0 ub 8 (extent 8) > >>> nbElems 1 loops 0 flags 136 (commited contiguous )-cC---P-DB-[---][---] > >>> contain OPAL_INT8 > >>> --C---P-D--[---][---] count 1 disp 0x0 (0) extent 8 (size > >>> 8971008) > >>> No optimized description > >>> ---------------------------------------------------------------- > >>> > >>> The former output is what I expected. But the latter one is not > >>> identical to the former one and its content datatype has no name > >>> and a very large size. > >>> > >>> This line is output in opal_datatype_dump_data_desc() function in > >>> opal/datatype/opal_datatype_dump.c file. It refers > >>> opal_datatype_basicDatatypes[pDesc->elem.common.type]->name and > >>> opal_datatype_basicDatatypes[pDesc->elem.common.type]->size for > >>> the content datatype. > >>> > >>> In this case, pDesc->elem.common.type is > >>> opal_datatype_int8.desc.desc[0].elem.common.type and is initialized to 7 > >>> in opal_datatype_init() function in opal/datatype/opal_datatype_module.c > >>> file, which is called during opal_init() function. > >>> opal_datatype_int8.desc.desc points > >>> &opal_datatype_predefined_elem_desc[7*2]. > >>> > >>> But if we call MPI_Init() function, the value is overwritten. > >>> ompi_datatype_init() function in ompi/datatype/ompi_datatype_module.c > >>> file, which is called during MPI_Init() function, has similar > >>> procedure to initialize OMPI datatypes. > >>> > >>> On initializing ompi_mpi_aint in it, ompi_mpi_aint.dt.super.desc.desc > >>> points &opal_datatype_predefined_elem_desc[7*2], which is also pointed > >>> by opal_datatype_int8, because ompi_mpi_aint is defined by > >>> OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE macro and it uses > >>> OPAL_DATATYPE_INITIALIZER_INT8 macro. So > >>> opal_datatype_int8.desc.desc[0].elem.common.type is overwritten > >>> to 37. > >>> > >>> Therefore in the second opal_datatype_dump() function call in my > >>> program, opal_datatype_basicDatatypes[37] is accessed. > >>> But the array length of opal_datatype_basicDatatypes is 25. > >>> > >>> Summarize: > >>> > >>> static initializer: > >>> opal_datatype_predefined_elem_desc[25] = {{0, ...}, ...}; > >>> opal_datatype_int8.desc.desc = &opal_datatype_predefined_elem_desc[7*2]; > >>> ompi_mpi_aint.dt.super.desc.desc = > >>> &opal_datatype_predefined_elem_desc[7*2]; > >>> > >>> opal_init: > >>> opal_datatype_int8.desc.desc.elem.common.type = 7; > >>> > >>> MPI_Init: > >>> ompi_mpi_aint.dt.super.desc.desc.elem.common.type = 37; > >>> > >>> opal_datatype_dump: > >>> access to opal_datatype_predefined_elem_desc[37] > >>> > >>> While opal_datatype_dump() function might not be called from > >>> user's programs, breaking opal_datatype_predefined_elem_desc > >>> array in ompi_datatype_init() function is not good. > >>> > >>> Though the above is described for opal_datatype_int8 and ompi_mpi_aint, > >>> the same thing happens to other datatypes. > >>> > >>> Though I tried to fix this problem, I could not figure out the > >>> correct solution. > >>> > >>> - The first loop in ompi_datatype_init() function should be removed? > >>> But OMPI Fortran datatypes should be initialized in it? > >>> > >>> - All OMPI datatypes should point ompi_datatype_predefined_elem_desc > >>> array? But having same 'type' value in OPAL datatypes and OMPI > >>> datatypes is allowed?