George,

Thanks. My colleague has verified your commit.
This commit will make datatype code a bit simpler...

Regards,
Takahiro Kawashima,
MPI development team,
Fujitsu

> Takahiro,
> 
> I used your second patch the one that remove the copy of the description in 
> the OMPI level (r28553). Thanks for your help and your patience in 
> investigating this issues.
> 
>   George.
> 
> 
> On May 22, 2013, at 02:05 , "Kawashima, Takahiro" 
> <t-kawash...@jp.fujitsu.com> wrote:
> 
> > George,
> > 
> > Thanks for your quick response.
> > Your fix seemed good to me last week, but this week my colleague
> > found it's not sufficient. There are two issues.
> > 
> > (A) We should update opt_desc too.
> > 
> >    In current ompi_datatype_init, we copy OPAL desc to OMPI desc.
> >    But opt_desc still points to OPAL desc. We should update
> >    opt_desc to point copied OMPI desc.
> > 
> > (B) Fortran desc is not properly set.
> > 
> >    See the attached result-before.txt. It is the output of the
> >    attached show_ompi_datatype.c. Fortran basic datatypes,
> >    like MPI_CHARACTER, MPI_REAL, MPI_DOUBLE_PRECISION, have
> >    wrong desc_t.
> > 
> >    It is because these datatypes are statically initialized with
> >    OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE_FORTRAN macro and
> >    desc and opt_desc point to one element of
> >    ompi_datatype_predefined_elem_desc array with an OPAL index.
> >    For example, desc of ompi_mpi_character points to
> >    ompi_datatype_predefined_elem_desc[OPAL_DATATYPE_INT1].
> >    If we use ompi_datatype_predefined_elem_desc, we should use
> >    an OMPI datatype index (OMPI_DATATYPE_MPI_INT8_T etc.) and not
> >    an OPAL datatype index (OPAL_DATATYPE_INT1 etc.).
> > 
> >    Therefore the condition (pDesc != datatype->super.desc.desc)
> >    in ompi_datatype_init becomes true and we copy desc from the
> >    wrong part currently.
> >    i.e. copy from ompi_datatype_predefined_elem_desc[OPAL_DATATYPE_INT1]
> >              to   
> > ompi_datatype_predefined_elem_desc[OMPI_DATATYPE_MPI_CHARACTER].
> > 
> > The initialization part of ompi_mpi_character in
> > ompi_datatype_internal.h and ompi_datatype_module.c:
> > ----------------------------------------------------------------
> > ompi_predefined_datatype_t ompi_mpi_character =      
> > OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE_FORTRAN (INT, CHARACTER, 1, 
> > OPAL_ALIGNMENT_CHAR, 0 );
> > 
> > #define OMPI_DATATYPE_INITIALIZER_FORTRAN( TYPE, NAME, SIZE, ALIGN, FLAGS ) 
> >          \
> >    {                                                                        
> >         \
> >        OPAL_OBJ_STATIC_INIT(opal_datatype_t),                               
> >         \
> >        OPAL_DATATYPE_FLAG_BASIC |                                           
> >         \
> >            OMPI_DATATYPE_FLAG_PREDEFINED |                                  
> >         \
> >            OMPI_DATATYPE_FLAG_DATA_FORTRAN | (FLAGS) /*flag*/,              
> >         \
> >        OPAL_DATATYPE_ ## TYPE ## SIZE /*id*/,                               
> >         \
> >        (((uint32_t)1)<<(OPAL_DATATYPE_ ## TYPE ## SIZE)) /*bdt_used*/,      
> >         \
> >        SIZE /*size*/,                                                       
> >         \
> >        0 /*true_lb*/, SIZE /*true_ub*/, 0 /*lb*/, SIZE /*ub*/,              
> >         \
> >        (ALIGN) /*align*/,                                                   
> >         \
> >        1 /*nbElems*/,                                                       
> >         \
> >        OPAL_DATATYPE_INIT_NAME(TYPE ## SIZE) /*name*/,                      
> >         \
> >        OMPI_DATATYPE_INIT_DESC_PREDEFINED(TYPE, SIZE) /*desc*/,             
> >         \
> >        OMPI_DATATYPE_INIT_DESC_PREDEFINED(TYPE, SIZE) /*opt_desc*/,         
> >         \
> >        OPAL_DATATYPE_INIT_BTYPES_ARRAY_ ## TYPE ## SIZE /*btypes*/          
> >         \
> > 
> > #define OMPI_DATATYPE_INIT_DESC_PREDEFINED(TYPE, SIZE)                      
> >          \
> >    {                                                                        
> >         \
> >        1 /*length*/, 1 /*used*/,                                            
> >         \
> >        &(ompi_datatype_predefined_elem_desc[2 * OPAL_DATATYPE_ ## TYPE ## 
> > SIZE]) /*desc*/ \
> >    }
> > 
> > int32_t ompi_datatype_init( void )
> > {
> >    int32_t i;
> > 
> >    for( i = 0; i < OMPI_DATATYPE_MPI_MAX_PREDEFINED; i++ ) {
> >        ompi_datatype_t* datatype = 
> > (ompi_datatype_t*)ompi_datatype_basicDatatypes[i];
> >        dt_elem_desc_t* pDesc;
> > 
> >        if( 0 == datatype->super.size ) continue;
> > 
> >        /**
> >         * Most of the OMPI datatypes have been initialized with the basic 
> > desc of the
> >         * OPAL datatypes. Thus don't modify the desc, instead rebase the 
> > desc back into
> >         * the OMPI predefined_elem_desc and update the fields there.
> >         */
> >        pDesc = &ompi_datatype_predefined_elem_desc[2 * i];
> >        if( pDesc != datatype->super.desc.desc ) {
> >            memcpy(pDesc, datatype->super.desc.desc, 2 * 
> > sizeof(dt_elem_desc_t));
> >            datatype->super.desc.desc = pDesc;
> >        } else {
> >            datatype->super.desc.desc[0].elem.common.flags = 
> > OPAL_DATATYPE_FLAG_PREDEFINED |
> >                                                             
> > OPAL_DATATYPE_FLAG_DATA |
> >                                                             
> > OPAL_DATATYPE_FLAG_CONTIGUOUS |
> >                                                             
> > OPAL_DATATYPE_FLAG_NO_GAPS;
> >            datatype->super.desc.desc[0].elem.common.type  = i;
> > ....
> > ----------------------------------------------------------------
> > 
> > Do you intend to it goes to the else-block in ompi_datatype_init
> > for Fortran datatypes? If so, we should use an OMPI index in
> > OMPI_DATATYPE_INIT_DESC_PREDEFINED.
> > 
> > But in the else-block, desc[0].elem.common.type is set to an OMPI
> > datatype index. And it seems that this 'type' is treated as an
> > OPAL datatype index in other parts. 
> > # OMPI_DATATYPE_MPI_CHARACTER and OPAL_DATATYPE_COMPLEX8 has
> > # same value (0x13) but how to distinguish them?
> > 
> > I wonder whether Fortran datatypes really need separate desc.
> > Though OPAL does not have *identical* datatypes, it always has
> > *corresponding* datatypes. It is obvious because we currently
> > translate an OMPI Fortran datatype to a corresponding OPAL
> > datatype index in OMPI_DATATYPE_INIT_DESC_PREDEFINED as
> > "OPAL_DATATYPE_ ## TYPE ## SIZE".
> > 
> > If Fortran datatypes don't need separate desc, this issue
> > may be fixed by my attached datatype-init-1.patch.
> > It also fixes the opt_desc issue described first.
> > 
> > Furthermore, do we need to copy desc for OMPI datatypes?
> > If not, use my attached datatype-init-2.patch instead.
> > It don't copy desc and OMPI desc points OPAL desc.
> > I'm not sure this is a correct solution.
> > 
> > The attached result-after.txt is the output of the attached
> > show_ompi_datatype.c with my patch. I think this output is
> > correct.
> > 
> > Regards,
> > Takahiro Kawashima,
> > MPI development team,
> > Fujitsu
> > 
> >> Takahiro,
> >> 
> >> Nice catch, I really wonder how this one survived for soo long. I pushed a 
> >> patch in r28535 addressing this issue. It is not the best solution, but it 
> >> provide an easy way to address the issue.
> >> 
> >> A little bit of history. A datatype is composed by (let's keep it short) 2 
> >> component, a high-level description containing among others the size and 
> >> the name of the datatype and a low level description (the desc_t part) 
> >> containing the basic predefined elements in the datatype. As most of the 
> >> predefined datatypes defined in the MPI layer are synonyms to some basic 
> >> predefined datatypes (such as the equivalent POSIX types MPI_INT32_T), the 
> >> design of the datatype allowed for the sharing of the desc_t part between 
> >> datatypes. This approach allows us to have similar datatypes (MPI_INT and 
> >> MPI_INT32_T) with different names but with the same backend internal 
> >> description. However, when we split the datatype engine in two, we 
> >> duplicate this common description (in OPAL and OMPI). The OMPI desc_t was 
> >> pointing to OPAL desc_t for almost everything … except the datatypes that 
> >> were not defined by OPAL such as the Fortran one. This turned the 
> >> management of the common desc_t into a ni!
>  ghtmare … with the effect you noticed few days ago. Too bad for the 
> optimization part. I now duplicate the desc_t between the two layers, and all 
> OMPI datatypes have now their own desc_t.
> >> 
> >> Thanks for finding and analyzing so deeply this issue.
> >>  George.
> >> 
> >> 
> >> 
> >> 
> >> On May 16, 2013, at 12:04 , KAWASHIMA Takahiro <rivis.kawash...@nifty.com> 
> >> wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I'm reading the datatype code in Open MPI trunk and have a question.
> >>> A bit long.
> >>> 
> >>> See the following program.
> >>> 
> >>> ----------------------------------------------------------------
> >>> #include <stdio.h>
> >>> #include <mpi.h>
> >>> 
> >>> struct opal_datatype_t;
> >>> extern int opal_init(int *pargc, char ***pargv);
> >>> extern int opal_finalize(void);
> >>> extern void opal_datatype_dump(struct opal_datatype_t *type);
> >>> extern struct opal_datatype_t opal_datatype_int8;
> >>> 
> >>> int main(int argc, char **argv)
> >>> {
> >>>   opal_init(NULL, NULL);
> >>>   opal_datatype_dump(&opal_datatype_int8);
> >>>   MPI_Init(NULL, NULL);
> >>>   opal_datatype_dump(&opal_datatype_int8);
> >>>   MPI_Finalize();
> >>>   opal_finalize();
> >>>   return 0;
> >>> }
> >>> ----------------------------------------------------------------
> >>> 
> >>> All variables/functions declared as 'extern' are defined in OPAL.
> >>> opal_datatype_dump() function outputs internal data of a datatype.
> >>> I expect the same output on two opal_datatype_dump() calls.
> >>> But when I run it on an x86_64 machine, I get the following output.
> >>> 
> >>> ----------------------------------------------------------------
> >>> ompi-trunk/opal-datatype-dump && ompiexec -n 1 
> >>> ompi-trunk/opal-datatype-dump
> >>> [ppc.rivis.jp:27886] Datatype 0x600c60[OPAL_INT8] size 8 align 8 id 7 
> >>> length 1 used 1
> >>> true_lb 0 true_ub 8 (true_extent 8) lb 0 ub 8 (extent 8)
> >>> nbElems 1 loops 0 flags 136 (commited contiguous )-cC---P-DB-[---][---]
> >>>  contain OPAL_INT8
> >>> --C---P-D--[---][---]      OPAL_INT8 count 1 disp 0x0 (0) extent 8 (size 
> >>> 8)
> >>> No optimized description
> >>> 
> >>> [ppc.rivis.jp:27886] Datatype 0x600c60[OPAL_INT8] size 8 align 8 id 7 
> >>> length 1 used 1
> >>> true_lb 0 true_ub 8 (true_extent 8) lb 0 ub 8 (extent 8)
> >>> nbElems 1 loops 0 flags 136 (commited contiguous )-cC---P-DB-[---][---]
> >>>  contain OPAL_INT8
> >>> --C---P-D--[---][---]               count 1 disp 0x0 (0) extent 8 (size 
> >>> 8971008)
> >>> No optimized description
> >>> ----------------------------------------------------------------
> >>> 
> >>> The former output is what I expected. But the latter one is not
> >>> identical to the former one and its content datatype has no name
> >>> and a very large size.
> >>> 
> >>> This line is output in opal_datatype_dump_data_desc() function in
> >>> opal/datatype/opal_datatype_dump.c file. It refers
> >>> opal_datatype_basicDatatypes[pDesc->elem.common.type]->name and
> >>> opal_datatype_basicDatatypes[pDesc->elem.common.type]->size for
> >>> the content datatype.
> >>> 
> >>> In this case, pDesc->elem.common.type is
> >>> opal_datatype_int8.desc.desc[0].elem.common.type and is initialized to 7
> >>> in opal_datatype_init() function in opal/datatype/opal_datatype_module.c
> >>> file, which is called during opal_init() function.
> >>> opal_datatype_int8.desc.desc points 
> >>> &opal_datatype_predefined_elem_desc[7*2].
> >>> 
> >>> But if we call MPI_Init() function, the value is overwritten.
> >>> ompi_datatype_init() function in ompi/datatype/ompi_datatype_module.c
> >>> file, which is called during MPI_Init() function, has similar
> >>> procedure to initialize OMPI datatypes.
> >>> 
> >>> On initializing ompi_mpi_aint in it, ompi_mpi_aint.dt.super.desc.desc
> >>> points &opal_datatype_predefined_elem_desc[7*2], which is also pointed
> >>> by opal_datatype_int8, because ompi_mpi_aint is defined by
> >>> OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE macro and it uses
> >>> OPAL_DATATYPE_INITIALIZER_INT8 macro. So
> >>> opal_datatype_int8.desc.desc[0].elem.common.type is overwritten
> >>> to 37.
> >>> 
> >>> Therefore in the second opal_datatype_dump() function call in my
> >>> program, opal_datatype_basicDatatypes[37] is accessed.
> >>> But the array length of opal_datatype_basicDatatypes is 25.
> >>> 
> >>> Summarize:
> >>> 
> >>> static initializer:
> >>>   opal_datatype_predefined_elem_desc[25] = {{0, ...}, ...};
> >>>   opal_datatype_int8.desc.desc = &opal_datatype_predefined_elem_desc[7*2];
> >>>   ompi_mpi_aint.dt.super.desc.desc = 
> >>> &opal_datatype_predefined_elem_desc[7*2];
> >>> 
> >>> opal_init:
> >>>   opal_datatype_int8.desc.desc.elem.common.type = 7;
> >>> 
> >>> MPI_Init:
> >>>   ompi_mpi_aint.dt.super.desc.desc.elem.common.type = 37;
> >>> 
> >>> opal_datatype_dump:
> >>>   access to opal_datatype_predefined_elem_desc[37]
> >>> 
> >>> While opal_datatype_dump() function might not be called from
> >>> user's programs, breaking opal_datatype_predefined_elem_desc
> >>> array in ompi_datatype_init() function is not good.
> >>> 
> >>> Though the above is described for opal_datatype_int8 and ompi_mpi_aint,
> >>> the same thing happens to other datatypes.
> >>> 
> >>> Though I tried to fix this problem, I could not figure out the
> >>> correct solution.
> >>> 
> >>> - The first loop in ompi_datatype_init() function should be removed?
> >>>   But OMPI Fortran datatypes should be initialized in it?
> >>> 
> >>> - All OMPI datatypes should point ompi_datatype_predefined_elem_desc
> >>>   array? But having same 'type' value in OPAL datatypes and OMPI
> >>>   datatypes is allowed?

Reply via email to