[hwloc-devel] Create success (hwloc r1.8a1r5632)

2013-05-17 Thread MPI Team
Creating nightly hwloc snapshot SVN tarball was a success.

Snapshot:   hwloc 1.8a1r5632
Start time: Fri May 17 21:01:01 EDT 2013
End time:   Fri May 17 21:05:11 EDT 2013

Your friendly daemon,
Cyrador


Re: [OMPI devel] RFC: dead code removal

2013-05-17 Thread Ralph Castain
not really - don't think it is ever used as i don't see where it would get 
propagated

On May 17, 2013, at 9:14 AM, Jeff Squyres (jsquyres)  wrote:

> Do you have any concerns about removing the username from the rmaps rank_file 
> component?




Re: [OMPI devel] RFC: dead code removal

2013-05-17 Thread Jeff Squyres (jsquyres)
Do you have any concerns about removing the username from the rmaps rank_file 
component?


On May 16, 2013, at 11:27 AM, Ralph Castain  wrote:

> okay, i went thru this - found a couple of places where a deeper error was 
> involved. i've committed those changes, so as far as i'm concerned you can 
> update the patch and commit
> 
> 
> On May 15, 2013, at 5:43 PM, Jeff Squyres (jsquyres)  
> wrote:
> 
>> Sure, no problem.
>> 
>> 
>> On May 15, 2013, at 8:41 PM, Ralph Castain  wrote:
>> 
>>> Hmmm...some of this doesn't look right to me. It could be that some of the 
>>> code changed and stale things didn't get removed, but the snippets of logic 
>>> in your patch raise alarms in some cases.
>>> 
>>> Can you allow a bit more time? I need to apply the patch and actually look 
>>> at the total code path to understand *why* some of these variables are no 
>>> longer being used. My fear is that there are cmd line options that may not 
>>> be working correctly (but rarely get used/tested) because (a) the variable 
>>> is correct, but (b) somehow the rest of the code is in error.
>>> 
>>> 
>>> On May 15, 2013, at 5:24 PM, Jeff Squyres (jsquyres)  
>>> wrote:
>>> 
>>>> WHAT: Remove a bunch of "set but not used" variables / dead code
>>>> 
>>>> WHY: Because it's dead code
>>>> 
>>>> WHERE: All over, but NOT the BTL ALLOC macros (per prior 
>>>> argu^H^H^H^Hdiscussion)
>>>> 
>>>> WHEN: Tomorrow (16 May 2013), COB
>>>> 
>>>> More detail:
>>>> 
>>>> gcc 4.7.x squawks a lot about "set but unused" variables.  I took a sweep 
>>>> through and removed a bunch of them -- they're all obviously dead code.
>>>> 
>>>> I did *not*, however, remove the setting of rc in the various BTL/OOB 
>>>> ALLOC_FRAG macros, per prior disagreements in emails about this.  Perhaps 
>>>> someone else will find a compromise for that someday -- this patch is not 
>>>> about fixing those warnings.  This patch is only about removing the 
>>>> obvious set-but-really-never-used variables.
>>>> 
>>>> Short timeout because this is actually pretty trivial, but it does touch 
>>>> other people's code, so I wanted people to see it / get a heads-up before 
>>>> I committed.  Patch attached.
>>>> 
>>>> -- 
>>>> Jeff Squyres
>>>> jsquy...@cisco.com
>>>> For corporate legal information go to: 
>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] Datatype initialization bug?

2013-05-17 Thread George Bosilca
Takahiro,

Nice catch, I really wonder how this one survived for so long. I pushed a 
patch in r28535 addressing this issue. It is not the best solution, but it 
provides an easy way to address the issue.

A little bit of history. A datatype is composed of (let's keep it short) 2 
components: a high-level description containing, among other things, the size 
and the name of the datatype, and a low-level description (the desc_t part) 
containing the basic predefined elements in the datatype. As most of the 
predefined datatypes defined in the MPI layer are synonyms for some basic 
predefined datatypes (such as the equivalent POSIX types MPI_INT32_T), the 
design of the datatype allowed for the sharing of the desc_t part between 
datatypes. This approach allows us to have similar datatypes (MPI_INT and 
MPI_INT32_T) with different names but with the same backend internal 
description. However, when we split the datatype engine in two, we duplicated 
this common description (in OPAL and OMPI). The OMPI desc_t was pointing to the 
OPAL desc_t for almost everything … except the datatypes that were not defined 
by OPAL, such as the Fortran ones. This turned the management of the common 
desc_t into a nightmare … with the effect you noticed a few days ago. Too bad 
for the optimization part. I now duplicate the desc_t between the two layers, 
and all OMPI datatypes now have their own desc_t.

Thanks for finding this issue and analyzing it so deeply.
  George.




On May 16, 2013, at 12:04 , KAWASHIMA Takahiro  
wrote:

> Hi,
> 
> I'm reading the datatype code in Open MPI trunk and have a question.
> A bit long.
> 
> See the following program.
> 
> 
> #include <stdio.h>
> #include <mpi.h>
> 
> struct opal_datatype_t;
> extern int opal_init(int *pargc, char ***pargv);
> extern int opal_finalize(void);
> extern void opal_datatype_dump(struct opal_datatype_t *type);
> extern struct opal_datatype_t opal_datatype_int8;
> 
> int main(int argc, char **argv)
> {
>     opal_init(NULL, NULL);
>     opal_datatype_dump(&opal_datatype_int8);
>     MPI_Init(NULL, NULL);
>     opal_datatype_dump(&opal_datatype_int8);
>     MPI_Finalize();
>     opal_finalize();
>     return 0;
> }
> 
> 
> All variables/functions declared as 'extern' are defined in OPAL.
> opal_datatype_dump() function outputs internal data of a datatype.
> I expect the same output on two opal_datatype_dump() calls.
> But when I run it on an x86_64 machine, I get the following output.
> 
> 
> ompi-trunk/opal-datatype-dump && ompiexec -n 1 ompi-trunk/opal-datatype-dump
> [ppc.rivis.jp:27886] Datatype 0x600c60[OPAL_INT8] size 8 align 8 id 7 length 
> 1 used 1
> true_lb 0 true_ub 8 (true_extent 8) lb 0 ub 8 (extent 8)
> nbElems 1 loops 0 flags 136 (commited contiguous )-cC---P-DB-[---][---]
>   contain OPAL_INT8
> --C---P-D--[---][---]  OPAL_INT8 count 1 disp 0x0 (0) extent 8 (size 8)
> No optimized description
> 
> [ppc.rivis.jp:27886] Datatype 0x600c60[OPAL_INT8] size 8 align 8 id 7 length 
> 1 used 1
> true_lb 0 true_ub 8 (true_extent 8) lb 0 ub 8 (extent 8)
> nbElems 1 loops 0 flags 136 (commited contiguous )-cC---P-DB-[---][---]
>   contain OPAL_INT8
> --C---P-D--[---][---]   count 1 disp 0x0 (0) extent 8 (size 
> 8971008)
> No optimized description
> 
> 
> The former output is what I expected. But the latter one is not
> identical to the former one and its content datatype has no name
> and a very large size.
> 
> This line is output by the opal_datatype_dump_data_desc() function in
> the opal/datatype/opal_datatype_dump.c file. It refers to
> opal_datatype_basicDatatypes[pDesc->elem.common.type]->name and
> opal_datatype_basicDatatypes[pDesc->elem.common.type]->size for
> the content datatype.
> 
> In this case, pDesc->elem.common.type is
> opal_datatype_int8.desc.desc[0].elem.common.type and is initialized to 7
> in the opal_datatype_init() function in the
> opal/datatype/opal_datatype_module.c file, which is called during the
> opal_init() function. opal_datatype_int8.desc.desc points to
> &opal_datatype_predefined_elem_desc[7*2].
> 
> But if we call the MPI_Init() function, the value is overwritten.
> The ompi_datatype_init() function in the
> ompi/datatype/ompi_datatype_module.c file, which is called during
> MPI_Init(), has a similar procedure to initialize OMPI datatypes.
> 
> On initializing ompi_mpi_aint in it, ompi_mpi_aint.dt.super.desc.desc
> points to &opal_datatype_predefined_elem_desc[7*2], which is also pointed
> to by opal_datatype_int8, because ompi_mpi_aint is defined by the
> OMPI_DATATYPE_INIT_PREDEFINED_BASIC_TYPE macro and it uses the
> OPAL_DATATYPE_INITIALIZER_INT8 macro. So
> opal_datatype_int8.desc.desc[0].elem.common.type is overwritten
> with 37.
> 
> Therefore in the second opal_datatype_dump() function call in my
> program,