
Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-17 Thread Jeff Squyres (jsquyres)

On Jul 17, 2013, at 11:07 AM, Nathan Hjelm  wrote:

> Ugh. That's unfortunate. I guess I could add a type_size.h and put the static
> inline function in there, then put the definitions of MPI_Type_size_x and
> MPI_Type_size in their own files. This way I can still avoid the extra code.

Or move the back-ends to ompi/datatype/foo.c.  We usually have very little 
heavy-lifting in the top-level ompi/mpi/c/*.c files.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-17 Thread Nathan Hjelm

On Wed, Jul 17, 2013 at 03:02:16PM +, Jeff Squyres (jsquyres) wrote:
> On Jul 17, 2013, at 10:48 AM, Nathan Hjelm  wrote:
> 
> > I must be missing something here. type_size.c contains MPI_Type_size and 
> > MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting 
> > .so, .dylib, and .a.
> 
> 
> If you have a nathan.c file with:
> 
> -
> void MPI_Foo() { ... }
> void MPI_Bar() { ... }
> -
> 
> This will result in defining both symbols in that nathan.o file, which ends 
> up in libmpi.so.
> 
> Then if someone writes a code like this:
> 
> -
> int main() {
> MPI_Init();
> MPI_Foo();
> MPI_Bar();
> MPI_Finalize();
> return 0;
> }
> -
> 
> And then they interpose their own version of MPI_Bar() with their 
> libinterposition.so, *it won't work* (meaning their version of MPI_Bar() 
> won't be called).  
> 
> This happens because the linker will first see MPI_Foo() in main and resolves 
> it.  When it resolves the MPI_Foo symbol, it pulls *all* symbols out of the 
> .o from where MPI_Foo came (i.e., nathan.o in libmpi.so) -- i.e., including 
> MPI_Bar.  
> 
> So when MPI_Bar goes to get executed, it's *already been resolved* to the one 
> in nathan.o/libmpi.so, not the one from libinterposition.so.
> 
> Even worse, if they reversed the order of foo/bar in main, then the linker 
> would likely give you a duplicate symbol error because it will first resolve 
> MPI_Bar from libinterposition.so, and then later resolve MPI_Foo from 
> libmpi.so, but it will also pull MPI_Bar from libmpi.so -- kaboom.
> 
> Linkers are insanely complicated.


Ugh. That's unfortunate. I guess I could add a type_size.h and put the static
inline function in there, then put the definitions of MPI_Type_size_x and
MPI_Type_size in their own files. This way I can still avoid the extra code.

-Nathan
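
A minimal sketch of the layout proposed above: one shared static inline
back-end in a header, and each exported function in its own file so that each
.o defines a single public symbol. The three files are collapsed into one
listing here so it compiles on its own. Every sketch_* name, the struct, and
the clamping to an undefined value are illustrative assumptions, not the
actual Open MPI code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* ---- type_size.h (hypothetical shared header) ---- */
typedef long long sketch_count_t;   /* stand-in for MPI_Count */
#define SKETCH_UNDEFINED (-32766)   /* value quoted in this thread for MPI_UNDEFINED */

typedef struct { size_t size; } sketch_type_t;  /* stand-in for a datatype */

/* Shared back-end: static inline, so it adds no exported symbol to any .o. */
static inline size_t sketch_type_size_backend(const sketch_type_t *type)
{
    assert(type != NULL);
    return type->size;
}

/* ---- type_size.c (would export only sketch_type_size) ---- */
int sketch_type_size(const sketch_type_t *type, int *size)
{
    size_t bytes = sketch_type_size_backend(type);
    /* Assumption: report "undefined" when the size does not fit in an int. */
    *size = (bytes > (size_t)INT_MAX) ? SKETCH_UNDEFINED : (int)bytes;
    return 0;
}

/* ---- type_size_x.c (would export only sketch_type_size_x) ---- */
int sketch_type_size_x(const sketch_type_t *type, sketch_count_t *size)
{
    *size = (sketch_count_t)sketch_type_size_backend(type);
    return 0;
}
```

With this split, overriding one of the two public functions does not drag the
other one along, because they live in different object files, while the
size logic itself is still written only once.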

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-17 Thread Jeff Squyres (jsquyres)

On Jul 17, 2013, at 10:48 AM, Nathan Hjelm  wrote:

> I must be missing something here. type_size.c contains MPI_Type_size and 
> MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting .so, 
> .dylib, and .a.

If you have a nathan.c file with:

-
void MPI_Foo() { ... }
void MPI_Bar() { ... }
-

This will result in defining both symbols in that nathan.o file, which ends up 
in libmpi.so.

Then if someone writes a code like this:

-
int main() {
MPI_Init();
MPI_Foo();
MPI_Bar();
MPI_Finalize();
return 0;
}
-

And then they interpose their own version of MPI_Bar() with their 
libinterposition.so, *it won't work* (meaning their version of MPI_Bar() won't 
be called).  

This happens because the linker will first see MPI_Foo() in main and resolves 
it.  When it resolves the MPI_Foo symbol, it pulls *all* symbols out of the .o 
from where MPI_Foo came (i.e., nathan.o in libmpi.so) -- i.e., including 
MPI_Bar.  

So when MPI_Bar goes to get executed, it's *already been resolved* to the one 
in nathan.o/libmpi.so, not the one from libinterposition.so.

Even worse, if they reversed the order of foo/bar in main, then the linker 
would likely give you a duplicate symbol error because it will first resolve 
MPI_Bar from libinterposition.so, and then later resolve MPI_Foo from 
libmpi.so, but it will also pull MPI_Bar from libmpi.so -- kaboom.

Linkers are insanely complicated.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-17 Thread Nathan Hjelm

On Tue, Jul 16, 2013 at 09:03:22PM +, Jeff Squyres (jsquyres) wrote:
> On Jul 16, 2013, at 4:54 PM, Nathan Hjelm  wrote:
> 
> >> 3. We had a policy that we only export one single MPI-level function per
> >> file in the mpi directory. You changed this, as some of the files now
> >> export two functions (the original function together with the _x version).
> > 
> > I was trying to avoid having too much duplicate code. If including both 
> > functions in the same file is not ok I will move the _x functions to their 
> > own .c files.
> 
> This is an unfortunate side-effect of the PMPI mandate (be able to override 
> any individual MPI symbol).  Because of the way linkers work, you can only 
> put 1 MPI symbol in any given .o file.  :-(

I must be missing something here. type_size.c contains MPI_Type_size and 
MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting .so, 
.dylib, and .a.

-Nathan

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca


On Jul 16, 2013, at 23:11 , "David Goodell (dgoodell)"  
wrote:

> On Jul 16, 2013, at 4:03 PM, George Bosilca 
> wrote:
> 
>> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres)  
>> wrote:
>> 
>>> On Jul 16, 2013, at 4:22 PM, George Bosilca  wrote:
>>> 
>>>> Btw, I have a question to you fellow MPI Forum attendees. I just can't
>>>> remember why the MPI forum felt there was a need for the
>>>> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
>>> 
>>> Yes, it can -- it has to be the largest integer type (i.e., it even has to 
>>> be able to handle an MPI_Offset).
>> 
>> Technicalities! In the entire standard MPI_Offset is only used to access 
>> files, not to build datatypes. As such there is no way to have the extent of 
>> an datatype bigger than MPI_Aint.
> 
> That's not true.  You can obtain a datatype with an extent outside the range
> of an MPI_Aint by nesting types.  Just create a contig of size 1, then create
> a type with a very large extent from your contig with MPI_Type_create_resized,
> then create a second contig of that resized type with a count >1.

Sure. But the only reason you create such a nested type is to access files 
(otherwise you can't go over the MPI_Aint boundary safely). Thus I would have 
expected the limit to be similar to MPI_Offset and not a new type MPI_Count …

Oh I see now. MPI_Aint is the largest difference in memory and MPI_Offset is 
the largest difference for files. Thus, MPI_Count is the largest of the two, so 
it can adapt in all cases. I'm happy with this conclusion … Thanks everyone.

  George.
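
The conclusion above (MPI_Count must be able to hold both the largest memory
difference and the largest file difference) can be sketched as a compile-time
check. The typedefs below are hypothetical stand-ins chosen for illustration;
a real MPI implementation picks these types at configure time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef ptrdiff_t sketch_aint_t;    /* stand-in for MPI_Aint (memory)  */
typedef int64_t   sketch_offset_t;  /* stand-in for MPI_Offset (files) */
typedef int64_t   sketch_count_t;   /* stand-in for MPI_Count          */

/* The count type must be at least as wide as both of the other two. */
static_assert(sizeof(sketch_count_t) >= sizeof(sketch_aint_t),
              "count type narrower than address-difference type");
static_assert(sizeof(sketch_count_t) >= sizeof(sketch_offset_t),
              "count type narrower than file-offset type");

/* Width in bytes that a count type needs in order to cover both. */
size_t sketch_required_count_bytes(void)
{
    size_t a = sizeof(sketch_aint_t);
    size_t o = sizeof(sketch_offset_t);
    return a > o ? a : o;
}
```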

> 
>> Thus, these accessors returning MPI_Count are a useless overkill, as they 
>> cannot offer more precision that what the version returning MPI_Aint is 
>> already offering.
>> 
>> George.
>> 
>> PS: I hope nobody has the idea to define the MPI_Offset as a signed type …
> 
> Not sure if you're joking here... MPI_Offset must also be signed, again, for 
> Fortran interoperability.
> 
> -Dave
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Jeff Squyres (jsquyres)

Er... changing that value will have ABI implications... :-(


On Jul 16, 2013, at 5:12 PM, Nathan Hjelm  wrote:

> Ugh, that isn't what I wanted to hear. MPI_Count can have the value of 
> MPI_UNDEFINED which we define as -32766. Do we have to redefine this value to 
> ensure there are no problems?
> 
> -Nathan
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Nathan Hjelm

On Tue, Jul 16, 2013 at 11:08:32PM +0200, George Bosilca wrote:
> 
> On Jul 16, 2013, at 23:03 , Nathan Hjelm  wrote:
> 
> > On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
> >> Nathan,
> >> 
> >> I read your code and it's definitively looking good. I have however few 
> >> minor issues with your patch.
> >> 
> >> 1. MPI_Aint is unsigned as it must represent the difference between two 
> >> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
> >> through size_t possibly reducing it's extent. I would suggest you used 
> >> ssize_t instead.
> >> 2. In several other locations size_t is used as a conversion base. In some 
> >> of these location there is even a comment talking about ssize_t ? 
> > 
> > I looked at the code in question and there shouldn't be an issue. Where we 
> > want to return MPI_Aint it is never converted to a size_t. The size_t is to 
> > ensure that if we return an MPI_Count that the value is never larger than 
> > SSIZE_MAX or negative. Am I wrong in assuming MPI_Count can never be 
> > negative?
> 
> Based on the standard it is both a size and a displacement (including 
> relative) in a file, so my understanding is that it can be negative.

Ugh, that isn't what I wanted to hear. MPI_Count can have the value of 
MPI_UNDEFINED which we define as -32766. Do we have to redefine this value to 
ensure there are no problems?

-Nathan

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca


On Jul 16, 2013, at 23:07 , "Jeff Squyres (jsquyres)"  
wrote:

> On Jul 16, 2013, at 5:03 PM, George Bosilca  wrote:
> 
>>> Yes, it can -- it has to be the largest integer type (i.e., it even has to 
>>> be able to handle an MPI_Offset).
>> 
>> Technicalities! In the entire standard MPI_Offset is only used to access 
>> files, not to build datatypes. As such there is no way to have the extent of 
>> an datatype bigger than MPI_Aint.
> 
> Datatypes are used in FILE_SET_VIEW.

Doesn't matter. There you don't create a datatype, you force one on the view 
you have of the file. I guess the forum was a little overzealous …

  George.


> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread David Goodell (dgoodell)

On Jul 16, 2013, at 4:03 PM, George Bosilca 
 wrote:

> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres)  
> wrote:
> 
>> On Jul 16, 2013, at 4:22 PM, George Bosilca  wrote:
>> 
>>> Btw, I have a question to you fellow MPI Forum attendees. I just can't 
>>> remember why the MPI forum felt there was a need for the 
>>> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
>> 
>> Yes, it can -- it has to be the largest integer type (i.e., it even has to 
>> be able to handle an MPI_Offset).
> 
> Technicalities! In the entire standard MPI_Offset is only used to access 
> files, not to build datatypes. As such there is no way to have the extent of 
> an datatype bigger than MPI_Aint.

That's not true.  You can obtain a datatype with an extent outside the range of
an MPI_Aint by nesting types.  Just create a contig of size 1, then create a
type with a very large extent from your contig with MPI_Type_create_resized,
then create a second contig of that resized type with a count >1.

> Thus, these accessors returning MPI_Count are a useless overkill, as they 
> cannot offer more precision that what the version returning MPI_Aint is 
> already offering.
> 
>  George.
> 
> PS: I hope nobody has the idea to define the MPI_Offset as a signed type …

Not sure if you're joking here... MPI_Offset must also be signed, again, for 
Fortran interoperability.

-Dave
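
The arithmetic behind the nesting argument above can be sketched without any
MPI calls. The sketch assumes a platform where the address-difference type is
32 bits and the count type is 64 bits. Both typedefs and both function names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t sketch_aint_t;   /* hypothetical 32-bit MPI_Aint  */
typedef int64_t sketch_count_t;  /* hypothetical 64-bit MPI_Count */

/* Extent of `count` back-to-back copies of a type whose extent is `extent`. */
sketch_count_t sketch_contig_extent(sketch_aint_t extent, int count)
{
    assert(extent >= 0 && count >= 0);
    /* Widen before multiplying so the product is computed in 64 bits. */
    return (sketch_count_t)extent * (sketch_count_t)count;
}

/* True when a value is representable in the 32-bit address-difference type. */
bool sketch_fits_in_aint(sketch_count_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

A resized extent of 2^30 fits in the 32-bit type, but a contig of four of
them has an extent of 2^32, which only the wider count type can report.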

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Nathan Hjelm

On Tue, Jul 16, 2013 at 11:03:27PM +0200, George Bosilca wrote:
> 
> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres)  
> wrote:
> 
> > On Jul 16, 2013, at 4:22 PM, George Bosilca  wrote:
> > 
> >> Btw, I have a question to you fellow MPI Forum attendees. I just can't 
> >> remember why the MPI forum felt there was a need for the 
> >> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
> > 
> > Yes, it can -- it has to be the largest integer type (i.e., it even has to 
> > be able to handle an MPI_Offset).
> 
> Technicalities! In the entire standard MPI_Offset is only used to access 
> files, not to build datatypes. As such there is no way to have the extent of 
> an datatype bigger than MPI_Aint. Thus, these accessors returning MPI_Count 
> are a useless overkill, as they cannot offer more precision that what the 
> version returning MPI_Aint is already offering.
> 
>   George.
> 
> PS: I hope nobody has the idea to define the MPI_Offset as a signed type …

Externally MPI_Offset is defined as a signed type (long long, long, or int) but
internally it is treated as unsigned. I will update MPI_Count to have the same
treatment (since it can be MPI_UNDEFINED, which is a negative number).

-Nathan

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca


On Jul 16, 2013, at 23:03 , Nathan Hjelm  wrote:

> On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
>> Nathan,
>> 
>> I read your code and it's definitively looking good. I have however few 
>> minor issues with your patch.
>> 
>> 1. MPI_Aint is unsigned as it must represent the difference between two 
>> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
>> through size_t possibly reducing it's extent. I would suggest you used 
>> ssize_t instead.
>> 2. In several other locations size_t is used as a conversion base. In some 
>> of these location there is even a comment talking about ssize_t ? 
> 
> I looked at the code in question and there shouldn't be an issue. Where we 
> want to return MPI_Aint it is never converted to a size_t. The size_t is to 
> ensure that if we return an MPI_Count that the value is never larger than 
> SSIZE_MAX or negative. Am I wrong in assuming MPI_Count can never be negative?

Based on the standard it is both a size and a displacement (including relative) 
in a file, so my understanding is that it can be negative.

  George.

> If so I can change the checks in MPI_Type_get_[true_]extent_x to not lose
> this value.
> 
> The other places that use size_t (MPI_Get_elements for example) are in places
> where I believe the value will never legally be negative so it is safe to
> assume the returned value should be MPI_UNDEFINED in those cases. Is there a 
> particular case I should look at?
> 
> -Nathan
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Jeff Squyres (jsquyres)

On Jul 16, 2013, at 5:03 PM, George Bosilca  wrote:

>> Yes, it can -- it has to be the largest integer type (i.e., it even has to 
>> be able to handle an MPI_Offset).
> 
> Technicalities! In the entire standard MPI_Offset is only used to access 
> files, not to build datatypes. As such there is no way to have the extent of 
> an datatype bigger than MPI_Aint.

Datatypes are used in FILE_SET_VIEW.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca

Apparently I just can't type that freaking word. Thanks Nathan for pointing out 
the truth ;)

  George.

On Jul 16, 2013, at 22:56 , Nathan Hjelm  wrote:

> I think you meant signed. It is signed in both configure.ac and 
> ompi_datatype_module.c.
> 
> -Nathan
> 
> On Tue, Jul 16, 2013 at 10:48:12PM +0200, George Bosilca wrote:
>> It's a typo, MPI_Aint is of course unsigned.
>> 
>>  George.
>> 
>> On Jul 16, 2013, at 22:37 , David Goodell (dgoodell)  
>> wrote:
>> 
>>> On Jul 16, 2013, at 3:22 PM, George Bosilca  wrote:
>>> 
>>>> I read your code and it's definitively looking good. I have however few
>>>> minor issues with your patch.
>>>> 
>>>> 1. MPI_Aint is unsigned as it must represent the difference between two
>>>> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go
>>>> through size_t possibly reducing it's extent. I would suggest you used
>>>> ssize_t instead.
>>> 
>>> MPI_Aint must be signed for Fortran compatibility (among other reasons).  
>>> If OMPI's MPI_Aint is unsigned then that's a bug in OMPI.
>>> 
>>> -Dave
>>> 
>>> 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Nathan Hjelm

On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
> Nathan,
> 
> I read your code and it's definitively looking good. I have however few minor 
> issues with your patch.
> 
> 1. MPI_Aint is unsigned as it must represent the difference between two 
> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
> through size_t possibly reducing it's extent. I would suggest you used 
> ssize_t instead.
> 2. In several other locations size_t is used as a conversion base. In some of 
> these location there is even a comment talking about ssize_t ? 

I looked at the code in question and there shouldn't be an issue. Where we want 
to return MPI_Aint it is never converted to a size_t. The size_t is to ensure 
that if we return an MPI_Count that the value is never larger than SSIZE_MAX or 
negative. Am I wrong in assuming MPI_Count can never be negative? If so I can
change the checks in MPI_Type_get_[true_]extent_x to not lose this value.

The other places that use size_t (MPI_Get_elements for example) are in places
where I believe the value will never legally be negative so it is safe to
assume the returned value should be MPI_UNDEFINED in those cases. Is there a 
particular case I should look at?

-Nathan
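
A minimal sketch of the check described above: an unsigned size is passed
through only if it fits in the signed count type, and is otherwise reported as
undefined. The names are hypothetical, the count type is assumed to be long
long, and -32766 is the value quoted in this thread for MPI_UNDEFINED.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef long long sketch_count_t;                 /* stand-in for MPI_Count */
#define SKETCH_UNDEFINED ((sketch_count_t)-32766) /* stand-in for MPI_UNDEFINED */

/* The comparison below widens size_t to unsigned long long. */
static_assert(sizeof(size_t) <= sizeof(unsigned long long),
              "size_t wider than unsigned long long");

/* Convert an unsigned size to a signed count, or report "undefined". */
sketch_count_t sketch_count_from_size(size_t value)
{
    if ((unsigned long long)value > (unsigned long long)LLONG_MAX) {
        return SKETCH_UNDEFINED;  /* would not fit in the signed type */
    }
    return (sketch_count_t)value;
}
```

Note that this scheme cannot tell a legitimate result of -32766 from the
undefined marker, which is why it matters whether the count may legally be
negative.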

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca


On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres)  wrote:

> On Jul 16, 2013, at 4:22 PM, George Bosilca  wrote:
> 
>> Btw, I have a question to you fellow MPI Forum attendees. I just can't 
>> remember why the MPI forum felt there was a need for the 
>> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
> 
> Yes, it can -- it has to be the largest integer type (i.e., it even has to be 
> able to handle an MPI_Offset).

Technicalities! In the entire standard MPI_Offset is only used to access files,
not to build datatypes. As such there is no way to have the extent of a
datatype bigger than MPI_Aint. Thus, these accessors returning MPI_Count are
useless overkill, as they cannot offer more precision than what the version
returning MPI_Aint already offers.

  George.

PS: I hope nobody has the idea to define the MPI_Offset as a signed type …


>> so I don't see what is the benefit of extending the 
>> MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and 
>> MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the 
>> corresponding _X versions?
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Jeff Squyres (jsquyres)

On Jul 16, 2013, at 4:54 PM, Nathan Hjelm  wrote:

>> 3. We had a policy that we only export one single MPI-level function per
>> file in the mpi directory. You changed this, as some of the files now export
>> two functions (the original function together with the _x version).
> 
> I was trying to avoid having too much duplicate code. If including both 
> functions in the same file is not ok I will move the _x functions to their 
> own .c files.

This is an unfortunate side-effect of the PMPI mandate (be able to override any 
individual MPI symbol).  Because of the way linkers work, you can only put 1 
MPI symbol in any given .o file.  :-(

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Nathan Hjelm

I think you meant signed. It is signed in both configure.ac and 
ompi_datatype_module.c.

-Nathan

On Tue, Jul 16, 2013 at 10:48:12PM +0200, George Bosilca wrote:
> It's a typo, MPI_Aint is of course unsigned.
> 
>   George.
> 
> On Jul 16, 2013, at 22:37 , David Goodell (dgoodell)  
> wrote:
> 
> > On Jul 16, 2013, at 3:22 PM, George Bosilca  wrote:
> > 
> >> I read your code and it's definitively looking good. I have however few 
> >> minor issues with your patch.
> >> 
> >> 1. MPI_Aint is unsigned as it must represent the difference between two 
> >> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
> >> through size_t possibly reducing it's extent. I would suggest you used 
> >> ssize_t instead.
> > 
> > MPI_Aint must be signed for Fortran compatibility (among other reasons).  
> > If OMPI's MPI_Aint is unsigned then that's a bug in OMPI.
> > 
> > -Dave
> > 
> > 
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Nathan Hjelm

On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
> Nathan,
> 
> I read your code and it's definitively looking good. I have however few minor 
> issues with your patch.
> 
> 1. MPI_Aint is unsigned as it must represent the difference between two 
> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
> through size_t possibly reducing it's extent. I would suggest you used 
> ssize_t instead.

Ah, yes. That is correct; I will fix that and update my repository now.

> 2. In several other locations size_t is used as a conversion base. In some of 
> these location there is even a comment talking about ssize_t ?

Will fix this as well.

> 3. We had a policy that we only export one single MPI-level function per file
> in the mpi directory. You changed this, as some of the files now export two
> functions (the original function together with the _x version).

I was trying to avoid having too much duplicate code. If including both 
functions in the same file is not ok I will move the _x functions to their own 
.c files.

> 4. In the OPAL datatype stuff sometimes you use size_t and sometimes ssize_t 
> for the same type of logic (set and get count as an example). Why?

I replaced uint32_t with size_t and int32_t with ssize_t to be consistent with 
the original code.

> 5. You change the comments in the opal_datatype.h with "question marks"? the 
> cache boundary must be known, it can't be somewhere between x-y bytes ago ?

The problem is size_t can be either 4 or 8 bytes so there are two possible 
places for the cache boundary. If you prefer I can change those to use int64_t 
and uint64_t instead so we will know where the cache boundaries are. (or leave 
them as 32-bit if that is the correct answer).
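
To illustrate why the boundary moves, here is a sketch with two hypothetical
descriptor headers. These are not the real opal_datatype_t fields. With
size_t members, the offset of whatever follows depends on the platform; with
fixed-width members it is the same everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header using size_t: 4-byte fields on ILP32, 8-byte on LP64. */
struct sketch_desc_native {
    size_t   size;
    size_t   nb_elems;
    uint32_t flags;
};

/* Same header with fixed-width fields: layout no longer depends on size_t. */
struct sketch_desc_fixed {
    uint64_t size;
    uint64_t nb_elems;
    uint32_t flags;
};

/* With fixed-width fields the offset can be checked at compile time. */
static_assert(offsetof(struct sketch_desc_fixed, flags) == 16,
              "flags expected 16 bytes into the fixed-width header");

size_t sketch_native_flags_offset(void)
{
    return offsetof(struct sketch_desc_native, flags);
}

size_t sketch_fixed_flags_offset(void)
{
    return offsetof(struct sketch_desc_fixed, flags);
}
```

In the size_t variant the flags field sits 8 bytes in on a 32-bit platform and
16 bytes in on a 64-bit one, which is the "two possible places" mentioned
above.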

> 6. I'm not sure the change of nbElems from uint32_t to size_t (in 
> opal/datatype/opal_datatype.h) is doing what you expect?

Admittedly, I changed the size of nbElems early on. I left it as 64-bit (32-bit 
on 32-bit platforms) to allow the creation of datatypes with more than 2^32 
elements. Not sure this scenario will ever occur though.

> 
> 
> Btw, I have a question to you fellow MPI Forum attendees. I just can't 
> remember why the MPI forum felt there was a need for the 
> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint, so I 
> don't see what is the benefit of extending the 
> MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and 
> MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the 
> corresponding _X versions?

It was before my involvement with the forum. Jeff knows better why this was done.

Thanks for taking a look. I will let you know when I have fixed the 
ssize_t/size_t issue.

-Nathan

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca

It's a typo, MPI_Aint is of course unsigned.

  George.

On Jul 16, 2013, at 22:37 , David Goodell (dgoodell)  wrote:

> On Jul 16, 2013, at 3:22 PM, George Bosilca  wrote:
> 
>> I read your code and it's definitively looking good. I have however few 
>> minor issues with your patch.
>> 
>> 1. MPI_Aint is unsigned as it must represent the difference between two 
>> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
>> through size_t possibly reducing it's extent. I would suggest you used 
>> ssize_t instead.
> 
> MPI_Aint must be signed for Fortran compatibility (among other reasons).  If 
> OMPI's MPI_Aint is unsigned then that's a bug in OMPI.
> 
> -Dave
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread David Goodell (dgoodell)

On Jul 16, 2013, at 3:22 PM, George Bosilca  wrote:

> I read your code and it's definitively looking good. I have however few minor 
> issues with your patch.
> 
> 1. MPI_Aint is unsigned as it must represent the difference between two 
> memory arbitrary locations. In your MPI_Type_get_[true_]extent_x you go 
> through size_t possibly reducing it's extent. I would suggest you used 
> ssize_t instead.

MPI_Aint must be signed for Fortran compatibility (among other reasons).  If 
OMPI's MPI_Aint is unsigned then that's a bug in OMPI.

-Dave

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Jeff Squyres (jsquyres)

On Jul 16, 2013, at 4:22 PM, George Bosilca  wrote:

> Btw, I have a question to you fellow MPI Forum attendees. I just can't 
> remember why the MPI forum felt there was a need for the 
> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,

Yes, it can -- it has to be the largest integer type (i.e., it even has to be 
able to handle an MPI_Offset).

> so I don't see what is the benefit of extending the 
> MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and 
> MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the 
> corresponding _X versions?


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread George Bosilca

Nathan,

I read your code and it's definitely looking good. I do, however, have a few
minor issues with your patch.

1. MPI_Aint is unsigned as it must represent the difference between two
arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go through
size_t, possibly reducing its extent. I would suggest you use ssize_t instead.
2. In several other locations size_t is used as a conversion base. In some of
these locations there is even a comment talking about ssize_t …
3. We had a policy that we only export one single MPI-level function per file
in the mpi directory. You changed this, as some of the files now export two
functions (the original function together with the _x version).
4. In the OPAL datatype stuff sometimes you use size_t and sometimes ssize_t
for the same type of logic (set and get count as an example). Why?
5. You changed the comments in opal_datatype.h to include question marks? The
cache boundary must be known; it can't be somewhere between x-y bytes ago …
6. I'm not sure the change of nbElems from uint32_t to size_t (in
opal/datatype/opal_datatype.h) is doing what you expect…

Btw, I have a question to you fellow MPI Forum attendees. I just can't remember 
why the MPI forum felt there was a need for the MPI_Type_get[_true]_extent_x? 
MPI_Count can't be bigger than MPI_Aint, so I don't see what is the benefit of 
extending the MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and 
MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the corresponding 
_X versions?

George.

On Jul 16, 2013, at 21:14 , Nathan Hjelm  wrote:

> What: Add support for the MPI-3 MPI_Count datatype and functions: 
> MPI_Get_elements_x, MPI_Status_set_elements_x, MPI_Type_get_extent_x, 
> MPI_Type_get_true_extent_x, and MPI_Type_size_x. This will be CMR'd to 1.7.3 
> if there are no objections.
> 
> Why: MPI_Count is required by the MPI 3.0 standard. This will add another 
> checkmark by MPI 3 support.
> 
> When: Setting a short timeout of one week (Tues, July 23, 2013). Most of the 
> changes add the new functionality but there are some changes that affect the 
> datatype engine.
> 
> Details follow.
> 
> -Nathan
> 
> Repository @ github: https://github.com/hjelmn/ompi-count.git
> 
> Relevant commits:
> General support: 
> https://github.com/hjelmn/ompi-count/commit/db54d13404a241642fa783d5b3cc74edcb1103f2
> Fortran support: 
> https://github.com/hjelmn/ompi-count/commit/293adf84be52c2cd8acfe31be19cfe0afe14752d
> Others: 
> https://github.com/hjelmn/ompi-count/commit/6c6ca8e539da675632c249c891ff93fdbc9d8de8
>
> https://github.com/hjelmn/ompi-count/commit/9638ef1f245f12bb98abbf5f47e1ecfd1a018862
>
> https://github.com/hjelmn/ompi-count/commit/e158aa152d122e554b89498f5a71284ce1361a99
> 
> Add support for MPI_Count type and MPI_COUNT datatype and add the required
> MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x,
> MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x.
> This commit adds only the C bindings. Fortran bindings will be added in
> another commit. For now the MPI_Count type is defined to have the same size
> as MPI_Offset. The type is required to be at least as large as MPI_Offset
> and MPI_Aint. The type was initially intended to be a ssize_t (if it was
> the same size as a long long) but there were issues compiling romio with
> that definition (despite the inclusion of stddef.h).
> 
> I updated the datatype engine to use size_t instead of uint32_t to support
> large datatypes. This will require some review to make sure that 1) the
> changes are beneficial, 2) nothing was broken by the change (I doubt
> anything was), and 3) there are no performance regressions due to this
> change.
> 
> George, please look over these changes and let me know if you see anything 
> wrong with my updates to the datatype engine.
> 
> -Nathan
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

[OMPI devel] RFC: add support for large counts using derived datatypes

2013-07-16 Thread Nathan Hjelm

What: Add support for the MPI-3 MPI_Count datatype and functions: 
MPI_Get_elements_x, MPI_Status_set_elements_x, MPI_Type_get_extent_x, 
MPI_Type_get_true_extent_x, and MPI_Type_size_x. This will be CMR'd to 1.7.3 if 
there are no objections.

Why: MPI_Count is required by the MPI 3.0 standard. This will add another 
checkmark by MPI 3 support.

When: Setting a short timeout of one week (Tues, July 23, 2013). Most of the 
changes add the new functionality but there are some changes that affect the 
datatype engine.

Details follow.

-Nathan

Repository @ github: https://github.com/hjelmn/ompi-count.git

Relevant commits:
General support: 
https://github.com/hjelmn/ompi-count/commit/db54d13404a241642fa783d5b3cc74edcb1103f2
Fortran support: 
https://github.com/hjelmn/ompi-count/commit/293adf84be52c2cd8acfe31be19cfe0afe14752d
Others: 
https://github.com/hjelmn/ompi-count/commit/6c6ca8e539da675632c249c891ff93fdbc9d8de8

https://github.com/hjelmn/ompi-count/commit/9638ef1f245f12bb98abbf5f47e1ecfd1a018862

https://github.com/hjelmn/ompi-count/commit/e158aa152d122e554b89498f5a71284ce1361a99

Add support for MPI_Count type and MPI_COUNT datatype and add the required
MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x,
MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x.
This commit adds only the C bindings. Fortran bindings will be added in
another commit. For now the MPI_Count type is defined to have the same size
as MPI_Offset. The type is required to be at least as large as MPI_Offset
and MPI_Aint. The type was initially intended to be a ssize_t (if it was
the same size as a long long) but there were issues compiling romio with
that definition (despite the inclusion of stddef.h).

I updated the datatype engine to use size_t instead of uint32_t to support
large datatypes. This will require some review to make sure that 1) the
changes are beneficial, 2) nothing was broken by the change (I doubt
anything was), and 3) there are no performance regressions due to this
change.

George, please look over these changes and let me know if you see anything 
wrong with my updates to the datatype engine.

-Nathan
