Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 17, 2013, at 11:07 AM, Nathan Hjelm wrote:

> Ugh. That's unfortunate. I guess I could add a type_size.h and put the static
> inline function in there, then put the definitions of MPI_Type_size_x and
> MPI_Type_size in their own files. This way I can still avoid the extra code.

Or move the back-ends to ompi/datatype/foo.c. We usually have very little heavy-lifting in the top-level ompi/mpi/c/*.c files.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Wed, Jul 17, 2013 at 03:02:16PM +, Jeff Squyres (jsquyres) wrote:
> On Jul 17, 2013, at 10:48 AM, Nathan Hjelm wrote:
>
> > I must be missing something here. type_size.c contains MPI_Type_size and
> > MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting
> > .so, .dylib, and .a.
>
> If you have a nathan.c file with:
>
> -----
> void MPI_Foo() { ... }
> void MPI_Bar() { ... }
> -----
>
> This will result in defining both symbols in that nathan.o file, which ends
> up in libmpi.so.
>
> Then if someone writes a code like this:
>
> -----
> int main() {
>     MPI_Init();
>     MPI_Foo();
>     MPI_Bar();
>     MPI_Finalize();
>     return 0;
> }
> -----
>
> And then they interpose their own version of MPI_Bar() with their
> libinterposition.so, *it won't work* (meaning their version of MPI_Bar()
> won't be called).
>
> This happens because the linker will first see MPI_Foo() in main and resolve
> it. When it resolves the MPI_Foo symbol, it pulls *all* symbols out of the
> .o from where MPI_Foo came (i.e., nathan.o in libmpi.so) -- i.e., including
> MPI_Bar.
>
> So when MPI_Bar goes to get executed, it's *already been resolved* to the one
> in nathan.o/libmpi.so, not the one from libinterposition.so.
>
> Even worse, if they reversed the order of foo/bar in main, then the linker
> would likely give you a duplicate symbol error because it will first resolve
> MPI_Bar from libinterposition.so, and then later resolve MPI_Foo from
> libmpi.so, but it will also pull MPI_Bar from libmpi.so -- kaboom.
>
> Linkers are insanely complicated.

Ugh. That's unfortunate. I guess I could add a type_size.h and put the static inline function in there, then put the definitions of MPI_Type_size_x and MPI_Type_size in their own files. This way I can still avoid the extra code.

-Nathan
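A minimal sketch of the layout Nathan describes above: one shared static inline helper in a header, and exactly one exported function per source file. It is not the Open MPI code. The names `demo_count_t`, `demo_datatype_t`, `demo_type_size_common`, `Demo_Type_size`, and `Demo_Type_size_x` are hypothetical stand-ins so the sketch compiles without mpi.h, and the three "files" are shown in one block with comments marking where each part would live.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Stand-in types (assumptions); the real ones come from mpi.h. */
typedef long long demo_count_t;
typedef struct { size_t size; } demo_datatype_t;

/* Value the thread quotes for MPI_UNDEFINED in Open MPI. */
#define DEMO_UNDEFINED (-32766)

/* ---- would live in a shared header such as type_size.h ---- */
static inline demo_count_t demo_type_size_common(const demo_datatype_t *type)
{
    assert(type != NULL);
    return (demo_count_t) type->size;
}

/* ---- would be the only exported symbol in type_size.c ---- */
int Demo_Type_size(const demo_datatype_t *type, int *size)
{
    demo_count_t full = demo_type_size_common(type);
    /* The int variant reports "undefined" when the size does not fit. */
    *size = (full > INT_MAX) ? DEMO_UNDEFINED : (int) full;
    return 0;
}

/* ---- would be the only exported symbol in type_size_x.c ---- */
int Demo_Type_size_x(const demo_datatype_t *type, demo_count_t *size)
{
    *size = demo_type_size_common(type);
    return 0;
}
```

Because the helper is static inline, each object file gets its own private copy and exports only its single wrapper, so the logic is written once while the one-symbol-per-object policy is kept.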
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 17, 2013, at 10:48 AM, Nathan Hjelm wrote:

> I must be missing something here. type_size.c contains MPI_Type_size and
> MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting .so,
> .dylib, and .a.

If you have a nathan.c file with:

-----
void MPI_Foo() { ... }
void MPI_Bar() { ... }
-----

This will result in defining both symbols in that nathan.o file, which ends up in libmpi.so.

Then if someone writes a code like this:

-----
int main() {
    MPI_Init();
    MPI_Foo();
    MPI_Bar();
    MPI_Finalize();
    return 0;
}
-----

And then they interpose their own version of MPI_Bar() with their libinterposition.so, *it won't work* (meaning their version of MPI_Bar() won't be called).

This happens because the linker will first see MPI_Foo() in main and resolve it. When it resolves the MPI_Foo symbol, it pulls *all* symbols out of the .o from where MPI_Foo came (i.e., nathan.o in libmpi.so) -- i.e., including MPI_Bar.

So when MPI_Bar goes to get executed, it's *already been resolved* to the one in nathan.o/libmpi.so, not the one from libinterposition.so.

Even worse, if they reversed the order of foo/bar in main, then the linker would likely give you a duplicate symbol error because it will first resolve MPI_Bar from libinterposition.so, and then later resolve MPI_Foo from libmpi.so, but it will also pull MPI_Bar from libmpi.so -- kaboom.

Linkers are insanely complicated.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Tue, Jul 16, 2013 at 09:03:22PM +, Jeff Squyres (jsquyres) wrote:
> On Jul 16, 2013, at 4:54 PM, Nathan Hjelm wrote:
>
> >> 3. We had a policy that we only export one single MPI level function per
> >> file in the mpi directory. You changed this, as some of the files now
> >> export two functions (the original function together with the _x version).
> >
> > I was trying to avoid having too much duplicate code. If including both
> > functions in the same file is not ok I will move the _x functions to their
> > own .c files.
>
> This is an unfortunate side-effect of the PMPI mandate (be able to override
> any individual MPI symbol). Because of the way linkers work, you can only
> put 1 MPI symbol in any given .o file. :-(

I must be missing something here. type_size.c contains MPI_Type_size and MPI_Type_size_x and I see all the MPI and PMPI variants in the resulting .so, .dylib, and .a.

-Nathan
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 23:11 , "David Goodell (dgoodell)" wrote:

> On Jul 16, 2013, at 4:03 PM, George Bosilca wrote:
>
>> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres) wrote:
>>
>>> On Jul 16, 2013, at 4:22 PM, George Bosilca wrote:
>>>
>>>> Btw, I have a question to you fellow MPI Forum attendees. I just can't
>>>> remember why the MPI forum felt there was a need for the
>>>> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
>>>
>>> Yes, it can -- it has to be the largest integer type (i.e., it even has to
>>> be able to handle an MPI_Offset).
>>
>> Technicalities! In the entire standard MPI_Offset is only used to access
>> files, not to build datatypes. As such there is no way to have the extent of
>> a datatype bigger than MPI_Aint.
>
> That's not true. You can obtain a datatype with an extent outside the range
> of an MPI_Aint by nesting types. Just create a contig of size 1, then create
> a type with a very large extent from your contig with MPI_Type_create_resized,
> then create a second contig of that resized type with a count >1.

Sure. But the only reason you create such a nested type is to access files (otherwise you can't go over the MPI_Aint boundary safely). Thus I would have expected the limit to be similar to MPI_Offset and not a new type MPI_Count …

Oh I see now. MPI_Aint is the largest difference in memory and MPI_Offset is the largest difference for files. Thus, MPI_Count is the largest of the two, so it can adapt in all cases.

I'm happy with this conclusion … Thanks everyone.

George.

>> Thus, these accessors returning MPI_Count are a useless overkill, as they
>> cannot offer more precision than what the version returning MPI_Aint is
>> already offering.
>>
>> George.
>>
>> PS: I hope nobody has the idea to define the MPI_Offset as a signed type …
>
> Not sure if you're joking here... MPI_Offset must also be signed, again, for
> Fortran interoperability.
>
> -Dave
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
Er... changing that value will have ABI implications... :-(

On Jul 16, 2013, at 5:12 PM, Nathan Hjelm wrote:

> Ugh, that isn't what I wanted to hear. MPI_Count can have the value of
> MPI_UNDEFINED which we define as -32766. Do we have to redefine this value to
> ensure there are no problems?
>
> -Nathan

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Tue, Jul 16, 2013 at 11:08:32PM +0200, George Bosilca wrote:
>
> On Jul 16, 2013, at 23:03 , Nathan Hjelm wrote:
>
> > On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
> >> Nathan,
> >>
> >> I read your code and it's definitely looking good. I have, however, a few
> >> minor issues with your patch.
> >>
> >> 1. MPI_Aint is unsigned as it must represent the difference between two
> >> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
> >> through size_t, possibly reducing its extent. I would suggest you use
> >> ssize_t instead.
> >> 2. In several other locations size_t is used as a conversion base. In some
> >> of these locations there is even a comment talking about ssize_t …
> >
> > I looked at the code in question and there shouldn't be an issue. Where we
> > want to return MPI_Aint it is never converted to a size_t. The size_t is to
> > ensure that if we return an MPI_Count that the value is never larger than
> > SSIZE_MAX or negative. Am I wrong in assuming MPI_Count can never be
> > negative?
>
> Based on the standard it is both a size and a displacement (including
> relative) in a file, so my understanding is that it can be negative.

Ugh, that isn't what I wanted to hear. MPI_Count can have the value of MPI_UNDEFINED which we define as -32766. Do we have to redefine this value to ensure there are no problems?

-Nathan
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 23:07 , "Jeff Squyres (jsquyres)" wrote:

> On Jul 16, 2013, at 5:03 PM, George Bosilca wrote:
>
>>> Yes, it can -- it has to be the largest integer type (i.e., it even has to
>>> be able to handle an MPI_Offset).
>>
>> Technicalities! In the entire standard MPI_Offset is only used to access
>> files, not to build datatypes. As such there is no way to have the extent of
>> a datatype bigger than MPI_Aint.
>
> Datatypes are used in FILE_SET_VIEW.

Doesn't matter. There you don't create a datatype, you force one on the view you have of the file. I guess the forum was a little overzealous …

George.
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 4:03 PM, George Bosilca wrote:

> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres) wrote:
>
>> On Jul 16, 2013, at 4:22 PM, George Bosilca wrote:
>>
>>> Btw, I have a question to you fellow MPI Forum attendees. I just can't
>>> remember why the MPI forum felt there was a need for the
>>> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
>>
>> Yes, it can -- it has to be the largest integer type (i.e., it even has to
>> be able to handle an MPI_Offset).
>
> Technicalities! In the entire standard MPI_Offset is only used to access
> files, not to build datatypes. As such there is no way to have the extent of
> a datatype bigger than MPI_Aint.

That's not true. You can obtain a datatype with an extent outside the range of an MPI_Aint by nesting types. Just create a contig of size 1, then create a type with a very large extent from your contig with MPI_Type_create_resized, then create a second contig of that resized type with a count >1.

> Thus, these accessors returning MPI_Count are a useless overkill, as they
> cannot offer more precision than what the version returning MPI_Aint is
> already offering.
>
> George.
>
> PS: I hope nobody has the idea to define the MPI_Offset as a signed type …

Not sure if you're joking here... MPI_Offset must also be signed, again, for Fortran interoperability.

-Dave
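The arithmetic behind Dave's nesting argument can be sketched without MPI. This assumes a platform where the address-sized integer is 32 bits and the count type is 64 bits; `demo_aint_t`, `demo_count_t`, `demo_contig_extent`, and `demo_fits_in_aint` are hypothetical names, and the extent of a contiguous type is modelled simply as count times element extent.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t demo_aint_t;   /* assumption: 32-bit address-sized integer */
typedef int64_t demo_count_t;  /* assumption: 64-bit count type */

/* Extent of `count` back-to-back copies of an element of extent `elem_extent`. */
static demo_count_t demo_contig_extent(demo_count_t elem_extent, demo_count_t count)
{
    assert(elem_extent >= 0 && count >= 0);
    /* Guard against overflowing the 64-bit stand-in itself. */
    assert(count == 0 || elem_extent <= INT64_MAX / count);
    return elem_extent * count;
}

/* Nonzero when `value` is representable in the 32-bit stand-in for MPI_Aint. */
static int demo_fits_in_aint(demo_count_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

With a resized extent of 2^30 bytes, the resized type itself is still representable in the 32-bit stand-in, but a contiguous type of 4 of them has extent 2^32, which only the wider count type can report.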
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Tue, Jul 16, 2013 at 11:03:27PM +0200, George Bosilca wrote:
>
> On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres) wrote:
>
> > On Jul 16, 2013, at 4:22 PM, George Bosilca wrote:
> >
> >> Btw, I have a question to you fellow MPI Forum attendees. I just can't
> >> remember why the MPI forum felt there was a need for the
> >> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
> >
> > Yes, it can -- it has to be the largest integer type (i.e., it even has to
> > be able to handle an MPI_Offset).
>
> Technicalities! In the entire standard MPI_Offset is only used to access
> files, not to build datatypes. As such there is no way to have the extent of
> a datatype bigger than MPI_Aint. Thus, these accessors returning MPI_Count
> are a useless overkill, as they cannot offer more precision than what the
> version returning MPI_Aint is already offering.
>
> George.
>
> PS: I hope nobody has the idea to define the MPI_Offset as a signed type …

Externally MPI_Offset is defined as a signed type (long long, long, or int) but internally it is treated as unsigned. I will update MPI_Count to have the same treatment (since it can be MPI_UNDEFINED, which is a negative number).

-Nathan
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 23:03 , Nathan Hjelm wrote:

> On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
>> Nathan,
>>
>> I read your code and it's definitely looking good. I have, however, a few
>> minor issues with your patch.
>>
>> 1. MPI_Aint is unsigned as it must represent the difference between two
>> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
>> through size_t, possibly reducing its extent. I would suggest you use
>> ssize_t instead.
>> 2. In several other locations size_t is used as a conversion base. In some
>> of these locations there is even a comment talking about ssize_t …
>
> I looked at the code in question and there shouldn't be an issue. Where we
> want to return MPI_Aint it is never converted to a size_t. The size_t is to
> ensure that if we return an MPI_Count that the value is never larger than
> SSIZE_MAX or negative. Am I wrong in assuming MPI_Count can never be negative?

Based on the standard it is both a size and a displacement (including relative) in a file, so my understanding is that it can be negative.

George.

> If so I can change the checks in MPI_Type_get_[true_]extent_x to not lose
> this value.
>
> The other places that use size_t (MPI_Get_elements for example) are in places
> where I believe the value will never legally be negative so it is safe to
> assume the returned value should be MPI_UNDEFINED in those cases. Is there a
> particular case I should look at?
>
> -Nathan
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 5:03 PM, George Bosilca wrote:

>> Yes, it can -- it has to be the largest integer type (i.e., it even has to
>> be able to handle an MPI_Offset).
>
> Technicalities! In the entire standard MPI_Offset is only used to access
> files, not to build datatypes. As such there is no way to have the extent of
> a datatype bigger than MPI_Aint.

Datatypes are used in FILE_SET_VIEW.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
Apparently I just can't type that freaking word. Thanks Nathan for pointing out the truth ;)

George.

On Jul 16, 2013, at 22:56 , Nathan Hjelm wrote:

> I think you meant signed. It is signed in both configure.ac and
> ompi_datatype_module.c.
>
> -Nathan
>
> On Tue, Jul 16, 2013 at 10:48:12PM +0200, George Bosilca wrote:
>> It's a typo, MPI_Aint is of course unsigned.
>>
>> George.
>>
>> On Jul 16, 2013, at 22:37 , David Goodell (dgoodell) wrote:
>>
>>> On Jul 16, 2013, at 3:22 PM, George Bosilca wrote:
>>>
>>>> I read your code and it's definitely looking good. I have, however, a few
>>>> minor issues with your patch.
>>>>
>>>> 1. MPI_Aint is unsigned as it must represent the difference between two
>>>> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
>>>> through size_t, possibly reducing its extent. I would suggest you use
>>>> ssize_t instead.
>>>
>>> MPI_Aint must be signed for Fortran compatibility (among other reasons).
>>> If OMPI's MPI_Aint is unsigned then that's a bug in OMPI.
>>>
>>> -Dave
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
> Nathan,
>
> I read your code and it's definitely looking good. I have, however, a few
> minor issues with your patch.
>
> 1. MPI_Aint is unsigned as it must represent the difference between two
> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
> through size_t, possibly reducing its extent. I would suggest you use
> ssize_t instead.
> 2. In several other locations size_t is used as a conversion base. In some of
> these locations there is even a comment talking about ssize_t …

I looked at the code in question and there shouldn't be an issue. Where we want to return MPI_Aint it is never converted to a size_t. The size_t is to ensure that if we return an MPI_Count that the value is never larger than SSIZE_MAX or negative. Am I wrong in assuming MPI_Count can never be negative? If so I can change the checks in MPI_Type_get_[true_]extent_x to not lose this value.

The other places that use size_t (MPI_Get_elements for example) are in places where I believe the value will never legally be negative so it is safe to assume the returned value should be MPI_UNDEFINED in those cases. Is there a particular case I should look at?

-Nathan
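A minimal sketch of the kind of range check Nathan describes, assuming a 64-bit signed count type. `demo_count_t`, `DEMO_COUNT_MAX`, and `demo_size_to_count` are hypothetical names, not identifiers from the patch; `DEMO_UNDEFINED` uses the -32766 value quoted elsewhere in this thread for MPI_UNDEFINED.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t demo_count_t;          /* hypothetical stand-in for MPI_Count */
#define DEMO_COUNT_MAX  INT64_MAX
#define DEMO_UNDEFINED  (-32766)       /* value quoted in the thread */

/* The sentinel is negative, so the count type has to be signed. */
static_assert((demo_count_t) -1 < 0, "count stand-in must be signed");

/* Convert an unsigned internal size to the signed count type, reporting
 * "undefined" when the value cannot be represented. */
static demo_count_t demo_size_to_count(size_t size)
{
    if ((uintmax_t) size > (uintmax_t) DEMO_COUNT_MAX) {
        return DEMO_UNDEFINED;
    }
    return (demo_count_t) size;
}
```

Working from an unsigned input means the result is either a non-negative count or the sentinel. That is only unambiguous as long as a legitimate result can never be negative, which is the assumption questioned in this exchange.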
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 22:29 , Jeff Squyres (jsquyres) wrote:

> On Jul 16, 2013, at 4:22 PM, George Bosilca wrote:
>
>> Btw, I have a question to you fellow MPI Forum attendees. I just can't
>> remember why the MPI forum felt there was a need for the
>> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,
>
> Yes, it can -- it has to be the largest integer type (i.e., it even has to be
> able to handle an MPI_Offset).

Technicalities! In the entire standard MPI_Offset is only used to access files, not to build datatypes. As such there is no way to have the extent of a datatype bigger than MPI_Aint. Thus, these accessors returning MPI_Count are a useless overkill, as they cannot offer more precision than what the version returning MPI_Aint is already offering.

George.

PS: I hope nobody has the idea to define the MPI_Offset as a signed type …

>> so I don't see what is the benefit of extending the
>> MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and
>> MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the
>> corresponding _X versions?
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 4:54 PM, Nathan Hjelm wrote:

>> 3. We had a policy that we only export one single MPI level function per
>> file in the mpi directory. You changed this, as some of the files now export
>> two functions (the original function together with the _x version).
>
> I was trying to avoid having too much duplicate code. If including both
> functions in the same file is not ok I will move the _x functions to their
> own .c files.

This is an unfortunate side-effect of the PMPI mandate (be able to override any individual MPI symbol). Because of the way linkers work, you can only put 1 MPI symbol in any given .o file. :-(

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
I think you meant signed. It is signed in both configure.ac and ompi_datatype_module.c.

-Nathan

On Tue, Jul 16, 2013 at 10:48:12PM +0200, George Bosilca wrote:
> It's a typo, MPI_Aint is of course unsigned.
>
> George.
>
> On Jul 16, 2013, at 22:37 , David Goodell (dgoodell) wrote:
>
> > On Jul 16, 2013, at 3:22 PM, George Bosilca wrote:
> >
> >> I read your code and it's definitely looking good. I have, however, a few
> >> minor issues with your patch.
> >>
> >> 1. MPI_Aint is unsigned as it must represent the difference between two
> >> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
> >> through size_t, possibly reducing its extent. I would suggest you use
> >> ssize_t instead.
> >
> > MPI_Aint must be signed for Fortran compatibility (among other reasons).
> > If OMPI's MPI_Aint is unsigned then that's a bug in OMPI.
> >
> > -Dave
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Tue, Jul 16, 2013 at 10:22:33PM +0200, George Bosilca wrote:
> Nathan,
>
> I read your code and it's definitely looking good. I have, however, a few
> minor issues with your patch.
>
> 1. MPI_Aint is unsigned as it must represent the difference between two
> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
> through size_t, possibly reducing its extent. I would suggest you use
> ssize_t instead.

Ah, yes. That is correct. I will fix that and update my repository now.

> 2. In several other locations size_t is used as a conversion base. In some of
> these locations there is even a comment talking about ssize_t …

Will fix this as well.

> 3. We had a policy that we only export one single MPI level function per file
> in the mpi directory. You changed this, as some of the files now export two
> functions (the original function together with the _x version).

I was trying to avoid having too much duplicate code. If including both functions in the same file is not ok I will move the _x functions to their own .c files.

> 4. In the OPAL datatype stuff sometimes you use size_t and sometimes ssize_t
> for the same type of logic (set and get count as an example). Why?

I replaced uint32_t with size_t and int32_t with ssize_t to be consistent with the original code.

> 5. You changed the comments in opal_datatype.h to include "question marks"?
> The cache boundary must be known; it can't be somewhere between x-y bytes
> ago …

The problem is size_t can be either 4 or 8 bytes, so there are two possible places for the cache boundary. If you prefer I can change those to use int64_t and uint64_t instead so we will know where the cache boundaries are (or leave them as 32-bit if that is the correct answer).

> 6. I'm not sure the change of nbElems from uint32_t to size_t (in
> opal/datatype/opal_datatype.h) is doing what you expect…

Admittedly, I changed the size of nbElems early on. I left it as 64-bit (32-bit on 32-bit platforms) to allow the creation of datatypes with more than 2^32 elements. Not sure this scenario will ever occur though.

> Btw, I have a question to you fellow MPI Forum attendees. I just can't
> remember why the MPI forum felt there was a need for the
> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint, so I
> don't see what is the benefit of extending the
> MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and
> MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the
> corresponding _X versions?

It was before my involvement with the forum. Jeff knows better why this was done.

Thanks for taking a look. I will let you know when I have fixed the ssize_t/size_t issue.

-Nathan
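The cache-boundary point in item 5 can be illustrated with a small layout sketch. The two structs below are hypothetical and are not the real opal_datatype_t; they only show how field offsets move with the width of size_t, and how fixed-width fields plus explicit padding pin them down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor using size_t: offsets depend on the platform. */
struct demo_desc_native {
    uint16_t flags;
    uint16_t id;
    size_t   count;   /* 4 bytes on ILP32, 8 bytes on LP64 */
    size_t   size;
};

/* Same fields with fixed widths and explicit padding. */
struct demo_desc_fixed {
    uint16_t flags;
    uint16_t id;
    uint32_t pad;     /* nothing is left to the ABI */
    uint64_t count;
    uint64_t size;
};

/* With fixed widths the layout can be stated, and checked, at compile time. */
static_assert(offsetof(struct demo_desc_fixed, count) == 8, "count starts at byte 8");
static_assert(offsetof(struct demo_desc_fixed, size) == 16, "size starts at byte 16");
static_assert(sizeof(struct demo_desc_fixed) == 24, "descriptor is 24 bytes");

/* With size_t the answers differ between 32-bit and 64-bit builds. */
size_t demo_native_count_offset(void)
{
    return offsetof(struct demo_desc_native, count);
}

size_t demo_native_total_size(void)
{
    return sizeof(struct demo_desc_native);
}
```

On common ABIs the native variant puts `count` at offset 4 and is 12 bytes long in a 32-bit build, versus offset 8 and 24 bytes in a 64-bit build, which is why a comment marking a cache boundary cannot name a single position for it.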
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
It's a typo, MPI_Aint is of course unsigned.

George.

On Jul 16, 2013, at 22:37 , David Goodell (dgoodell) wrote:

> On Jul 16, 2013, at 3:22 PM, George Bosilca wrote:
>
>> I read your code and it's definitely looking good. I have, however, a few
>> minor issues with your patch.
>>
>> 1. MPI_Aint is unsigned as it must represent the difference between two
>> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
>> through size_t, possibly reducing its extent. I would suggest you use
>> ssize_t instead.
>
> MPI_Aint must be signed for Fortran compatibility (among other reasons). If
> OMPI's MPI_Aint is unsigned then that's a bug in OMPI.
>
> -Dave
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 3:22 PM, George Bosilca wrote:

> I read your code and it's definitely looking good. I have, however, a few
> minor issues with your patch.
>
> 1. MPI_Aint is unsigned as it must represent the difference between two
> arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go
> through size_t, possibly reducing its extent. I would suggest you use
> ssize_t instead.

MPI_Aint must be signed for Fortran compatibility (among other reasons). If OMPI's MPI_Aint is unsigned then that's a bug in OMPI.

-Dave
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
On Jul 16, 2013, at 4:22 PM, George Bosilca wrote:

> Btw, I have a question to you fellow MPI Forum attendees. I just can't
> remember why the MPI forum felt there was a need for the
> MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint,

Yes, it can -- it has to be the largest integer type (i.e., it even has to be able to handle an MPI_Offset).

> so I don't see what is the benefit of extending the
> MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and
> MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the
> corresponding _X versions?

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] RFC: add support for large counts using derived datatypes
Nathan,

I read your code and it's definitely looking good. I have, however, a few minor issues with your patch.

1. MPI_Aint is unsigned as it must represent the difference between two arbitrary memory locations. In your MPI_Type_get_[true_]extent_x you go through size_t, possibly reducing its extent. I would suggest you use ssize_t instead.
2. In several other locations size_t is used as a conversion base. In some of these locations there is even a comment talking about ssize_t …
3. We had a policy that we only export one single MPI level function per file in the mpi directory. You changed this, as some of the files now export two functions (the original function together with the _x version).
4. In the OPAL datatype stuff sometimes you use size_t and sometimes ssize_t for the same type of logic (set and get count as an example). Why?
5. You changed the comments in opal_datatype.h to include "question marks"? The cache boundary must be known; it can't be somewhere between x-y bytes ago …
6. I'm not sure the change of nbElems from uint32_t to size_t (in opal/datatype/opal_datatype.h) is doing what you expect…

Btw, I have a question to you fellow MPI Forum attendees. I just can't remember why the MPI forum felt there was a need for the MPI_Type_get[_true]_extent_x? MPI_Count can't be bigger than MPI_Aint, so I don't see what is the benefit of extending the MPI_Type_get_true_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) and MPI_Type_get_extent(MPI_Datatype, MPI_Aint*, MPI_Aint*) with the corresponding _X versions?

George.

On Jul 16, 2013, at 21:14 , Nathan Hjelm wrote:

> What: Add support for the MPI-3 MPI_Count datatype and functions:
> MPI_Get_elements_x, MPI_Status_set_elements_x, MPI_Type_get_extent_x,
> MPI_Type_get_true_extent_x, and MPI_Type_size_x. This will be CMR'd to 1.7.3
> if there are no objections.
>
> Why: MPI_Count is required by the MPI 3.0 standard. This will add another
> checkmark by MPI 3 support.
>
> When: Setting a short timeout of one week (Tues, July 23, 2013). Most of the
> changes add the new functionality but there are some changes that affect the
> datatype engine.
>
> Details follow.
>
> -Nathan
>
> Repository @ github: https://github.com/hjelmn/ompi-count.git
>
> Relevant commits:
> General support:
> https://github.com/hjelmn/ompi-count/commit/db54d13404a241642fa783d5b3cc74edcb1103f2
> Fortran support:
> https://github.com/hjelmn/ompi-count/commit/293adf84be52c2cd8acfe31be19cfe0afe14752d
> Others:
> https://github.com/hjelmn/ompi-count/commit/6c6ca8e539da675632c249c891ff93fdbc9d8de8
> https://github.com/hjelmn/ompi-count/commit/9638ef1f245f12bb98abbf5f47e1ecfd1a018862
> https://github.com/hjelmn/ompi-count/commit/e158aa152d122e554b89498f5a71284ce1361a99
>
> Add support for MPI_Count type and MPI_COUNT datatype and add the required
> MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x,
> MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x.
> This commit adds only the C bindings. Fortran bindings will be added in
> another commit. For now the MPI_Count type is defined to have the same size
> as MPI_Offset. The type is required to be at least as large as MPI_Offset
> and MPI_Aint. The type was initially intended to be a ssize_t (if it was
> the same size as a long long) but there were issues compiling romio with
> that definition (despite the inclusion of stddef.h).
>
> I updated the datatype engine to use size_t instead of uint32_t to support
> large datatypes. This will require some review to make sure that 1) the
> changes are beneficial, 2) nothing was broken by the change (I doubt
> anything was), and 3) there are no performance regressions due to this
> change.
>
> George, please look over these changes and let me know if you see anything
> wrong with my updates to the datatype engine.
>
> -Nathan
[OMPI devel] RFC: add support for large counts using derived datatypes
What: Add support for the MPI-3 MPI_Count datatype and functions: MPI_Get_elements_x, MPI_Status_set_elements_x, MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x. This will be CMR'd to 1.7.3 if there are no objections.

Why: MPI_Count is required by the MPI 3.0 standard. This will add another checkmark by MPI 3 support.

When: Setting a short timeout of one week (Tues, July 23, 2013). Most of the changes add the new functionality but there are some changes that affect the datatype engine.

Details follow.

-Nathan

Repository @ github: https://github.com/hjelmn/ompi-count.git

Relevant commits:
General support:
https://github.com/hjelmn/ompi-count/commit/db54d13404a241642fa783d5b3cc74edcb1103f2
Fortran support:
https://github.com/hjelmn/ompi-count/commit/293adf84be52c2cd8acfe31be19cfe0afe14752d
Others:
https://github.com/hjelmn/ompi-count/commit/6c6ca8e539da675632c249c891ff93fdbc9d8de8
https://github.com/hjelmn/ompi-count/commit/9638ef1f245f12bb98abbf5f47e1ecfd1a018862
https://github.com/hjelmn/ompi-count/commit/e158aa152d122e554b89498f5a71284ce1361a99

Add support for MPI_Count type and MPI_COUNT datatype and add the required MPI-3 functions MPI_Get_elements_x, MPI_Status_set_elements_x, MPI_Type_get_extent_x, MPI_Type_get_true_extent_x, and MPI_Type_size_x. This commit adds only the C bindings. Fortran bindings will be added in another commit. For now the MPI_Count type is defined to have the same size as MPI_Offset. The type is required to be at least as large as MPI_Offset and MPI_Aint. The type was initially intended to be a ssize_t (if it was the same size as a long long) but there were issues compiling romio with that definition (despite the inclusion of stddef.h).

I updated the datatype engine to use size_t instead of uint32_t to support large datatypes. This will require some review to make sure that 1) the changes are beneficial, 2) nothing was broken by the change (I doubt anything was), and 3) there are no performance regressions due to this change.

George, please look over these changes and let me know if you see anything wrong with my updates to the datatype engine.

-Nathan
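The size requirement stated in the RFC (the count type must be at least as large as both the address-sized and the file-offset types) can be written as compile-time checks. This is a sketch only: `demo_aint_t`, `demo_offset_t`, and `demo_count_t` are hypothetical stand-ins chosen for illustration, and the typedefs below are assumptions, not what configure.ac selects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef ptrdiff_t demo_aint_t;    /* assumption: address-sized, signed */
typedef int64_t   demo_offset_t;  /* assumption: 64-bit file offsets */
typedef int64_t   demo_count_t;   /* same size as the offset type, as in the RFC */

static_assert(sizeof(demo_count_t) >= sizeof(demo_aint_t),
              "count must be able to hold any address difference");
static_assert(sizeof(demo_count_t) >= sizeof(demo_offset_t),
              "count must be able to hold any file offset");
static_assert((demo_count_t) -1 < 0,
              "count must be signed so it can carry a negative sentinel");

/* Smallest size, in bytes, that a count type may have under the two rules. */
size_t demo_required_count_size(void)
{
    size_t a = sizeof(demo_aint_t);
    size_t o = sizeof(demo_offset_t);
    return a > o ? a : o;
}
```

If any of the three conditions fails for a given choice of types, the build stops with the corresponding message rather than producing a library whose count type silently truncates.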