Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-29 Thread Dave Love
Kaiming Ouyang  writes:

> Hi Jeff,
> Thank you for your advice. I will contact the author for some suggestions.
> I also notice that I may be able to port this old library to the new
> openmpi 3.0. I will work on this soon. Thank you.

I haven't used them, but at least the profiling part, and possibly
control, should be covered by plugins at .
(Score-P is the replacement for the vampirtrace instrumentation that was
included with openmpi until recently; I think the vampirtrace plugin
interface is compatible with Score-P's.)


Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-21 Thread Kaiming Ouyang
Hi Jeff,
Thank you for your advice. I will contact the author for some suggestions.
I also notice that I may be able to port this old library to the new
openmpi 3.0. I will work on this soon. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-21 Thread Jeff Squyres (jsquyres)
You might want to take that library author's advice from their README:

-
The source code herein was used as the basis of Rountree ICS 2009.  It
was my first nontrivial MPI tool and was never intended to be released
to the wider world.  I believe it was tied rather tightly to a subset
of a (now) old MPI implementation.  I expect a nontrivial amount of 
work would have to be done to get this to compile and run again, and
that effort would probably be better served starting from scratch 
(using Todd Gamblin's wrap.py PMPI shim generator, for example).
-
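
For reference, the technique wrap.py automates is plain PMPI interposition:
you define an MPI_* symbol yourself and forward to the matching PMPI_*
entry point.  A minimal hand-written sketch -- illustrative only, not taken
from the Adagio sources, and using the MPI-3 signatures you would see with
Open MPI v3.0:

/* shim.c: count MPI_Send calls and report at shutdown. */
#include <mpi.h>
#include <stdio.h>

static long send_count = 0;

/* Our MPI_Send shadows the library's; do the bookkeeping, then
   forward to the real implementation via the PMPI entry point. */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    send_count++;                       /* profiling hook goes here */
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

/* Report the count when the application shuts MPI down. */
int MPI_Finalize(void)
{
    printf("MPI_Send was called %ld times\n", send_count);
    return PMPI_Finalize();
}

Build it with the wrapper compiler (e.g., mpicc -shared -fPIC shim.c -o
libshim.so) and either link it ahead of libmpi or LD_PRELOAD it.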



Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-21 Thread John Hearns via users
Kaiming,  good luck with your project.  I think you should contact Barry
Rountree directly; you will probably get good advice!

It is worth saying that with Turboboost there is variation between each
individual CPU die, even within the same SKU.
What Turboboost does is to set a thermal envelope, and the CPU core(s) ramp
up in frequency till the thermal limit is reached.
So each CPU die is slightly different  (*)
Indeed in my last job we had a benchmarking exercise where the instruction
was to explicitly turn off Turboboost.


(*) As I work at ASML I really should understand this better... I really
should.

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-20 Thread Kaiming Ouyang
Hi John,
Thank you for your advice. But that only concerns its functionality;
right now my problem is that it does not compile against newer openmpi.
The reason may lie in its patch file, since it needs to intercept MPI
calls to profile some data. Newer openmpi versions changed the framework,
so this old software no longer fits.


Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-20 Thread John Hearns via users
"It does not handle more recent improvements such as Intel's turbo
mode and the processor performance inhomogeneity that comes with it."
I guess it is easy enough to disable Turbo mode in the BIOS though.

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-20 Thread Kaiming Ouyang
I think the problem is that it only works with the old framework, because
it intercepts MPI calls to do some profiling. Here is the library:
https://github.com/LLNL/Adagio

I checked the openmpi changelog. openmpi 1.3 began switching to a new
framework, and openmpi 1.4+ has a different one again. This library only
works under openmpi 1.2.
Thank you for your advice; I will try it. My current problem is that this
library tries to patch the mpi.h file, but the patch fails against newer
openmpi versions. I don't know the reason yet, and will check it soon.
Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-20 Thread Jeff Squyres (jsquyres)
On Mar 19, 2018, at 11:32 PM, Kaiming Ouyang  wrote:
> 
> Thank you.
> I am using the newest version of HPL.
> I forgot to say I can run HPL with openmpi-3.0 over infiniband. The reason I
> want to use the old version is that I need to compile a library that only
> supports old openmpi, so I am trying to do this tricky job.

Gotcha.

Is there something in particular about the old library that requires Open MPI 
v1.2.x?

More specifically: is there a particular error you get when you try to use Open 
MPI v3.0.0 with that library?

I ask because if the app supports the MPI API in Open MPI v1.2.9, then it also 
supports the MPI API in Open MPI v3.0.0.  We *have* changed lots of other 
things under the covers in that time, such as:

- how those MPI APIs are implemented
- mpirun (and friends) command line parameters
- MCA parameters
- compilation flags

But many of those things might actually be mostly -- if not entirely -- hidden 
from a library that uses MPI.

My point: it may be easier to get your library to use a newer version of Open 
MPI than you think.  For example, if the library has some hard-coded flags in 
their configure/Makefile to build with Open MPI, just replace those flags with 
`mpicc --showme:BLAH` variants (see `mpicc --showme:help` for a full listing).  
This will have Open MPI tell you exactly what flags it needs to compile, link, 
etc.
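
For example, a sketch (the exact flags and paths depend on your
installation):

  mpicc --showme:compile     # flags needed to compile MPI code
  mpicc --showme:link        # flags needed to link against Open MPI
  mpicc --showme             # the full underlying compiler command line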

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-19 Thread Kaiming Ouyang
Thank you.
I am using the newest version of HPL.
I forgot to say I can run HPL with openmpi-3.0 over infiniband. The reason
I want to use the old version is that I need to compile a library that
only supports old openmpi, so I am trying to do this tricky job. Anyway,
thank you for your reply Jeff; have a good day.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-19 Thread Jeff Squyres (jsquyres)
I'm sorry; I can't help debug a version from 9 years ago.  The best suggestion 
I have is to use a modern version of Open MPI.

Note, however, your use of "--mca btl ..." is going to have the same meaning 
for all versions of Open MPI.  The problem you showed in the first mail was 
with the shared memory transport.  Using "--mca btl tcp,self" means you're not 
using the shared memory transport.  If you don't specify "--mca btl tcp,self", 
Open MPI will automatically use the shared memory transport.  Hence, you could 
be running into the same (or similar/related) problem that you mentioned in the 
first mail -- i.e., something is going wrong with how the v1.2.9 shared memory 
transport is interacting with your system.
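
For example (illustrative invocations, reusing your hostfile): to keep the
shared-memory transport explicitly in the mix,

  mpirun --mca btl sm,tcp,self -np 144 --hostfile /root/research/hostfile ./xhpl

or, to exclude only the shared-memory component (the leading caret negates
the component list, assuming your build accepts that syntax):

  mpirun --mca btl ^sm -np 144 --hostfile /root/research/hostfile ./xhpl

If the first form also dies in mca_btl_sm_component_progress, that points
squarely at the v1.2.9 sm transport.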

Likewise, "--mca btl_tcp_if_include ib0" tells the TCP BTL plugin to use
the "ib0" network.  But if you have the openib BTL available (i.e., the
IB-native plugin), it will be used instead of the TCP BTL, because native
verbs over IB perform much better than TCP over IB.  Meaning: if you
specify btl_tcp_if_include without specifying "--mca btl tcp,self", then
(assuming openib is available) the TCP BTL likely isn't used and the
btl_tcp_if_include value is therefore ignored.

Also, what version of Linpack are you using?  The error you show is
usually indicative of an MPI application bug (the MPI_COMM_SPLIT error).
If you're running an old version of xhpl, you should upgrade to the latest.

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-19 Thread Kaiming Ouyang
Hi Jeff,
Thank you for your reply. I just changed to another cluster which does not
have infiniband. I ran the HPL by:
mpirun *--mca btl tcp,self* -np 144 --hostfile /root/research/hostfile
./xhpl

It ran successfully, but if I delete "--mca btl tcp,self", it cannot run
again, so I suspect openmpi 1.2 cannot identify the proper network
interface and set the correct parameters.
Then I went back to the previous cluster with infiniband and typed the
same command; it gets stuck forever.

I changed the command to:
mpirun *--mca btl_tcp_if_include ib0* --hostfile /root/research/hostfile-ib
-np 48 ./xhpl

It launches successfully, but gives me the following errors when HPL tries
to split the communicator:

[node1.novalocal:09562] *** An error occurred in MPI_Comm_split
[node1.novalocal:09562] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09562] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09562] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09583] *** An error occurred in MPI_Comm_split
[node1.novalocal:09583] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09583] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09583] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09637] *** An error occurred in MPI_Comm_split
[node1.novalocal:09637] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09637] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09637] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1.novalocal:09994] *** An error occurred in MPI_Comm_split
[node1.novalocal:09994] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[node1.novalocal:09994] *** MPI_ERR_IN_STATUS: error code in status
[node1.novalocal:09994] *** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 46005 on node test-ib exited on
signal 15 (Terminated).

Hope you can give me some suggestions. Thank you.

Kaiming Ouyang, Research Assistant.
Department of Computer Science and Engineering
University of California, Riverside
900 University Avenue, Riverside, CA 92521

Re: [OMPI users] Old version openmpi 1.2 support infiniband?

2018-03-19 Thread Jeff Squyres (jsquyres)
That's actually failing in a shared memory section of the code.

But to answer your question, yes, Open MPI 1.2 did have IB support.

That being said, I have no idea what would cause this shared memory segv -- 
it's quite possible that it's simple bit rot (i.e., v1.2.9 was released 9 years 
ago -- see https://www.open-mpi.org/software/ompi/versions/timeline.php.  
Perhaps it does not function correctly on modern glibc/Linux kernel-based 
platforms).

Can you upgrade to a [much] newer Open MPI?
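
(To check whether your v1.2.9 build actually contains IB-native support,
you can list its plugins -- e.g.:

  ompi_info | grep btl

and look for an openib component in the output.)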



> On Mar 19, 2018, at 8:29 PM, Kaiming Ouyang  wrote:
> 
> Hi everyone,
> Recently I need to compile the High-Performance Linpack code with openmpi
> 1.2 (a little bit old). When I finish compilation and try to run, I get
> the following errors:
> 
> [test:32058] *** Process received signal ***
> [test:32058] Signal: Segmentation fault (11)
> [test:32058] Signal code: Address not mapped (1)
> [test:32058] Failing at address: 0x14a2b84b6304
> [test:32058] [ 0] /lib64/libpthread.so.0(+0xf5e0) [0x14eb116295e0]
> [test:32058] [ 1] 
> /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x28a)
>  [0x14eaa81258aa]
> [test:32058] [ 2] 
> /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x2b)
>  [0x14eaa853219b]
> [test:32058] [ 3] 
> /root/research/lib/openmpi-1.2.9/lib/libopen-pal.so.0(opal_progress+0x4a) 
> [0x14eb128dbaaa]
> [test:32058] [ 4] 
> /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x1d)
>  [0x14eaf41e6b4d]
> [test:32058] [ 5] 
> /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x3a5)
>  [0x14eaf41eac45]
> [test:32058] [ 6] 
> /root/research/lib/openmpi-1.2.9/lib/libopen-rte.so.0(mca_oob_recv_packed+0x33)
>  [0x14eb12b62223]
> [test:32058] [ 7] 
> /root/research/lib/openmpi-1.2.9/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x1f9)
>  [0x14eaf3dd7db9]
> [test:32058] [ 8] 
> /root/research/lib/openmpi-1.2.9/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x31d)
>  [0x14eb12b7893d]
> [test:32058] [ 9] 
> /root/research/lib/openmpi-1.2.9/lib/libmpi.so.0(ompi_mpi_init+0x8d6) 
> [0x14eb13202136]
> [test:32058] [10] 
> /root/research/lib/openmpi-1.2.9/lib/libmpi.so.0(MPI_Init+0x6a) 
> [0x14eb1322461a]
> [test:32058] [11] ./xhpl(main+0x5d) [0x404e7d]
> [test:32058] [12] /lib64/libc.so.6(__libc_start_main+0xf5) [0x14eb11278c05]
> [test:32058] [13] ./xhpl() [0x4056cb]
> [test:32058] *** End of error message ***
> mpirun noticed that job rank 0 with PID 31481 on node test.novalocal exited 
> on signal 15 (Terminated). 
> 23 additional processes aborted (not shown)
> 
> The machine has infiniband, so I wonder whether openmpi 1.2 supports
> infiniband by default. I also tried running it not over infiniband, but the
> program can only handle small input sizes. When I increase the input size
> and grid size, it just gets stuck. The program I run is a benchmark, so I
> don't think there is a problem in the code. Any idea? Thanks.
> 


-- 
Jeff Squyres
jsquy...@cisco.com
