[hwloc-devel] Create success (hwloc git 1.10.0-8-g85101b7)

2014-10-28 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc 1.10.0-8-g85101b7
Start time: Tue Oct 28 21:02:49 EDT 2014
End time:   Tue Oct 28 21:04:17 EDT 2014

Your friendly daemon,
Cyrador


[hwloc-devel] Create success (hwloc git dev-253-g12f229d)

2014-10-28 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc dev-253-g12f229d
Start time: Tue Oct 28 21:01:01 EDT 2014
End time:   Tue Oct 28 21:02:34 EDT 2014

Your friendly daemon,
Cyrador


[OMPI devel] OMPI collectives

2014-10-28 Thread Ralph Castain
Hey folks

I’ve got someone asking about any documentation and/or papers out there that 
might summarize the algorithms used in the bcast, allreduce, barrier, and 
alltoall collectives - and might describe the analytical cost of each algo 
(i.e., #steps, etc).

The papers on our web site are getting pretty old. I know folks have been 
researching and publishing about these things.

Anybody have something we could pass along and/or add to the web site?
Ralph



Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Paul,

Yes, that is the minor problem I was referring to.  It does in fact
reflect the age of CLE 4.  Cray PMI 5 and higher is newer
software that probably should never have been installed on
CLE 4, since the ALPS packaging changed completely between
CLE 4 and 5.

Howard


2014-10-28 14:57 GMT-06:00 Paul Hargrove :

> By Howard's definition I guess NERSC's Hopper (XE6) qualifies as "very
> old" at PrgEnv 4.2.34
>
> {hargrove@hopper06 ~}$ pkg-config --cflags cray-pmi
> Package cray-alpslli was not found in the pkg-config search path.
> Perhaps you should add the directory containing `cray-alpslli.pc'
> to the PKG_CONFIG_PATH environment variable
> Package 'cray-alpslli', required by 'cray-pmi', not found
>
> -Paul
>
>
> On Tue, Oct 28, 2014 at 1:05 PM, Howard Pritchard 
> wrote:
>
>> Hi Folks,
>>
>> The simplest and best way on Cray is to use the pkg-config command.
>> No looking for odd header file names, etc.  There is a minor issue
>> with external login nodes running a very old CLE (like 4.X) that one has
>> to work around, but otherwise it works well.
>>
>> pkg-config --cflags cray-pmi
>>
>> etc. etc.
>>
>> The pc files for the various Cray software packages are supposed to include
>> all dependencies on header files, libs, etc. from other Cray packages.
>>
>> Howard
>>
>>
>>
>>
>> 2014-10-28 13:20 GMT-06:00 Ralph Castain :
>>
>>>
>>> On Oct 28, 2014, at 12:17 PM, Paul Hargrove  wrote:
>>>
>>> Ralph,
>>>
>>> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as
>>> well).
>>>
>>>
>>> I understand that - I was questioning if that is universally true or
>>> not. IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h
>>> to pmi.h, THEN your check will be fine. Otherwise, we can’t trust it.
>>>
>>> And I seem to recall that the earlier Crays, at least, didn’t have this
>>> naming distinction - or at least, not at LANL. Hence my question.
>>>
>>>
>>> That is why I said our configure logic checks for pmi_cray.h *first*.
>>> Sorry if that wasn't clear.
>>>
>>> On NERSC's XE6:
>>>
>>> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
>>> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
>>> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>>> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
>>>
>>>
>>> On NERSC's XC30:
>>>
>>> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
>>> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
>>> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>>> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
>>>
>>>
>>> -Paul
>>>
>>> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain 
>>> wrote:
>>>

>>>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove  wrote:
>>>>
>>>>
>>>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard  wrote:
>>>>
>>>>>
>>>>>> We may no longer require those as you have separated the Cray check
>>>>>> out, but the original problem is that we would pick up the Slurm components
>>>>>> on the Cray because we would find pmi.h
>>>>>>
>>>>>> Oh,  I forgot about that .
>>>>>
>>>>
>>>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>>>>
>>>>
>>>> Hmmm…on LANL’s Cray systems, it was still labeled “pmi.h”
>>>>
>>>> So far that has been sufficient to disambiguate the implementations.
>>>> One might also try checking libpmi for Cray's extensions.
>>>>
>>>> -Paul
>>>>
>>>>
>>>> --
>>>> Paul H. Hargrove  phhargr...@lbl.gov
>>>> Future Technologies Group
>>>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>>>  ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2014/10/16114.php
>>>>
>>>>
>>>>
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2014/10/16115.php
>>>>
>>>
>>>
>>>
>>> --
>>> Paul H. Hargrove  phhargr...@lbl.gov
>>> Future Technologies Group
>>> Computer and Data Sciences Department Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>>  ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/10/16116.php
>>>
>>>
>>>

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
By Howard's definition I guess NERSC's Hopper (XE6) qualifies as "very old"
at PrgEnv 4.2.34

{hargrove@hopper06 ~}$ pkg-config --cflags cray-pmi
Package cray-alpslli was not found in the pkg-config search path.
Perhaps you should add the directory containing `cray-alpslli.pc'
to the PKG_CONFIG_PATH environment variable
Package 'cray-alpslli', required by 'cray-pmi', not found

-Paul
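
The failure above suggests that a pkg-config query needs a fallback on older systems. A minimal sketch of that idea in plain shell follows; it is not Open MPI's configure logic. The function name `pmi_cflags` and the `PKG_CONFIG_CMD` variable are invented for this illustration, and the fallback simply looks for pmi.h in a directory the caller supplies.

```shell
#!/bin/sh
# Sketch only: prefer pkg-config, fall back to a plain -I flag when the query
# fails (for example because a required file such as cray-alpslli.pc is missing).
# PKG_CONFIG_CMD and pmi_cflags are illustrative names, not existing interfaces.
PKG_CONFIG_CMD=${PKG_CONFIG_CMD:-pkg-config}

pmi_cflags() {
    fallback_incdir=$1
    if flags=$("$PKG_CONFIG_CMD" --cflags cray-pmi 2>/dev/null); then
        # pkg-config resolved cray-pmi and all of its dependencies
        echo "$flags"
    elif [ -f "$fallback_incdir/pmi.h" ]; then
        # pkg-config failed or is absent; use the directory given by the caller
        echo "-I$fallback_incdir"
    else
        return 1
    fi
}

# Example call with the include directory mentioned in this thread.
pmi_cflags /opt/cray/pmi/default/include || echo "PMI not found" >&2
```

The fallback directory is passed in rather than hard-coded because, as the thread notes, header locations and names are not guaranteed to be the same on every Cray.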


On Tue, Oct 28, 2014 at 1:05 PM, Howard Pritchard 
wrote:

> Hi Folks,
>
> The simplest and best way on Cray is to use the pkg-config command.
> No looking for odd header file names, etc.  There is a minor issue
> with external login nodes running a very old CLE (like 4.X) that one has
> to work around, but otherwise it works well.
>
> pkg-config --cflags cray-pmi
>
> etc. etc.
>
> The pc files for the various Cray software packages are supposed to include
> all dependencies on header files, libs, etc. from other Cray packages.
>
> Howard
>
>
>
>
> 2014-10-28 13:20 GMT-06:00 Ralph Castain :
>
>>
>> On Oct 28, 2014, at 12:17 PM, Paul Hargrove  wrote:
>>
>> Ralph,
>>
>> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
>>
>>
>> I understand that - I was questioning if that is universally true or not.
>> IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h to
>> pmi.h, THEN your check will be fine. Otherwise, we can't trust it.
>>
>> And I seem to recall that the earlier Crays, at least, didn't have this
>> naming distinction - or at least, not at LANL. Hence my question.
>>
>>
>> That is why I said our configure logic checks for pmi_cray.h *first*.
>> Sorry if that wasn't clear.
>>
>> On NERSC's XE6:
>>
>> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
>> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
>> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
>>
>>
>> On NERSC's XC30:
>>
>> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
>> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
>> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
>>
>>
>> -Paul
>>
>> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain  wrote:
>>
>>>
>>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove  wrote:
>>>
>>>
>>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard 
>>> wrote:
>>>

>>>>> We may no longer require those as you have separated the Cray check
>>>>> out, but the original problem is that we would pick up the Slurm components
>>>>> on the Cray because we would find pmi.h
>>>>>
>>>>> Oh,  I forgot about that .

>>>
>>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>>>
>>>
>>> Hmmm...on LANL's Cray systems, it was still labeled "pmi.h"
>>>
>>> So far that has been sufficient to disambiguate the implementations.
>>> One might also try checking libpmi for Cray's extensions.
>>>
>>> -Paul
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/10/16117.php
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16118.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Ralph,

Oh, on the Cray you don't need to specify --with-pmi, except to say that
you either want a particular directory (for instance, if you wanted to try
your luck with s2 on a Cray-nativized Slurm) or want --with-pmi=no.

Howard


2014-10-28 14:14 GMT-06:00 Ralph Castain :

>
> On Oct 28, 2014, at 1:05 PM, Howard Pritchard  wrote:
>
> Hi Folks,
>
> The simplest and best way on Cray is to use the pkg-config command.
> No looking for odd header file names, etc.  There is a minor issue
> with external login nodes running a very old CLE (like 4.X) that one has
> to work around, but otherwise it works well.
>
> pkg-config --cflags cray-pmi
>
> etc. etc.
>
> The pc files for the various Cray software packages are supposed to include
> all dependencies on header files, libs, etc. from other Cray packages.
>
>
> Given that difference, perhaps you should just add a --with flag that
> specifically addresses Cray? We can’t use that method for Slurm or any
> other PMI-based system, and I’m not sure how you can generalize things
> adequately.
>
>
> Howard
>
>
>
>
> 2014-10-28 13:20 GMT-06:00 Ralph Castain :
>
>>
>> On Oct 28, 2014, at 12:17 PM, Paul Hargrove  wrote:
>>
>> Ralph,
>>
>> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
>>
>>
>> I understand that - I was questioning if that is universally true or not.
>> IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h to
>> pmi.h, THEN your check will be fine. Otherwise, we can’t trust it.
>>
>> And I seem to recall that the earlier Crays, at least, didn’t have this
>> naming distinction - or at least, not at LANL. Hence my question.
>>
>>
>> That is why I said our configure logic checks for pmi_cray.h *first*.
>> Sorry if that wasn't clear.
>>
>> On NERSC's XE6:
>>
>> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
>> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
>> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
>>
>>
>> On NERSC's XC30:
>>
>> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
>> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
>> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
>>
>>
>> -Paul
>>
>> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain  wrote:
>>
>>>
>>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove  wrote:
>>>
>>>
>>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard 
>>> wrote:
>>>

>>>>> We may no longer require those as you have separated the Cray check
>>>>> out, but the original problem is that we would pick up the Slurm components
>>>>> on the Cray because we would find pmi.h
>>>>>
>>>>> Oh,  I forgot about that .
>>>>
>>>
>>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>>>
>>>
>>> Hmmm…on LANL’s Cray systems, it was still labeled “pmi.h”
>>>
>>> So far that has been sufficient to disambiguate the implementations.
>>> One might also try checking libpmi for Cray's extensions.
>>>
>>> -Paul
>>>
>>>
>>>
>>
>>
>>
>
>
>

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Ralph,

I think I found the problem.  Thanks.

Howard


2014-10-28 12:58 GMT-06:00 Ralph Castain :

>
> On Oct 28, 2014, at 11:53 AM, Howard Pritchard 
> wrote:
>
> Hi Ralph,
>
>
> 2014-10-28 12:26 GMT-06:00 Ralph Castain :
>
>>
>> > On Oct 28, 2014, at 11:16 AM, Howard Pritchard 
>> wrote:
>> >
>> > Hi Folks,
>> >
>> > I'm trying to figure out what broke for pmi configure since now the
>> > pmix/cray component doesn't compile any longer in master.
>>
>> Ouch - sorry about that. I thought the Cray component strictly used the
>> new Cray PMI check (which I didn’t touch) - isn’t that true?
>>
> That is correct.  Not clear which changes are causing the problem.
>
>
> Ah crud - you do indeed use the PMI code:
>
> OPAL_CHECK_PMI([CRAY_PMI], [opal_check_cray_pmi_happy="yes"],
>
>
> I’m afraid I did break you :-(
>
> Want me to investigate the fix?
>
>
>> >
>> > I happened to look in the s1 and s2 configure.m4's and noticed an
>> > AC_REQUIRE for OPAL_CHECK_UGNI.  This doesn't make sense to me.  Maybe
>> > these were accidentally copied from the configure.m4 for the Cray PMI?
>>
>> We may no longer require those as you have separated the Cray check out,
>> but the original problem is that we would pick up the Slurm components on
>> the Cray because we would find pmi.h
>>
>> Oh,  I forgot about that .
>
>
> Yeah, I’m afraid we do have to retain them because the Cray code does use
> --with-pmi and therefore overlaps the Slurm check.
>
>
> >
>> > Howard
>> >
>> >
>> >
>> >
>> > ___
>> > devel mailing list
>> > de...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> > Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/10/16110.php
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/10/16111.php
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16112.php
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16113.php
>


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Ralph Castain

> On Oct 28, 2014, at 1:05 PM, Howard Pritchard  wrote:
> 
> Hi Folks,
> 
> The simplest and best way on Cray is to use the pkg-config command.
> No looking for odd header file names, etc.  There is a minor issue
> with external login nodes running a very old CLE (like 4.X) that one has
> to work around, but otherwise it works well.
> 
> pkg-config --cflags cray-pmi
> 
> etc. etc.
> 
> The pc files for the various Cray software packages are supposed to include
> all dependencies on header files, libs, etc. from other Cray packages.

Given that difference, perhaps you should just add a --with flag that 
specifically addresses Cray? We can’t use that method for Slurm or any other 
PMI-based system, and I’m not sure how you can generalize things adequately.

> 
> Howard
> 
> 
> 
> 
> 2014-10-28 13:20 GMT-06:00 Ralph Castain :
> 
>> On Oct 28, 2014, at 12:17 PM, Paul Hargrove wrote:
>> 
>> Ralph,
>> 
>> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
> 
> I understand that - I was questioning if that is universally true or not. IF 
> we are guaranteed that nobody with a Cray ever renames pmi_cray.h to pmi.h, 
> THEN your check will be fine. Otherwise, we can’t trust it.
> 
> And I seem to recall that the earlier Crays, at least, didn’t have this 
> naming distinction - or at least, not at LANL. Hence my question.
> 
> 
>> That is why I said our configure logic checks for pmi_cray.h *first*.
>> Sorry if that wasn't clear.
>> 
>> On NERSC's XE6:
>> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
>> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
>> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
>> 
>> On NERSC's XC30:
>> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
>> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
>> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
>> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
>> 
>> -Paul
>> 
>> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain wrote:
>> 
>>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove wrote:
>>> 
>>> 
>>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard wrote:
>>> 
>>> We may no longer require those as you have separated the Cray check out, 
>>> but the original problem is that we would pick up the Slurm components on 
>>> the Cray because we would find pmi.h
>>> 
>>> Oh,  I forgot about that . 
>>> 
>>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>> 
>> Hmmm…on LANL’s Cray systems, it was still labeled “pmi.h”
>> 
>>> So far that has been sufficient to disambiguate the implementations.
>>> One might also try checking libpmi for Cray's extensions.
>>> 
>>> -Paul
>>> 
>>> 
>> 
> 

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 12:20 PM, Ralph Castain  wrote:

> On Oct 28, 2014, at 12:17 PM, Paul Hargrove  wrote:
>
> Ralph,
>
> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
>
>
> I understand that - I was questioning if that is universally true or not.
> IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h to
> pmi.h, THEN your check will be fine. Otherwise, we can't trust it.
>
> And I seem to recall that the earlier Crays, at least, didn't have this
> naming distinction - or at least, not at LANL. Hence my question.
>

Fair enough.
I would say anybody moving or renaming files provided by Cray gets what
they deserve. However, since I have no way to confirm older or future
systems, I cannot answer your question with an affirmative.

What about checking for the presence of pmi_cray_ext.h?
Is that any better?

So, if one is not going to trust ANY filenames, one might instead see if
pmi.h and libpmi.* provide Cray's extensions.  If there are Cray extensions
used by OPAL/ORTE/OMPI, then checking for those would be "the right way"
anyway.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
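
Checking for Cray's extensions rather than trusting file names could be approximated with a simple text search over the headers. The sketch below is only an illustration of that suggestion: the function name `headers_declare` is invented here, and the symbol passed in the example is a placeholder, not a real Cray PMI function name.

```shell
#!/bin/sh
# Sketch only: succeed if any header in the include directory mentions the
# given symbol. The symbol is supplied by the caller; no particular Cray
# extension name is assumed by this helper.
headers_declare() {
    incdir=$1
    symbol=$2
    # -q: quiet, -s: suppress errors for missing files, -F: fixed string
    grep -qsF -- "$symbol" "$incdir"/*.h
}

# Example call; PMI_Some_cray_extension is a placeholder symbol name.
if headers_declare /opt/cray/pmi/default/include PMI_Some_cray_extension; then
    echo "Cray extensions present"
else
    echo "no Cray extensions found"
fi
```

A text search is a weak test compared with compiling and linking a probe program; in an actual configure.m4, Autoconf macros such as AC_CHECK_DECLS or AC_CHECK_LIB would be the more usual way to ask the same question of pmi.h and libpmi.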


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Folks,

The simplest and best way on Cray is to use the pkg-config command.
No looking for odd header file names, etc.  There is a minor issue
with external login nodes running a very old CLE (like 4.X) that one has
to work around, but otherwise it works well.

pkg-config --cflags cray-pmi

etc. etc.

The pc files for the various Cray software packages are supposed to include
all dependencies on header files, libs, etc. from other Cray packages.

Howard




2014-10-28 13:20 GMT-06:00 Ralph Castain :

>
> On Oct 28, 2014, at 12:17 PM, Paul Hargrove  wrote:
>
> Ralph,
>
> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
>
>
> I understand that - I was questioning if that is universally true or not.
> IF we are guaranteed that nobody with a Cray ever renames pmi_cray.h to
> pmi.h, THEN your check will be fine. Otherwise, we can’t trust it.
>
> And I seem to recall that the earlier Crays, at least, didn’t have this
> naming distinction - or at least, not at LANL. Hence my question.
>
>
> That is why I said our configure logic checks for pmi_cray.h *first*.
> Sorry if that wasn't clear.
>
> On NERSC's XE6:
>
> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
>
>
> On NERSC's XC30:
>
> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
>
>
> -Paul
>
> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain  wrote:
>
>>
>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove  wrote:
>>
>>
>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard 
>> wrote:
>>
>>>
>>>> We may no longer require those as you have separated the Cray check
>>>> out, but the original problem is that we would pick up the Slurm components
>>>> on the Cray because we would find pmi.h
>>>>
>>>> Oh,  I forgot about that .
>>>
>>
>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>>
>>
>> Hmmm…on LANL’s Cray systems, it was still labeled “pmi.h”
>>
>> So far that has been sufficient to disambiguate the implementations.
>> One might also try checking libpmi for Cray's extensions.
>>
>> -Paul
>>
>>
>>
>
>
>
>


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Ralph Castain

> On Oct 28, 2014, at 12:17 PM, Paul Hargrove  wrote:
> 
> Ralph,
> 
> The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).

I understand that - I was questioning if that is universally true or not. IF we 
are guaranteed that nobody with a Cray ever renames pmi_cray.h to pmi.h, THEN 
your check will be fine. Otherwise, we can’t trust it.

And I seem to recall that the earlier Crays, at least, didn’t have this naming 
distinction - or at least, not at LANL. Hence my question.


> That is why I said our configure logic checks for pmi_cray.h *first*.
> Sorry if that wasn't clear.
> 
> On NERSC's XE6:
> {hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
> pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
> {hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
> cray-libpmi-devel-4.0.1-1..9753.86.3.gem
> 
> On NERSC's XC30:
> {hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
> pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
> {hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
> cray-libpmi-devel-5.0.5-1..10300.134.8.ari
> 
> -Paul
> 
> On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain wrote:
> 
>> On Oct 28, 2014, at 11:59 AM, Paul Hargrove wrote:
>> 
>> 
>> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard wrote:
>> 
>> We may no longer require those as you have separated the Cray check out, but 
>> the original problem is that we would pick up the Slurm components on the 
>> Cray because we would find pmi.h
>> 
>> Oh,  I forgot about that . 
>> 
>> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
> 
> Hmmm…on LANL’s Cray systems, it was still labeled “pmi.h”
> 
>> So far that has been sufficient to disambiguate the implementations.
>> One might also try checking libpmi for Cray's extensions.
>> 
>> -Paul
>> 
>> 



Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
Ralph,

The Crays at NERSC have *both* pmi_cray.h and pmi.h (and pmi2.h as well).
That is why I said our configure logic checks for pmi_cray.h *first*.
Sorry if that wasn't clear.

On NERSC's XE6:

{hargrove@hopper06 ~}$ ls /opt/cray/pmi/default/include/
pmi2.h  pmi_cray_ext.h  pmi_cray.h  pmi.h  pmi_version.h
{hargrove@hopper06 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
cray-libpmi-devel-4.0.1-1..9753.86.3.gem


On NERSC's XC30:

{hargrove@edison08 ~}$ ls /opt/cray/pmi/default/include/
pmi.h  pmi2.h  pmi_cray.h  pmi_cray_ext.h  pmi_version.h
{hargrove@edison08 ~}$ rpm -qf /opt/cray/pmi/default/include/pmi_cray.h
cray-libpmi-devel-5.0.5-1..10300.134.8.ari


-Paul

On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain  wrote:

>
> On Oct 28, 2014, at 11:59 AM, Paul Hargrove  wrote:
>
>
> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard 
> wrote:
>
>>
>>> We may no longer require those as you have separated the Cray check out,
>>> but the original problem is that we would pick up the Slurm components on
>>> the Cray because we would find pmi.h
>>>
>>> Oh,  I forgot about that .
>>
>
> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
>
>
> Hmmm...on LANL's Cray systems, it was still labeled "pmi.h"
>
> So far that has been sufficient to disambiguate the implementations.
> One might also try checking libpmi for Cray's extensions.
>
> -Paul
>
>
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Ralph Castain

> On Oct 28, 2014, at 11:59 AM, Paul Hargrove  wrote:
> 
> 
> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard  > wrote:
> 
> We may no longer require those as you have separated the Cray check out, but 
> the original problem is that we would pickup the Slurm components on the Cray 
> because we would find pmi.h
> 
> Oh,  I forgot about that . 
> 
> In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".

Hmmm…on LANL’s Cray systems, it was still labeled “pmi.h”

> So far that has been sufficient to disambiguate the implementations.
> One might also try checking libpmi for Cray's extensions.
> 
> -Paul
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov 
> 
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16114.php



Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Paul Hargrove
On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard 
wrote:

>
>> We may no longer require those as you have separated the Cray check out,
>> but the original problem is that we would pickup the Slurm components on
>> the Cray because we would find pmi.h
>>
>> Oh,  I forgot about that .
>

In GASNet's configure logic we look for "pmi_cray.h" before "pmi.h".
So far that has been sufficient to disambiguate the implementations.
One might also try checking libpmi for Cray's extensions.
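
The header ordering described above can be sketched roughly as follows. This is an illustrative Python sketch, not GASNet's actual configure logic: the function name `detect_pmi_flavor` and the labels "cray" and "generic" are made up for the example. It assumes only what is stated above, namely that probing for pmi_cray.h before pmi.h tells the implementations apart.

```python
from pathlib import Path
from typing import Optional


def detect_pmi_flavor(include_dir: str) -> Optional[str]:
    """Classify a PMI installation by the headers found in include_dir.

    Returns "cray", "generic", or None (hypothetical labels).
    """
    inc = Path(include_dir)
    # Probe the Cray-specific header first: a Cray install also ships pmi.h,
    # so testing pmi.h first would misclassify it as a generic (Slurm) PMI.
    if (inc / "pmi_cray.h").is_file():
        return "cray"
    if (inc / "pmi.h").is_file():
        return "generic"
    return None
```

A header probe like this is only a first filter; checking libpmi for Cray's extensions would be a separate, stronger test.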

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Ralph Castain

> On Oct 28, 2014, at 11:53 AM, Howard Pritchard  wrote:
> 
> Hi Ralph,
> 
> 
> 2014-10-28 12:26 GMT-06:00 Ralph Castain  >:
> 
> > On Oct 28, 2014, at 11:16 AM, Howard Pritchard  > > wrote:
> >
> > Hi Folks,
> >
> > I'm trying to figure out what broke for pmi configure since now the 
> > pmix/cray component
> > doesn't compile any longer in master.
> 
> Ouch - sorry about that. I thought the Cray component strictly used the new 
> Cray PMI check (which I didn’t touch) - isn’t that true?
> That is correct.  Not clear which changes are causing the problem. 

Ah crud - you do indeed use the PMI code:

OPAL_CHECK_PMI([CRAY_PMI], [opal_check_cray_pmi_happy="yes"],


I’m afraid I did break you :-(

Want me to investigate the fix?

> 
> >
> > I was happening to look in the s1 and s2 configure.m4's and noticed a 
> > AC_REQUIRE
> > for OPAL_CHECK_UGNI.  This doesn't make sense to me.  Maybe these were
> > accidentally copied from the configure.m4 for the cray pmi?
> 
> We may no longer require those as you have separated the Cray check out, but 
> the original problem is that we would pickup the Slurm components on the Cray 
> because we would find pmi.h
> 
> Oh,  I forgot about that . 

Yeah, I’m afraid we do have to retain them because the Cray code does use 
--with-pmi and therefore overlaps the Slurm check.


> >
> > Howard
> >
> >
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org 
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> > 
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/devel/2014/10/16110.php 
> > 
> 
> ___
> devel mailing list
> de...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16111.php 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16112.php



Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Ralph,


2014-10-28 12:26 GMT-06:00 Ralph Castain :

>
> > On Oct 28, 2014, at 11:16 AM, Howard Pritchard 
> wrote:
> >
> > Hi Folks,
> >
> > I'm trying to figure out what broke for pmi configure since now the
> pmix/cray component
> > doesn't compile any longer in master.
>
> Ouch - sorry about that. I thought the Cray component strictly used the
> new Cray PMI check (which I didn’t touch) - isn’t that true?
>
That is correct.  Not clear which changes are causing the problem.

>
> >
> > I was happening to look in the s1 and s2 configure.m4's and noticed a
> AC_REQUIRE
> > for OPAL_CHECK_UGNI.  This doesn't make sense to me.  Maybe these were
> > accidentally copied from the configure.m4 for the cray pmi?
>
> We may no longer require those as you have separated the Cray check out,
> but the original problem is that we would pickup the Slurm components on
> the Cray because we would find pmi.h
>
> Oh,  I forgot about that .

>
> > Howard
> >
> >
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16110.php
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/10/16111.php


Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Ralph Castain

> On Oct 28, 2014, at 11:16 AM, Howard Pritchard  wrote:
> 
> Hi Folks,
> 
> I'm trying to figure out what broke for pmi configure since now the pmix/cray 
> component
> doesn't compile any longer in master.

Ouch - sorry about that. I thought the Cray component strictly used the new 
Cray PMI check (which I didn’t touch) - isn’t that true?

> 
> I was happening to look in the s1 and s2 configure.m4's and noticed a 
> AC_REQUIRE
> for OPAL_CHECK_UGNI.  This doesn't make sense to me.  Maybe these were
> accidentally copied from the configure.m4 for the cray pmi? 

We may no longer require those now that you have separated out the Cray check, but 
the original problem was that we would pick up the Slurm components on the Cray 
because we would find pmi.h.

> 
> Howard
> 
> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16110.php



[OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Folks,

I'm trying to figure out what broke the PMI configure logic, since the
pmix/cray component no longer compiles in master.

I happened to look in the s1 and s2 configure.m4 files and noticed an
AC_REQUIRE for OPAL_CHECK_UGNI.  This doesn't make sense to me.  Maybe these
were accidentally copied from the configure.m4 for the Cray PMI?

Howard


Re: [OMPI devel] [mpich-discuss] ROMIO+Lustre problems in OpenMPI 1.8.3

2014-10-28 Thread Rob Latham



On 10/28/2014 06:00 AM, Paul Kapinos wrote:

Dear Open MPI and ROMIO developer,

We use Open MPI v1.6.x and v1.8.x in our cluster.
We have a Lustre file system and wish to use MPI_IO,
so our Open MPI builds are configured with this flag:
 > --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'

In our newest installation openmpi/1.8.3 we found that MPI_IO is *broken*.

A short search for the root cause brought the following to light:

- the ROMIO component 'MCA io: romio' is missing entirely from the affected
version, because

- configure of ROMIO has *failed* (cf. logs a, b, c),
- because lustre_user.h was found but could not be compiled.


lustre_user.h cannot be compiled because quota defines won't compile. 
Ugh, what a mess.


A while back I noticed this and fixed it by removing an XOPEN_SOURCE 
feature test macro:


http://trac.mpich.org/projects/mpich/ticket/1973

Then, on solaris with --enable-strict we needed to put *back* the 
XOPEN_SOURCE macro or else pread and pwrite would be undefined.


So what I really need to do is delete XOPEN_SOURCE, since it causes such 
headaches. On the rare platforms that only define pread/pwrite if you take 
extraordinary measures, if at all, I'll have a ROMIO pread 
and pwrite that simply do seek + write (or read).
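
The seek + read/write fallback mentioned above could look roughly like the sketch below. It is written in Python purely for illustration and is not ROMIO code (ROMIO is written in C). The names `pread_fallback` and `pwrite_fallback` are hypothetical, and restoring the file offset afterwards is an added assumption, made so that the helpers behave more like real pread/pwrite. Unlike the real calls, the seek and the transfer are two separate steps, so this is not safe when several threads share one descriptor.

```python
import os


def pread_fallback(fd: int, length: int, offset: int) -> bytes:
    """Emulate pread() with lseek() + read(); not atomic."""
    saved = os.lseek(fd, 0, os.SEEK_CUR)      # remember the current offset
    try:
        os.lseek(fd, offset, os.SEEK_SET)     # seek to the requested position
        return os.read(fd, length)            # may return fewer bytes than asked
    finally:
        os.lseek(fd, saved, os.SEEK_SET)      # restore the offset (assumption)


def pwrite_fallback(fd: int, data: bytes, offset: int) -> int:
    """Emulate pwrite() with lseek() + write(); not atomic."""
    saved = os.lseek(fd, 0, os.SEEK_CUR)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.write(fd, data)             # returns the number of bytes written
    finally:
        os.lseek(fd, saved, os.SEEK_SET)
```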


For now, please delete the XOPEN_SOURCE line at the very beginning of 
src/mpi/romio/adio/ad_lustre/ad_lustre_rwcontig.c


==rob





In our system, there are two lustre_user.h available:
$ locate lustre_user.h
/usr/include/linux/lustre_user.h
/usr/include/lustre/lustre_user.h
As I'm not very conversant with Lustre, I have simply attached both of them.

pk224850@cluster:~[509]$ uname -a
Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue
Sep 9 13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux

pk224850@cluster:~[510]$ cat /etc/issue
Scientific Linux release 6.5 (Carbon)

Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our
environment.

Best

Paul Kapinos

P.S. Is there a configure flag that enforces ROMIO, so that configure fails
when ROMIO is not available? This would make such hidden errors visible at
installation time.






a) Log in Open MPI's config.log:
--

configure:226781: OMPI configuring in ompi/mca/io/romio/romio
configure:226866: running /bin/sh './configure'
--with-file-system=testfs+ufs+nfs+lustre  FROM_OMPI=yes CC="icc
-std=c99" CFLAGS="-DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2
-m64 -finline-functions -fno-strict-aliasing -restrict -fexceptions
-Qoption,cpp,--extended_float_types -pthread" CPPFLAGS="
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include"
FFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64  "
LDFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64
-fexceptions " --enable-shared --disable-static
--with-file-system=testfs+ufs+nfs+lustre
--prefix=/opt/MPI/openmpi-1.8.3/linux/intel --disable-aio
--cache-file=/dev/null --srcdir=. --disable-option-checking
configure:226876: /bin/sh './configure' *failed* for
ompi/mca/io/romio/romio
configure:226911: WARNING: ROMIO distribution did not configure
successfully
configure:227425: checking if MCA component io:romio can compile
configure:227427: result: no
--




b) dump of Open MPI's 'configure' output to the console:
--

checking lustre/lustre_user.h usability... no
checking lustre/lustre_user.h presence... yes
configure: WARNING: lustre/lustre_user.h: present but cannot be compiled
configure: WARNING: lustre/lustre_user.h: check for missing
prerequisite headers?
configure: WARNING: lustre/lustre_user.h: see the Autoconf documentation
configure: WARNING: lustre/lustre_user.h: section "Present But
Cannot Be Compiled"
configure: WARNING: lustre/lustre_user.h: proceeding with the compiler's
result
configure: WARNING: ##  ##
configure: WARNING: ## Report this to disc...@mpich.org ##
configure: WARNING: ##  ##
checking for lustre/lustre_user.h... no
configure: error: LUSTRE support requested but cannot find
lustre/lustre_user.h header file
configure: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
configure: WARNING: ROMIO distribution did not configure successfully
checking if MCA component io:romio can compile... no
--


c) ompi/mca/io/romio/romio's config.log:
--

configure:20962: checking lustre/lustre_user.h 

Re: [OMPI devel] 1.8.3 and PSM errors

2014-10-28 Thread Adrian Reber
Good to know. I will update the infinipath libraries on the next
occasion and report back. This will probably take a few days (or weeks).

Adrian

On Mon, Oct 27, 2014 at 10:22:08PM +, Friedley, Andrew wrote:
> Hi Adrian,
> 
> I'm unable to reproduce here with OMPI v1.8.3 (I assume you're doing this 
> with one 8-core node):
> 
> $ mpirun -np 32 -mca pml cm -mca mtl psm ./mpi_test_suite -t "environment"
> (Rank:0) tst_test_array[0]:Status
> (Rank:0) tst_test_array[1]:Request_Null
> (Rank:0) tst_test_array[2]:Type_dup
> (Rank:0) tst_test_array[3]:Get_version
> Number of failed tests:0
> 
> Works with various np from 8 to 32.  Your original case:
> 
> $ mpirun -np 32 ./mpi_test_suite -t "All,^io,^one-sided"
> 
> Runs for a while and eventually hits send cancellation errors.
> 
> Any chance you could try updating your infinipath libraries?
> 
> Andrew
> 
> > -Original Message-
> > From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Adrian
> > Reber
> > Sent: Monday, October 27, 2014 9:11 AM
> > To: Open MPI Developers
> > Subject: Re: [OMPI devel] 1.8.3 and PSM errors
> > 
> > This is a simpler test setup:
> > 
> > On 8 core machines this works:
> > 
> > $ mpirun  -np 8  mpi_test_suite -t "environment"
> > [...]
> > Number of failed tests:0
> > 
> > Using 9 or more cores it fails:
> > 
> > $ mpirun  -np 9  mpi_test_suite -t "environment"
> > 
> > mpi_test_suite:20293 terminated with signal 11 at PC=2b6d107fa9a4
> > SP=7fff06431a70.  Backtrace:
> > /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b6d107fa9a4]
> > /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b6d107eb172]
> > /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b6d0fa6e384]
> > /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b6d0f93376a]
> > /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b6d0f963d42]
> > mpi_test_suite[0x46cd00]
> > mpi_test_suite[0x44434c]
> > /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b6d10047d5d]
> > mpi_test_suite[0x4058e9]
> > ---
> > Primary job  terminated normally, but 1 process returned a non-zero exit
> > code.. Per user-direction, the job has been aborted.
> > ---
> > 
> > mpi_test_suite:11212 terminated with signal 11 at PC=2b2c27d0d9a4
> > SP=75020430.  Backtrace:
> > /usr/lib64/libpsm_infinipath.so.1(ips_proto_connect+0x334)[0x2b2c27d0d9a4]
> > /usr/lib64/libpsm_infinipath.so.1(__psm_ep_connect+0x692)[0x2b2c27cfe172]
> > /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_mtl_psm_add_procs+0x1a4)[0x2b2c26f81384]
> > /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(ompi_comm_get_rprocs+0x2fa)[0x2b2c26e4676a]
> > /opt/bwhpc/common/mpi/openmpi/1.8.3-gnu-4.9/lib/libmpi.so.1(MPI_Intercomm_create+0x332)[0x2b2c26e76d42]
> > mpi_test_suite[0x46cd00]
> > mpi_test_suite[0x44434c]
> > /lib64/libc.so.6(__libc_start_main+0xfd)[0x2b2c2755ad5d]
> > mpi_test_suite[0x4058e9]
> > --
> > mpirun detected that one or more processes exited with non-zero status,
> > thus causing the job to be terminated. The first process to do so was:
> > 
> >   Process name: [[47415,1],0]
> >   Exit code:1
> > --
> > 
> > 
> > 
> > On Mon, Oct 27, 2014 at 08:27:17AM -0700, Ralph Castain wrote:
> > > I’m afraid I can’t quite decipher from all this what actually fails. Of
> > > course, PSM doesn’t support dynamic operations like comm_spawn or
> > > connect_accept, so if you are running those tests that just won’t work. Is
> > > that the heart of the problem here?
> > >
> > >
> > > > On Oct 27, 2014, at 1:40 AM, Adrian Reber  wrote:
> > > >
> > > > Running Open MPI 1.8.3 with PSM does not seem to work right now at all.
> > > > I am getting the same errors also on trunk from my newly set up MTT.
> > > > Before trying to debug this I just wanted to make sure this is not a
> > > > configuration error. I have following PSM packages installed:
> > > >
> > > > infinipath-devel-3.1.1-363.1140_rhel6_qlc.noarch
> > > > infinipath-libs-3.1.1-363.1140_rhel6_qlc.x86_64
> > > > infinipath-3.1.1-363.1140_rhel6_qlc.x86_64
> > > >
> > > > with 1.6.5 I do not see PSM errors and the test suite fails much later:
> > > >
> > > > P2P tests Many-to-one with MPI_Iprobe (MPI_ANY_SOURCE) (21/48),
> > comm
> > > > Intracomm merged of the Halved Intercomm (13/13), type
> > > > MPI_TYPE_MIX_ARRAY (28/29) P2P tests Many-to-one with MPI_Iprobe
> > > > (MPI_ANY_SOURCE) (21/48), comm Intracomm merged of the Halved
> > > > Intercomm (13/13), type MPI_TYPE_MIX_LB_UB (29/29)
> > > > n050304:5.0.Cannot cancel send requests (req=0x2ad8ba881f80) P2P
> > > > 

Re: [OMPI devel] fixing a bug in 1.8 that's not in master

2014-10-28 Thread Jeff Squyres (jsquyres)
I just updated the wiki:

NOTE: Pull requests on ompi-release must include a hash reference in the 
body/comments corresponding to the commit(s) on ompi:master from which it is 
derived, OR indicate that this is solely a release branch bug (i.e., there's no 
corresponding commit on ompi:master because the bug doesn't/didn't exist on 
ompi:master).




On Oct 27, 2014, at 9:57 PM, Ralph Castain 
> wrote:

Just create a topic branch from v1.8 in a local clone of ompi-release, make the 
change there, and then file a PR on the ompi-release repo

Obviously, if it is a bug solely confined to v1.8, you can’t put it in master 
first :-)


On Oct 27, 2014, at 3:22 PM, Howard Pritchard 
> wrote:

Hi Folks,

A cut-and-paste error seems to have happened with
plm_alps_modules.c in 1.8, which causes a compile failure
when building for cray.  So right now, there's no building
ompi 1.8 for crays.

The problem is not present in master.

For these kinds of problems, are we supposed to bypass
all the "has to be in master, need commit, etc." stuff described in

https://github.com/open-mpi/ompi/wiki/SubmittingPullRequests

and just go straight to pushing to a fork of ompi-release, etc.
as per the rest of the instructions on submitting pull requests?

Just want to make sure I'm doing the right thing here.

Howard





--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI devel] ROMIO+Lustre problems in OpenMPI 1.8.3

2014-10-28 Thread Paul Kapinos

Dear Open MPI and ROMIO developer,

We use Open MPI v1.6.x and v1.8.x in our cluster.
We have a Lustre file system and wish to use MPI_IO,
so our Open MPI builds are configured with this flag:
> --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre'

In our newest installation openmpi/1.8.3 we found that MPI_IO is *broken*.

A short search for the root cause brought the following to light:

- the ROMIO component 'MCA io: romio' is missing entirely from the affected
version, because

- configure of ROMIO has *failed* (cf. logs a, b, c),
- because lustre_user.h was found but could not be compiled.


In our system, there are two lustre_user.h available:
$ locate lustre_user.h
/usr/include/linux/lustre_user.h
/usr/include/lustre/lustre_user.h
As I'm not very conversant with Lustre, I have simply attached both of them.

pk224850@cluster:~[509]$ uname -a
Linux cluster.rz.RWTH-Aachen.DE 2.6.32-431.29.2.el6.x86_64 #1 SMP Tue Sep 9 
13:45:55 CDT 2014 x86_64 x86_64 x86_64 GNU/Linux


pk224850@cluster:~[510]$ cat /etc/issue
Scientific Linux release 6.5 (Carbon)

Note that openmpi/1.8.1 seems to be fully OK (MPI_IO works) in our environment.

Best

Paul Kapinos

P.S. Is there a configure flag that enforces ROMIO, so that configure fails when 
ROMIO is not available? This would make such hidden errors visible at 
installation time.
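
Independently of any configure option, a missing ROMIO component can be caught right after installation by scanning the component listing printed by `ompi_info`. The sketch below is only illustrative: the function name `has_romio` is made up, and it assumes the component appears on a line of the form 'MCA io: romio', as quoted above.

```python
def has_romio(ompi_info_output: str) -> bool:
    """Return True if the text lists the ROMIO io component.

    Assumes each component is printed on a line starting with
    'MCA <framework>: <component>' after optional leading whitespace.
    """
    for line in ompi_info_output.splitlines():
        fields = line.split()
        # Match exactly the framework "io" and the component "romio".
        if fields[:3] == ["MCA", "io:", "romio"]:
            return True
    return False
```

To use it, pass in the text printed by `ompi_info` (for example, captured with `subprocess.run(["ompi_info"], capture_output=True, text=True).stdout`) and let the installation script fail when the result is False.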







a) Log in Open MPI's config.log:
--
configure:226781: OMPI configuring in ompi/mca/io/romio/romio
configure:226866: running /bin/sh './configure' 
--with-file-system=testfs+ufs+nfs+lustre  FROM_OMPI=yes CC="icc -std=c99" 
CFLAGS="-DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2 -m64 
-finline-functions -fno-strict-aliasing -restrict -fexceptions 
-Qoption,cpp,--extended_float_types -pthread" CPPFLAGS=" 
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include 
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent 
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include" 
FFLAGS="-O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64  " LDFLAGS="-O3 -ip 
-axAVX,SSE4.2,SSE4.1 -fp-model fast=2   -m64   -fexceptions " --enable-shared 
--disable-static --with-file-system=testfs+ufs+nfs+lustre 
--prefix=/opt/MPI/openmpi-1.8.3/linux/intel --disable-aio --cache-file=/dev/null 
--srcdir=. --disable-option-checking

configure:226876: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
configure:226911: WARNING: ROMIO distribution did not configure successfully
configure:227425: checking if MCA component io:romio can compile
configure:227427: result: no
--



b) dump of Open MPI's 'configure' output to the console:
--
checking lustre/lustre_user.h usability... no
checking lustre/lustre_user.h presence... yes
configure: WARNING: lustre/lustre_user.h: present but cannot be compiled
configure: WARNING: lustre/lustre_user.h: check for missing prerequisite 
headers?

configure: WARNING: lustre/lustre_user.h: see the Autoconf documentation
configure: WARNING: lustre/lustre_user.h: section "Present But Cannot Be 
Compiled"

configure: WARNING: lustre/lustre_user.h: proceeding with the compiler's result
configure: WARNING: ##  ##
configure: WARNING: ## Report this to disc...@mpich.org ##
configure: WARNING: ##  ##
checking for lustre/lustre_user.h... no
configure: error: LUSTRE support requested but cannot find lustre/lustre_user.h 
header file

configure: /bin/sh './configure' *failed* for ompi/mca/io/romio/romio
configure: WARNING: ROMIO distribution did not configure successfully
checking if MCA component io:romio can compile... no
--

c) ompi/mca/io/romio/romio's config.log:
--
configure:20962: checking lustre/lustre_user.h usability
configure:20962: icc -std=c99 -c -DNDEBUG -O3 -ip -axAVX,SSE4.2,SSE4.1 -fp-model 
fast=2 -m64 -finline-functions -fno-strict-aliasing -restrict -fexceptions 
-Qoption,cpp,--extended_float_types -pthread 
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/hwloc/hwloc172/hwloc/include 
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent 
-I/w0/tmp/pk224850/linuxc2_9713/openmpi-1.8.3_linux64_intel/opal/mca/event/libevent2021/libevent/include 
conftest.c >&5

/usr/include/sys/quota.h(221): error: identifier "caddr_t" is undefined
 caddr_t __addr) __THROW;
 ^

compilation aborted for conftest.c (code 2)
configure:20962: $? = 2
configure: failed program was:
| /* confdefs.h */
|