[OMPI devel] Updated prrte submodule on "main" branch

2024-05-02 Thread Jeff Squyres (jsquyres) via devel
https://github.com/open-mpi/ompi/pull/12449 has just been merged, meaning that 
the PRRTE git submodule on Open MPI's main branch now points to the Open MPI 
fork of PRRTE, not the upstream/community PRRTE.  You will either need to get a 
new github clone to pick up the new submodule change, or you will need to run 
the following after updating to the latest tip of main:

git submodule sync
git submodule update --recursive

Note that you may periodically need to run these commands when switching back 
and forth between the main and v5.0.x branches (because the v5.0.x branch PRRTE 
git submodule still points to the upstream/community PRRTE).  Per discussion at 
the face-to-face meeting last week, there are no current plans to update the 
v5.0.x PRRTE git submodule away from the upstream/community PRRTE – the plan is 
to ride out the rest of v5.0.x with PRRTE as-is.
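
For example, a minimal sketch of switching between the two branches -- assuming 
a clone in which the submodules have already been initialized:

git checkout v5.0.x
git submodule sync
git submodule update --recursive

git checkout main
git submodule sync
git submodule update --recursive

"git submodule sync" re-points the submodule remotes at whatever URLs the 
currently checked-out branch specifies, and "git submodule update --recursive" 
then checks out the submodule commits that branch references.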

The updated PRRTE submodule on main reflects the change we intend to ship in v6.0.x.  
Put differently: PRRTE is now a (hidden) implementation detail of mpirun on 
main.

Documentation and other tighter (re)integration will occur on main over time.

--
Jeff Squyres


Re: [OMPI devel] Github Action to auto-close stale/abandoned Github Issues

2024-02-16 Thread Jeff Squyres (jsquyres) via devel
The auto-close-GitHub-issues bot has been merged 
(https://github.com/open-mpi/ompi/pull/12329).

If you apply the label "State: Awaiting user information" to a Github issue and 
there's no reply in 4 weeks, the issue will be closed with a polite message.
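
For anyone curious what such a workflow looks like, below is a rough, generic 
sketch using the stock actions/stale Github Action.  This is illustrative only; 
the actual workflow merged in the PR above may be configured differently 
(the name, messages, and schedule here are made up):

name: Close stale "awaiting user information" issues
on:
  schedule:
    - cron: '0 4 * * *'      # check once a day

jobs:
  stale:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/stale@v9
        with:
          # Only look at issues carrying this label
          only-labels: 'State: Awaiting user information'
          # Warn after 2 weeks of silence, close 2 weeks after the warning
          days-before-stale: 14
          days-before-close: 14
          # Never touch pull requests
          days-before-pr-stale: -1
          days-before-pr-close: -1
          stale-issue-message: >
            It has been two weeks with no reply; if there is still no response
            in two more weeks, this issue will be closed.
          close-issue-message: >
            It has been a month with no reply, so this issue is being closed.
            Please re-open (or open a new issue) if the problem persists.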
____
From: Jeff Squyres (jsquyres)
Sent: Tuesday, February 13, 2024 12:15 PM
To: Open MPI Developers 
Subject: Github Action to auto-close stale/abandoned Github Issues

Looking for feedback on auto-closing Github Issues: 
https://github.com/open-mpi/ompi/pull/12329.  The rough idea is:

  1.  OMPI community member puts the "State: Awaiting User Information" label on a 
Github Issue (this does not apply to PRs)
  2.  If there's no reply in X=14 days, the bot will emit a comment saying 
(paraphrase) "Hey, it's been 2 weeks, if you don't reply in 2 more weeks, we'll 
auto-close"
  3.  If there's no reply in Y=14 more days, the bot will emit a comment saying 
(paraphrase) "It's been a month with no reply; I'm closing", and then it will 
close the issue.

Thoughts?  Comments?  Suggestions for different X or Y values?

--
Jeff Squyres


[OMPI devel] Github Action to auto-close stale/abandoned Github Issues

2024-02-13 Thread Jeff Squyres (jsquyres) via devel
Looking for feedback on auto-closing Github Issues: 
https://github.com/open-mpi/ompi/pull/12329.  The rough idea is:

  1.  OMPI community member puts the "State: Awaiting User Information" label on a 
Github Issue (this does not apply to PRs)
  2.  If there's no reply in X=14 days, the bot will emit a comment saying 
(paraphrase) "Hey, it's been 2 weeks, if you don't reply in 2 more weeks, we'll 
auto-close"
  3.  If there's no reply in Y=14 more days, the bot will emit a comment saying 
(paraphrase) "It's been a month with no reply; I'm closing", and then it will 
close the issue.

Thoughts?  Comments?  Suggestions for different X or Y values?

--
Jeff Squyres


[OMPI devel] Tuesday Webex meeting schedule

2023-12-19 Thread Jeff Squyres (jsquyres) via devel
Here's the schedule for our next few weekly Webex meetings:


  *   December 26, 2023: canceled (happy holidays!)
  *   January 2, 2024: canceled (happy holidays!)
  *   January 9, 2024: first meeting of 2024

There is a new Webex link for the 2024 meetings; it has been published in the 
developer Slack channel (we don't publish the meeting link publicly because 
we've had problems with spammers in the past).  If you wish to attend but are 
not in the Slack channel, email me directly for the new 2024 meeting link.

Tommy and Wenduo have a proposal for moving to an agenda-driven weekly meeting, 
similar to how OFI (and possibly UCX?) conducts their meetings.  They'll 
present details of how this will work in January.  If everyone agrees, we can 
move to that model (potentially in February?).

--
Jeff Squyres


[OMPI devel] Open MPI BOF at SC'23

2023-11-06 Thread Jeff Squyres (jsquyres) via devel
We're excited to see everyone next week in Denver, Colorado, USA at SC23!

Open MPI will be hosting our usual State of the Union Birds of a Feather (BOF) 
session on Wednesday, November 15, 2023, from 12:15-1:15pm US Mountain time.  
The event is in-person only; SC does not allow us to livestream.

During the BOF, we'll present the state of Open MPI, where we are, and where 
we're going.  We also use the BOF as an opportunity to directly respond to your 
questions.  We only have an hour, so it's really helpful if you submit your 
questions ahead of time so that we can include them directly in our 
presentation.  We'll obviously take questions in-person, too, and will be 
available after the presentation as well, but chances are: if you have a 
question, others have the same question.  So submit your questions to us so 
that we can include them in the presentation!

Hope to see you in Denver!

--
Jeff Squyres


[OMPI devel] Fw: MPI 4.1 Published

2023-11-04 Thread Jeff Squyres (jsquyres) via devel
FYI.


From: mpi-forum  on behalf of Martin 
Schulz via mpi-forum 
Sent: Friday, November 3, 2023 7:26 AM
To: Pritchard Jr., Howard via mpi-forum 
Subject: [Mpi-forum] MPI 4.1 Published


Dear MPI Forum Community,



For all of you who could not join us yesterday, I am happy and excited to 
announce that we finalized and then ratified the MPI 4.1 Standard yesterday! 
The document is already online at:



https://www.mpi-forum.org/docs/mpi-4.1/mpi41-report.pdf



Many thanks to all of you for contributing to MPI 4.1 and making it possible to 
get this version out before SC23. Special thanks to Wes and Bill who wrangled 
with git and Latex issues until the very end.



I know this was a lot of work, both regarding new functionality and all the 
clean-up items (which may not look like that much on paper, but I think will 
really help the readability and sustainability of the standard), as well as the 
tough final push! This is really a significant accomplishment and it wouldn’t 
have been possible without the dedication, time and expertise from this group!



If you are at SC23, please join us for the MPI Forum BoF (Tuesday 12:15-13:15 
local Denver time) where we will present MPI 4.1 to the HPC community. If we 
get critical mass, we could also attempt a group photo at this time!



Also, please help publicize the new standard through your various channels! If 
there are any particular ideas that we could do as a whole forum, please let us 
know as well!



Thanks again and I am looking forward to continuing to work with all of you 
towards our next milestones MPI 4.2 and MPI 5.0!



Martin


PS: As we finished our agenda items for this meeting yesterday, the fourth and 
last day of the voting meeting is cancelled. Enjoy the weekend!

--

Prof. Dr. Martin Schulz, Chair of Computer Architecture and Parallel Systems

Department of Informatics, TU-Munich, Boltzmannstraße 3, D-85748 Garching

Member of the Board of Directors at the Leibniz Supercomputing Centre (LRZ)

Email: schu...@in.tum.de




Re: [OMPI devel] Question about the future of C++ API support

2023-10-30 Thread Jeff Squyres (jsquyres) via devel
The C++ bindings were removed from the MPI-3.0 standard in 2012 -- they've been 
officially deleted for 12 years.  The MPI Forum talked extensively with users 
about this before removing them.  We held on to the C++ bindings here in Open 
MPI for a long time, but finally deleted them in 5.0.0.

Do you know if the packages you cited actually use the MPI C++ bindings, or are 
they C++ packages that happen to use MPI C bindings?

One of the reasons that the Forum decided to delete the C++ bindings was 
that the C++ app developer community overwhelmingly stated that they used the C 
bindings (because the C++ bindings were almost exactly a 1:1 mapping to the C 
bindings, and didn't offer much more/different functionality than the C 
bindings).  There were a small number of users who actually used the MPI C++ 
bindings, but to my knowledge, they all migrated to the C bindings over time.
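
(For anyone unfamiliar with the distinction: a "C++ package that happens to use 
the MPI C bindings" is just C++ code calling the plain C API -- no MPI:: 
namespace anywhere, which is what the removed C++ bindings provided.  A trivial, 
hypothetical example:)

#include <mpi.h>
#include <iostream>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);               // C binding, called from C++

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); // C binding again
    std::cout << "hello from rank " << rank << std::endl;

    MPI_Finalize();
    return 0;
}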


From: devel on behalf of Orion Poplawski via devel
Sent: Thursday, October 26, 2023 11:15 PM
To: Open MPI Development
Cc: Orion Poplawski
Subject: [OMPI devel] Question about the future of C++ API support

I see that openmpi 5.0.0 drops the C++ API.  I'm trying to find some
background on this.

In particular, is this likely to be a permanent change?

Also, have there been discussions of users of the C++ API?  In Fedora we
seem to have:

MUSIC-1.1.16-13.20201002git8c6b77a.fc39.src.rpm
boost-1.81.0-9.fc40.src.rpm
bout++-5.0.0-11.fc40.src.rpm
coin-or-Ipopt-3.14.12-2.fc39.src.rpm
combblas-2.0.0-4.fc40.src.rpm
freefem++-4.13-6.fc40.src.rpm
ga-5.8.2-2.fc39.src.rpm
gloo-0.5.0^git20230824.01a0c81-6.fc40.src.rpm
gmsh-4.11.1-6.fc39.src.rpm
hpx-1.9.1-1.fc40.src.rpm
intel-mpi-benchmarks-2021.3-4.fc39.src.rpm
libneurosim-1.2.0-8.20210110.gitafc003f.fc39.src.rpm
mathgl-8.0.1-6.fc39.src.rpm
mfem-4.6-1.fc40.src.rpm
nest-3.4-1.fc39.src.rpm
netcdf-cxx4-4.3.1-9.fc39.src.rpm
netgen-mesher-6.2.2202-7.fc39.src.rpm
orsa-0.7.0-61.fc39.src.rpm
python-steps-3.6.0-30.fc39.src.rpm
scalasca-2.6.1-2.fc38.src.rpm

Thanks.

--
Orion Poplawski
he/him/his  - surely the least important thing about me
IT Systems Manager 720-772-5637
NWRA, Boulder/CoRA Office FAX: 303-415-9702
3380 Mitchell Lane   or...@nwra.com
Boulder, CO 80301 https://www.nwra.com/


Re: [OMPI devel] Asking for a letter of support

2023-10-06 Thread Jeff Squyres (jsquyres) via devel
I see that you also emailed MPICH with the same question.  I'll reply with 
more-or-less the same response: I am not a lawyer, and this is not legal 
advice, but Open MPI has a very permissive license (see the LICENSE file); you 
should be able to use it in your research without any letter of support from 
the Open MPI community.

Note, too, that this mailing list goes to several hundred people around the 
world.  If you are looking for a letter for support to include in a funding 
proposal, you might want to be a bit more targeted and ask someone who is 
familiar with you and/or your specific work.  You might also want to give them 
more than 1 day to respond.

From: devel  on behalf of Diao, Xiaoxu via 
devel 
Sent: Friday, October 6, 2023 11:44 AM
To: devel@lists.open-mpi.org 
Cc: Diao, Xiaoxu 
Subject: [OMPI devel] Asking for a letter of support


Dear Sir/Madam,



My name is Xiaoxu Diao. I am a research associate at The Ohio State University. 
We are in the process of developing a proposal for an SBIR/STTR program. We are 
planning to use your library (Open MPI) in our application. Please let us know 
whom we should contact to obtain a formal letter of support indicating support 
for the use of your library during the Phase I development of our application. 
We would appreciate it if we could receive your response before 10/7/2023.

Thank you.



Best regards,

Xiaoxu Diao




Re: [OMPI devel] MPI ABI effort

2023-08-29 Thread Jeff Squyres (jsquyres) via devel
An interesting point was brought up on the dev Webex today: we should probably 
finish MPI_Count first.

Put differently: the value of the ABI is diminished if Open MPI doesn't support 
MPI_Count.
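
(For context: "MPI_Count support" here refers to the MPI-4.0 large-count APIs, 
i.e., the _c variants that take an MPI_Count instead of an int.  A minimal, 
hypothetical sketch of what an application would call -- assuming an 
implementation that provides these functions, which is exactly the in-progress 
work being discussed:)

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* MPI_Count can describe more than 2^31 - 1 elements, which the
       classic int-count MPI_Send()/MPI_Recv() cannot.  Kept small here
       so the example actually runs. */
    MPI_Count count = 1 << 20;
    char *buf = calloc((size_t)count, 1);

    if (0 == rank) {
        MPI_Send_c(buf, count, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (1 == rank) {
        MPI_Recv_c(buf, count, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}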

From: Howard Pritchard 
Sent: Tuesday, August 29, 2023 12:20 PM
To: Open MPI Developers 
Cc: Jeff Squyres (jsquyres) 
Subject: Re: [OMPI devel] MPI ABI effort

LANL would be interested in supporting this feature as well.

Howard

On Mon, Aug 28, 2023 at 9:58 AM Jeff Squyres (jsquyres) via devel 
mailto:devel@lists.open-mpi.org>> wrote:
We got a presentation from the ABI WG (proxied via Quincey from AWS) a few 
months ago.

The proposal looked reasonable.

No one has signed up to do the work yet, but based on what we saw in that 
presentation, the general consensus was "sure, we could probably get on board 
with that."

There are definitely going to be issues to be worked out (e.g., are we going to 
break Open MPI ABI? Maybe offer 2 flavors of ABI? Is this a configure-time 
option, or do we build "both" ways?  ...etc.), but it sounded like the 
community members who heard this proposal were generally in favor of moving in 
this direction.

From: devel 
mailto:devel-boun...@lists.open-mpi.org>> on 
behalf of Gilles Gouaillardet via devel 
mailto:devel@lists.open-mpi.org>>
Sent: Saturday, August 26, 2023 2:20 AM
To: Open MPI Developers 
mailto:devel@lists.open-mpi.org>>
Cc: Gilles Gouaillardet 
mailto:gilles.gouaillar...@gmail.com>>
Subject: [OMPI devel] MPI ABI effort

Folks,

Jeff Hammond et al. published "MPI Application Binary Interface 
Standardization" last week:
https://arxiv.org/abs/2308.11214

The paper notes that the (C) ABI has already been prototyped natively in MPICH.

Is there any current interest into prototyping this ABI into Open MPI?


Cheers,

Gilles


Re: [OMPI devel] MPI ABI effort

2023-08-28 Thread Jeff Squyres (jsquyres) via devel
We got a presentation from the ABI WG (proxied via Quincey from AWS) a few 
months ago.

The proposal looked reasonable.

No one has signed up to do the work yet, but based on what we saw in that 
presentation, the general consensus was "sure, we could probably get on board 
with that."

There are definitely going to be issues to be worked out (e.g., are we going to 
break Open MPI ABI? Maybe offer 2 flavors of ABI? Is this a configure-time 
option, or do we build "both" ways?  ...etc.), but it sounded like the 
community members who heard this proposal were generally in favor of moving in 
this direction.

From: devel  on behalf of Gilles Gouaillardet 
via devel 
Sent: Saturday, August 26, 2023 2:20 AM
To: Open MPI Developers 
Cc: Gilles Gouaillardet 
Subject: [OMPI devel] MPI ABI effort

Folks,

Jeff Hammond et al. published "MPI Application Binary Interface 
Standardization" last week:
https://arxiv.org/abs/2308.11214

The paper notes that the (C) ABI has already been prototyped natively in MPICH.

Is there any current interest into prototyping this ABI into Open MPI?


Cheers,

Gilles


[OMPI devel] Open MPI 4.1.6rc1 release candidate posted

2023-08-08 Thread Jeff Squyres (jsquyres) via devel
There have been enough minor fixes to warrant a 4.1.6 release.  We've posted 
4.1.6rc1 tarballs in the usual location: 
https://www.open-mpi.org/software/ompi/v4.1/


Changes since v4.1.5:

  *   Update to properly handle PMIx v>=4.2.3.  Thanks to Bruno Chareyre, 
Github user @sukanka, and Christof Koehler for raising the compatibility issues 
and helping test the fixes.

  *   Fix minor issues and add some minor performance optimizations with OFI 
support.

  *   Support the "striping_factor" and "striping_unit" MPI_Info names 
recommended by the MPI standard for parallel IO (see the sketch below).

  *   Fixed some minor issues with UCX support.

  *   Minor optimization for 0-byte MPI_Alltoallw (i.e., make it a no-op).
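
As a hypothetical illustration of the striping hints mentioned above: an 
application passes them through an MPI_Info object at file-open time, roughly 
like this (implementations and file systems are free to ignore hints):

#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    MPI_Info info;
    MPI_Info_create(&info);
    /* MPI-standard-recommended hint names for parallel IO striping;
       values are strings. */
    MPI_Info_set(info, "striping_factor", "8");      /* e.g., stripe across 8 targets */
    MPI_Info_set(info, "striping_unit", "1048576");  /* 1 MiB stripe size */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
    /* ... MPI_File_write_at_all(), etc. ... */
    MPI_File_close(&fh);

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}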

We would welcome any testing and feedback.

Thanks,

Brian & Jeff


Re: [OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)

2023-03-16 Thread Jeff Squyres (jsquyres) via devel
Is there any progress on this front?

(I just asked the same question on 
https://github.com/open-mpi/ompi/issues/11409 -- someone else also asked about 
restoring 32-bit support)

v5.0.0 is potentially getting close; we're running out of time if a community 
maintainer wants to preserve 32 bit support.

From: Jeff Squyres (jsquyres) 
Sent: Tuesday, February 7, 2023 4:07 PM
To: Yatindra Vaishnav ; Open MPI Developers 

Subject: 32 bit issues (was: 32 bit support needs a maintainer)

Just try to build Open PMIx -- by itself, not as embedded in Open MPI -- in 32 
bit mode and you'll see the compile failures.

I think that PMIx's configure has an "if building in 32 bit mode, print a 
message that this is not supported and abort" check (just like Open MPI).  You will 
need to remove that check in order to be able to compile Open PMIx in 32 bit 
mode.



From: Yatindra Vaishnav 
Sent: Tuesday, February 7, 2023 4:02 PM
To: Open MPI Developers ; Jeff Squyres (jsquyres) 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer


Hi Jeff,

Where can I see the bugs of 32-bit support on OpenPMIx? I see this link:

Memory Leaks · Issue #1276 · openpmix/openpmix 
(github.com)<https://github.com/openpmix/openpmix/issues>



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Yatindra Vaishnav via devel<mailto:devel@lists.open-mpi.org>
Sent: Monday, February 6, 2023 1:27 PM
To: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Cc: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Sure Jeff, Let me take a look.





From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 1:03 PM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  OpenPMIx is the "bottom" of the stack of Open MPI --> PRRTE --> OpenPMIx, 
so fixing the 32 bit issues there first is what makes sense -- see 
https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/required-support-libraries.html#library-dependencies.



We have been trying to get Open MPI v5.0.0 out for quite a while, and seem to 
actually be getting closer to getting over the finish line.  So getting these 
fixes in sooner rather than later would be better.



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 2:56 PM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer



Yes Jeff, I can give roughly 5-8 hours a week. And yes I can take care of 
OpenPMIx bugs first and will look into others afterwards?



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 11:48 AM
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  What kind of timeframe do you have to work on this?  Will you be able to 
look into the OpenPMIx 32-bit bugs in the immediate future, and then start 
testing with PRRTE and Open MPI?



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 10:57 AM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Hi Jeff,

Nice to your response. And yes I responded to the same mail. I registered 
myself for OpenMPI development community. And I'm aware about the 
responsibilities. And I can setup a 32-bit VMs to do bug fixes and testing. I 
already have a server machine which I can help with.





Get Outlook for iOS<https://aka.ms/o0ukef>



From: Jeff Squyres (jsquyres) 
Sent: Monday, February 6, 2023 7:45 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Greetings Yatindra; thanks for responding.



Just curious: are you replying in response to the discussion that just came up 
a few days ago on https://github.com/open-mpi/ompi/pull/11282, where we set the 
upcoming Open MPI v5.0's configure script to abort in 32-bit environments?  
I.e., are you part of the Debian community?



Regardless, there are several aspects of 32 bit support that are needed:

  *   Bug fixes in OpenPMIx
  *   Potential bug fixes in PRRTE
  *   Potential bug fixes in Open MPI
  *   Testing of all 3 of the above

Specifically: we know that there are 32-bit bugs in OpenPMIx that need to be 
resolved.  As such, we stopped testing 32 bit in all 3 projects quite a while 
ago.  Hence, even when 32-bit support is re-enabled in OpenPMIx, other 32-bit 
bugs may surface in PRRTE and Open MPI that require fixing.



There will also need to be continual testing of all 3 of these in 32-bit 
environments.  Th

Re: [OMPI devel] 5.0.x bug release blockers hard stop

2023-02-22 Thread Jeff Squyres (jsquyres) via devel
This seems like an odd stance to take.  You're basically saying "Damn the 
torpedoes, full speed ahead."  Or, put differently, "I don't care what the bug 
is, unless it's data corruption, we're not calling it a blocker."  Technically, 
that means that I could merge a PR that removes configure.ac, and that would 
not be considered a blocker.  This is especially true in light of what Josh 
said -- we can't even make distribution tarballs right now because there's a 
problem in ROMIO's configure (which has been known for weeks, but no one has 
fixed it yet).  That is definitely a blocker, even though it is not a data 
corruption bug.

I think I would be amenable to an intention more along the lines of "let's 
closely evaluate bugs that come up and see if they really are​ blockers, 
because we really want to get v5.0.x out the door as soon as possible."

From: devel  on behalf of Josh Hursey via 
devel 
Sent: Tuesday, February 21, 2023 7:37 PM
To: Open MPI Developers 
Cc: Josh Hursey 
Subject: Re: [OMPI devel] 5.0.x bug release blockers hard stop

I think that's fine, with the exception of fixing the ROMIO packaging issue - 
which is a blocker (the release will not build without it fixed):
 - https://github.com/open-mpi/ompi/issues/11364

On Tue, Feb 21, 2023 at 4:58 PM Zhang, William via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Hello everyone,



During the weekly meeting today, I proposed that we stop labeling bugs as 
blockers for the 5.0.0 release which was seconded with the exception of data 
corruption bugs. In the interest of having 5.0.0 released, we must put a hard 
stop on diagnosing and ingesting bug fixes. We can have these be brought in 
during later bugfix releases. Due to the low attendance during the meeting 
today and lack of Jeff, Brian, 5.0.x release managers, I wanted to re-iterate 
on this through e-mail to get wider buy-in. Please levy any objections to this 
here.



Thanks,
William


--
Josh Hursey
IBM Spectrum MPI Developer


Re: [OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)

2023-02-13 Thread Jeff Squyres (jsquyres) via devel
The PMIx community took a different approach than Open MPI: they chose not to 
embed local copies of hwloc and libevent.  In Open MPI, we embed local copies 
of hwloc, libevent, and PMIx, and use those if they cannot otherwise be found 
on the local system.

We here in the Open MPI community cannot find any customers or users who are 
running MPI / HPC applications in 32 bit environments any more.

You posted here on the devel list at just about the same time (last week) 
that the Debian community pushed back on us for disabling 32-bit builds for the 
upcoming v5.0.x series -- see the conversation that started at 
https://github.com/open-mpi/ompi/pull/11282#issuecomment-1416364660.  This is 
why I asked you if you were part of the Debian community.

The tl;dr of that lengthy discussion on the PR is: the Debian community doesn't 
have any known users running HPC apps in 32 bit environments, but they do have 
a lot of inertia from software packages that are built in 32 bit environments, 
and many of them have (optional) MPI support.  Disabling MPI for all those 
Debian packages in 32 bit environments is a lot of work for them.  They'd 
prefer if we just keep supporting 32 bit.

Hence, if you're not part of the Debian community, but feel like supporting 
them, they're the primary (and only) use case that we have for 32 bit PMIx + 
Open MPI support for v5.0.x.

From: Yatindra Vaishnav 
Sent: Monday, February 13, 2023 1:59 PM
To: Jeff Squyres (jsquyres) ; Open MPI Developers 

Subject: RE: 32 bit issues (was: 32 bit support needs a maintainer)


Ok Jeff,

I understand that openpmix is different community. I thought this is the same 
group who does the review and all. I also able to build

the OpenPMIx independently from OpenMPI but needed the hwloc and libevent code 
to compile. With Brice’ concern whatever new

features are being implement is not at all needed for 32-bit environment? And 
another point is earlier 32-bit machines are obsoleted

and not used in clusters anymore? If not then no need to maintain 32-bit and I 
can start contributing to 5.0.x onwards; if that is what

make best use of my time. (I was under impression that there 32-bit clusters 
which may need to use the latest and greatest features

available in the OpenMPI).



Let me know.



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 13, 2023 10:21 AM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; Open MPI 
Developers<mailto:devel@lists.open-mpi.org>
Subject: Re: 32 bit issues (was: 32 bit support needs a maintainer)



There were a number of compilation errors for 32 bit environments in PMIx.  I 
don't know what they are offhand, but if you get compile errors when building 
PMIx in 32 bit environments, that's the errors that need to be fixed.



Also, note that PMIx is technically a different community than Open MPI: you'll 
need to raise a PR in their repo.  The fixes will need to be applied to their 
master branch (which is what Open MPI's main branch uses) and to their v4.2.x 
branch (which is what Open MPI's v5.0.x branch uses).



Brice is raising more-or-less the same point that we have been raising here in 
Open MPI: HPC doesn't generally use 32 bit environments any more, and it's 
extra work to fix / maintain the code for environments where it is not used.  
If you're electing to do that extra work, that's fine.



From: Yatindra Vaishnav 
Sent: Monday, February 13, 2023 11:58 AM
To: Open MPI Developers ; Jeff Squyres (jsquyres) 

Subject: RE: 32 bit issues (was: 32 bit support needs a maintainer)



Hi Jeff,

Did you get a chance to look at the issue I shared? Is it the same issue? I 
fixed all compilation issues in PMIx.

Once you let me know I’ll do verification and send code for review.



--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Yatindra Vaishnav via devel<mailto:devel@lists.open-mpi.org>
Sent: Sunday, February 12, 2023 2:08 PM
To: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>; Open MPI 
Developers<mailto:devel@lists.open-mpi.org>
Cc: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>
Subject: Re: [OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)



Hi Jeff,

I created 32-bit environment and was able to see the issue in in PMIx. I’m 
attaching the screen shot here, please confirm.




Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Tuesday, February 7, 2023 2:13 PM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; Open MPI 
Developers<mailto:devel@lists.open-mpi.org>
Subject: Re: 32 bit issues (was: 32 bit support needs a maintainer)



Be su

Re: [OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)

2023-02-13 Thread Jeff Squyres (jsquyres) via devel
There were a number of compilation errors for 32 bit environments in PMIx.  I 
don't know what they are offhand, but if you get compile errors when building 
PMIx in 32 bit environments, that's the errors that need to be fixed.

Also, note that PMIx is technically a different community than Open MPI: you'll 
need to raise a PR in their repo.  The fixes will need to be applied to their 
master branch (which is what Open MPI's main branch uses) and to their v4.2.x 
branch (which is what Open MPI's v5.0.x branch uses).

Brice is raising more-or-less the same point that we have been raising here in 
Open MPI: HPC doesn't generally use 32 bit environments any more, and it's 
extra work to fix / maintain the code for environments where it is not used.  
If you're electing to do that extra work, that's fine.

From: Yatindra Vaishnav 
Sent: Monday, February 13, 2023 11:58 AM
To: Open MPI Developers ; Jeff Squyres (jsquyres) 

Subject: RE: 32 bit issues (was: 32 bit support needs a maintainer)


Hi Jeff,

Did you get a chance to look at the issue I shared? Is it the same issue? I 
fixed all compilation issues in PMIx.

Once you let me know I’ll do verification and send code for review.



--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Yatindra Vaishnav via devel<mailto:devel@lists.open-mpi.org>
Sent: Sunday, February 12, 2023 2:08 PM
To: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>; Open MPI 
Developers<mailto:devel@lists.open-mpi.org>
Cc: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>
Subject: Re: [OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)



Hi Jeff,

I created 32-bit environment and was able to see the issue in in PMIx. I’m 
attaching the screen shot here, please confirm.




Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Tuesday, February 7, 2023 2:13 PM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; Open MPI 
Developers<mailto:devel@lists.open-mpi.org>
Subject: Re: 32 bit issues (was: 32 bit support needs a maintainer)



Be sure to see 
https://github.com/open-mpi/ompi/pull/11282#issuecomment-1421523518.





From: Jeff Squyres (jsquyres) 
Sent: Tuesday, February 7, 2023 4:07 PM
To: Yatindra Vaishnav ; Open MPI Developers 

Subject: 32 bit issues (was: 32 bit support needs a maintainer)



Just try to build Open PMIx -- by itself, not as embedded in Open MPI -- in 32 
bit mode and you'll see the compile failures.



I think that PMIx's configure has an "if building in 32 bit mode, print a 
message that this is not supported and abort" check (just like Open MPI).  You will 
need to remove that check in order to be able to compile Open PMIx in 32 bit 
mode.







From: Yatindra Vaishnav 
Sent: Tuesday, February 7, 2023 4:02 PM
To: Open MPI Developers ; Jeff Squyres (jsquyres) 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer



Hi Jeff,

Where can I see the bugs of 32-bit support on OpenPMIx? I see this link:

Memory Leaks · Issue #1276 · openpmix/openpmix 
(github.com)<https://github.com/openpmix/openpmix/issues>



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Yatindra Vaishnav via devel<mailto:devel@lists.open-mpi.org>
Sent: Monday, February 6, 2023 1:27 PM
To: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Cc: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Sure Jeff, Let me take a look.





From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 1:03 PM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  OpenPMIx is the "bottom" of the stack of Open MPI --> PRRTE --> OpenPMIx, 
so fixing the 32 bit issues there first is what makes sense -- see 
https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/required-support-libraries.html#library-dependencies.



We have been trying to get Open MPI v5.0.0 out for quite a while, and seem to 
actually be getting closer to getting over the finish line.  So getting these 
fixes in sooner rather than later would be better.



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 2:56 PM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer



Yes Jeff, I can give roughly 5-8 hours a week. And yes I can take care of 
OpenPMIx bugs first and will look into others afterwards?



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986&

Re: [OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)

2023-02-07 Thread Jeff Squyres (jsquyres) via devel
Be sure to see 
https://github.com/open-mpi/ompi/pull/11282#issuecomment-1421523518.


From: Jeff Squyres (jsquyres) 
Sent: Tuesday, February 7, 2023 4:07 PM
To: Yatindra Vaishnav ; Open MPI Developers 

Subject: 32 bit issues (was: 32 bit support needs a maintainer)

Just try to build Open PMIx -- by itself, not as embedded in Open MPI -- in 32 
bit mode and you'll see the compile failures.

I think that PMIx's configure has an "if building in 32 bit mode, print a 
message that this is not supported and abort" check (just like Open MPI).  You will 
need to remove that check in order to be able to compile Open PMIx in 32 bit 
mode.



From: Yatindra Vaishnav 
Sent: Tuesday, February 7, 2023 4:02 PM
To: Open MPI Developers ; Jeff Squyres (jsquyres) 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer


Hi Jeff,

Where can I see the bugs of 32-bit support on OpenPMIx? I see this link:

Memory Leaks · Issue #1276 · openpmix/openpmix 
(github.com)<https://github.com/openpmix/openpmix/issues>



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Yatindra Vaishnav via devel<mailto:devel@lists.open-mpi.org>
Sent: Monday, February 6, 2023 1:27 PM
To: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Cc: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Sure Jeff, Let me take a look.





From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 1:03 PM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  OpenPMIx is the "bottom" of the stack of Open MPI --> PRRTE --> OpenPMIx, 
so fixing the 32 bit issues there first is what makes sense -- see 
https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/required-support-libraries.html#library-dependencies.



We have been trying to get Open MPI v5.0.0 out for quite a while, and seem to 
actually be getting closer to getting over the finish line.  So getting these 
fixes in sooner rather than later would be better.



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 2:56 PM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer



Yes Jeff, I can give roughly 5-8 hours a week. And yes I can take care of 
OpenPMIx bugs first and will look into others afterwards?



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 11:48 AM
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  What kind of timeframe do you have to work on this?  Will you be able to 
look into the OpenPMIx 32-bit bugs in the immediate future, and then start 
testing with PRRTE and Open MPI?



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 10:57 AM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Hi Jeff,

Nice to your response. And yes I responded to the same mail. I registered 
myself for OpenMPI development community. And I'm aware about the 
responsibilities. And I can setup a 32-bit VMs to do bug fixes and testing. I 
already have a server machine which I can help with.





Get Outlook for iOS<https://aka.ms/o0ukef>



From: Jeff Squyres (jsquyres) 
Sent: Monday, February 6, 2023 7:45 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Greetings Yatindra; thanks for responding.



Just curious: are you replying in response to the discussion that just came up 
a few days ago on https://github.com/open-mpi/ompi/pull/11282, where we set the 
upcoming Open MPI v5.0's configure script to abort in 32-bit environments?  
I.e., are you part of the Debian community?



Regardless, there are several aspects of 32 bit support that are needed:

  *   Bug fixes in OpenPMIx
  *   Potential bug fixes in PRRTE
  *   Potential bug fixes in Open MPI
  *   Testing of all 3 of the above

Specifically: we know that there are 32-bit bugs in OpenPMIx that need to be 
resolved.  As such, we stopped testing 32 bit in all 3 projects quite a while 
ago.  Hence, even when 32-bit support is re-enabled in OpenPMIx, other 32-bit 
bugs may surface in PRRTE and Open MPI that require fixing.



There will also need to be continual testing of all 3 of these in 32-bit 
environments.  This likely doesn't mean writing new tests, but rather running 
at least some subset of our existing tests in 32 bit environments to ensure 
that everything works properly.



Is this something that you would be able to provi

[OMPI devel] 32 bit issues (was: 32 bit support needs a maintainer)

2023-02-07 Thread Jeff Squyres (jsquyres) via devel
Just try to build Open PMIx -- by itself, not as embedded in Open MPI -- in 32 
bit mode and you'll see the compile failures.

I think that PMIx's configure has an "if building in 32 bit mode, print a 
message that this is not supported and abort" check (just like Open MPI).  You will 
need to remove that check in order to be able to compile Open PMIx in 32 bit 
mode.



From: Yatindra Vaishnav 
Sent: Tuesday, February 7, 2023 4:02 PM
To: Open MPI Developers ; Jeff Squyres (jsquyres) 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer


Hi Jeff,

Where can I see the bugs of 32-bit support on OpenPMIx? I see this link:

Memory Leaks · Issue #1276 · openpmix/openpmix 
(github.com)<https://github.com/openpmix/openpmix/issues>



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Yatindra Vaishnav via devel<mailto:devel@lists.open-mpi.org>
Sent: Monday, February 6, 2023 1:27 PM
To: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Cc: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Sure Jeff, Let me take a look.





From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 1:03 PM
To: Yatindra Vaishnav<mailto:yatindr...@hotmail.com>; 
devel@lists.open-mpi.org<mailto:devel@lists.open-mpi.org>
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  OpenPMIx is the "bottom" of the stack of Open MPI --> PRRTE --> OpenPMIx, 
so fixing the 32 bit issues there first is what makes sense -- see 
https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/required-support-libraries.html#library-dependencies.



We have been trying to get Open MPI v5.0.0 out for quite a while, and seem to 
actually be getting closer to getting over the finish line.  So getting these 
fixes in sooner rather than later would be better.



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 2:56 PM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer



Yes Jeff, I can give roughly 5-8 hours a week. And yes I can take care of 
OpenPMIx bugs first and will look into others afterwards?



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 11:48 AM
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  What kind of timeframe do you have to work on this?  Will you be able to 
look into the OpenPMIx 32-bit bugs in the immediate future, and then start 
testing with PRRTE and Open MPI?



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 10:57 AM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Hi Jeff,

Nice to your response. And yes I responded to the same mail. I registered 
myself for OpenMPI development community. And I'm aware about the 
responsibilities. And I can setup a 32-bit VMs to do bug fixes and testing. I 
already have a server machine which I can help with.





Get Outlook for iOS<https://aka.ms/o0ukef>



From: Jeff Squyres (jsquyres) 
Sent: Monday, February 6, 2023 7:45 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Greetings Yatindra; thanks for responding.



Just curious: are you replying in response to the discussion that just came up 
a few days ago on https://github.com/open-mpi/ompi/pull/11282, where we set the 
upcoming Open MPI v5.0's configure script to abort in 32-bit environments?  
I.e., are you part of the Debian community?



Regardless, there are several aspects of 32 bit support that are needed:

  *   Bug fixes in OpenPMIx
  *   Potential bug fixes in PRRTE
  *   Potential bug fixes in Open MPI
  *   Testing of all 3 of the above

Specifically: we know that there are 32-bit bugs in OpenPMIx that need to be 
resolved.  As such, we stopped testing 32 bit in all 3 projects quite a while 
ago.  Hence, even when 32-bit support is re-enabled in OpenPMIx, other 32-bit 
bugs may surface in PRRTE and Open MPI that require fixing.



There will also need to be continual testing of all 3 of these in 32-bit 
environments.  This likely doesn't mean writing new tests, but rather running 
at least some subset of our existing tests in 32 bit environments to ensure 
that everything works properly.



Is this something that you would be able to provide?



From: devel  on behalf of Yatindra Vaishnav 
via devel 
Sent: Sunday, February 5, 2023 2:32 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: [OMPI devel] 32 bit support needs a maintainer



Hi All,

I’m Yatindra I would like to maintain this 32-bit support. I know I did c

Re: [OMPI devel] 32 bit support needs a maintainer

2023-02-06 Thread Jeff Squyres (jsquyres) via devel
Ok.  OpenPMIx is the "bottom" of the stack of Open MPI --> PRRTE --> OpenPMIx, 
so fixing the 32 bit issues there first is what makes sense -- see 
https://docs.open-mpi.org/en/v5.0.x/installing-open-mpi/required-support-libraries.html#library-dependencies.

We have been trying to get Open MPI v5.0.0 out for quite a while, and seem to 
actually be getting closer to getting over the finish line.  So getting these 
fixes in sooner rather than later would be better.

From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 2:56 PM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: RE: [OMPI devel] 32 bit support needs a maintainer


Yes Jeff, I can give roughly 5-8 hours a week. And yes I can take care of 
OpenPMIx bugs first and will look into others afterwards?



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows



From: Jeff Squyres (jsquyres)<mailto:jsquy...@cisco.com>
Sent: Monday, February 6, 2023 11:48 AM
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Ok.  What kind of timeframe do you have to work on this?  Will you be able to 
look into the OpenPMIx 32-bit bugs in the immediate future, and then start 
testing with PRRTE and Open MPI?



From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 10:57 AM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Hi Jeff,

Nice to your response. And yes I responded to the same mail. I registered 
myself for OpenMPI development community. And I'm aware about the 
responsibilities. And I can setup a 32-bit VMs to do bug fixes and testing. I 
already have a server machine which I can help with.





Get Outlook for iOS<https://aka.ms/o0ukef>



From: Jeff Squyres (jsquyres) 
Sent: Monday, February 6, 2023 7:45 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: Re: [OMPI devel] 32 bit support needs a maintainer



Greetings Yatindra; thanks for responding.



Just curious: are you replying in response to the discussion that just came up 
a few days ago on https://github.com/open-mpi/ompi/pull/11282, where we set the 
upcoming Open MPI v5.0's configure script to abort in 32-bit environments?  
I.e., are you part of the Debian community?



Regardless, there are several aspects of 32 bit support that are needed:

  *   Bug fixes in OpenPMIx
  *   Potential bug fixes in PRRTE
  *   Potential bug fixes in Open MPI
  *   Testing of all 3 of the above

Specifically: we know that there are 32-bit bugs in OpenPMIx that need to be 
resolved.  As such, we stopped testing 32 bit in all 3 projects quite a while 
ago.  Hence, even when 32-bit support is re-enabled in OpenPMIx, other 32-bit 
bugs may surface in PRRTE and Open MPI that require fixing.



There will also need to be continual testing of all 3 of these in 32-bit 
environments.  This likely doesn't mean writing new tests, but rather running 
at least some subset of our existing tests in 32 bit environments to ensure 
that everything works properly.



Is this something that you would be able to provide?



From: devel  on behalf of Yatindra Vaishnav 
via devel 
Sent: Sunday, February 5, 2023 2:32 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: [OMPI devel] 32 bit support needs a maintainer



Hi All,

I’m Yatindra I would like to maintain this 32-bit support. I know I did come 
late in the party but would love to help for this.



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows






Re: [OMPI devel] 32 bit support needs a maintainer

2023-02-06 Thread Jeff Squyres (jsquyres) via devel
Ok.  What kind of timeframe do you have to work on this?  Will you be able to 
look into the OpenPMIx 32-bit bugs in the immediate future, and then start 
testing with PRRTE and Open MPI?

From: Yatindra Vaishnav 
Sent: Monday, February 6, 2023 10:57 AM
To: Jeff Squyres (jsquyres) ; devel@lists.open-mpi.org 

Subject: Re: [OMPI devel] 32 bit support needs a maintainer

Hi Jeff,
Nice to your response. And yes I responded to the same mail. I registered 
myself for OpenMPI development community. And I'm aware about the 
responsibilities. And I can setup a 32-bit VMs to do bug fixes and testing. I 
already have a server machine which I can help with.


Get Outlook for iOS<https://aka.ms/o0ukef>

From: Jeff Squyres (jsquyres) 
Sent: Monday, February 6, 2023 7:45 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: Re: [OMPI devel] 32 bit support needs a maintainer

Greetings Yatindra; thanks for responding.

Just curious: are you replying in response to the discussion that just came up 
a few days ago on https://github.com/open-mpi/ompi/pull/11282, where we set the 
upcoming Open MPI v5.0's configure script to abort in 32-bit environments?  
I.e., are you part of the Debian community?

Regardless, there are several aspects of 32 bit support that are needed:

  *   Bug fixes in OpenPMIx
  *   Potential bug fixes in PRRTE
  *   Potential bug fixes in Open MPI
  *   Testing of all 3 of the above

Specifically: we know that there are 32-bit bugs in OpenPMIx that need to be 
resolved.  As such, we stopped testing 32 bit in all 3 projects quite a while 
ago.  Hence, even when 32-bit support is re-enabled in OpenPMIx, other 32-bit 
bugs may surface in PRRTE and Open MPI that require fixing.

There will also need to be continual testing of all 3 of these in 32-bit 
environments.  This likely doesn't mean writing new tests, but rather running 
at least some subset of our existing tests in 32 bit environments to ensure 
that everything works properly.

Is this something that you would be able to provide?

From: devel  on behalf of Yatindra Vaishnav 
via devel 
Sent: Sunday, February 5, 2023 2:32 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: [OMPI devel] 32 bit support needs a maintainer


Hi All,

I’m Yatindra I would like to maintain this 32-bit support. I know I did come 
late in the party but would love to help for this.



Regards,

--Yatindra.



Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows




Re: [OMPI devel] 32 bit support needs a maintainer

2023-02-06 Thread Jeff Squyres (jsquyres) via devel
Greetings Yatindra; thanks for responding.

Just curious: are you replying in response to the discussion that just came up 
a few days ago on https://github.com/open-mpi/ompi/pull/11282, where we set the 
upcoming Open MPI v5.0's configure script to abort in 32-bit environments?  
I.e., are you part of the Debian community?

Regardless, there are several aspects of 32 bit support that are needed:

  *   Bug fixes in OpenPMIx
  *   Potential bug fixes in PRRTE
  *   Potential bug fixes in Open MPI
  *   Testing of all 3 of the above

Specifically: we know that there are 32-bit bugs in OpenPMIx that need to be 
resolved.  As such, we stopped testing 32 bit in all 3 projects quite a while 
ago.  Hence, even when 32-bit support is re-enabled in OpenPMIx, other 32-bit 
bugs may surface in PRRTE and Open MPI that require fixing.

There will also need to be continual testing of all 3 of these in 32-bit 
environments.  This likely doesn't mean writing new tests, but rather running 
at least some subset of our existing tests in 32 bit environments to ensure 
that everything works properly.

Is this something that you would be able to provide?

From: devel  on behalf of Yatindra Vaishnav 
via devel 
Sent: Sunday, February 5, 2023 2:32 AM
To: devel@lists.open-mpi.org 
Cc: Yatindra Vaishnav 
Subject: [OMPI devel] 32 bit support needs a maintainer


Hi All,

I’m Yatindra I would like to maintain this 32-bit support. I know I did come 
late in the party but would love to help for this.



Regards,

--Yatindra.



Sent from Mail for Windows




[OMPI devel] 32 bit support needs a maintainer

2023-01-10 Thread Jeff Squyres (jsquyres) via devel
OMPI Developers --

Per https://github.com/open-mpi/ompi/pull/11282, the plan for Open MPI v5.0.x 
is to no longer support 32-bit builds.  configure will abort the build if it 
detects that sizeof(void*)==4.

The reason for this is that there is no one maintaining (or testing) 32 bit 
builds.  After some internal discussion, we're unaware of anyone -- maintainer 
or customer/user -- who cares about MPI for 32 bit platforms any more.  Heck, 
even people doing experimental HPC clusters on Raspberry Pi (and similar) 
platforms are 64 bit these days.

Unless someone steps up to test and maintain 32 bit platforms before Friday, 
we're going to merge #11282 and disable 32-bit builds for v5.0.x.

To be clear: we don't plan to remove the 32-bit infrastructure in the code base 
for a while.  It may take a while before someone who cares about 32 bit notices 
that Open MPI has gone in this direction.  So we'll wait for some period of 
time after 5.0.0 has been released before removing any 32-bit infrastructure 
from Open MPI's code base.  But we won't be changing our behavior, either: no 
one has been testing or maintaining 32-bit builds for quite some time.

--
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI devel] How do you generate your FAQ pages?

2022-12-21 Thread Jeff Squyres (jsquyres) via devel
Are you asking about the old FAQ at https://www.open-mpi.org/faq/?

Or are you asking about the new docs for the upcoming v5.0.x (and beyond) at 
https://docs.open-mpi.org/?

The old FAQ is basically hand-written PHP.  When we started the FAQ, we 
couldn't find any suitable tools/libraries to do a FAQ in the context of our 
existing web site, so we wrote a handful of helper PHP functions to render all 
the FAQ questions/answers.  It supports a fairly primitive / limited wiki-style 
markup for the answers.  Over time, it has definitely shown its age and lack of 
features.  It has no Open MPI version separation for the answers, which makes 
organization difficult, especially if the answer to a given FAQ topic changes 
with different versions of Open MPI.

It has other unattractive aspects, too, in that Open MPI's docs are spread 
across multiple sources: the README, INSTALL, HACKING files, nroff (!!) man 
pages, the FAQ, the "Getting help" page on the web site, ... (and possibly some 
other places I'm not remembering off the top of my head).

That being said, you're welcome to steal the PHP from the current web site and 
adapt it to your needs.

For >= v5.0.0, we consolidated all the docs into a tree of ReStructuredText 
(RST) files under docs/ in the main git repo.  RST is like Markdown on 
steroids.  It's nowhere near as complicated (or powerful) as LaTeX, for 
example, but it's definitely more capable than Markdown (e.g., you can include 
files, have actual cross-references, use macros, etc.).  See 
https://docs.open-mpi.org/en/v5.0.x/developers/rst-for-markdown-expats.html for 
the RST guide we provide to the OMPI developers.
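
As a tiny, made-up taste of those RST features (the file name and labels below 
are invented; the directives themselves are standard docutils/Sphinx):

.. _building-ompi:

Building Open MPI
=================

.. include:: common-prerequisites.rst

Later text can link back with :ref:`building-ompi`, or to a whole page with
:doc:`/installing-open-mpi/quickstart`.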

These RST docs are rendered by the Sphinx tool into HTML and nroff for the man 
pages.  The HTML is self-contained, so it can be hosted on a web site or viewed 
locally in a web browser with no web server (and is therefore included in the 
v5.0.x tarballs).  The ReadTheDocs.io site does free hosting of RST/Sphinx 
docs, and has excellent integration with Github (e.g., it renders pull requests 
into sandbox areas, supports per-branch and tag [i.e., per version and series] 
docs, and automatically re-renders a branch's HTML whenever a PR is merged).

We're not 100% complete in converting all the old content to RST yet -- there 
are still a few FAQ sections in "FAQ" format (that content will eventually be 
folded into the same style as the rest of the documentation), and we still need 
some formal documentation of PRRTE and PMIx, but PR's for the beginnings of 
those docs are pending.

If you're looking at doing documentation for some project (FAQ-style or 
otherwise), I would recommend using RST/Sphinx/ReadTheDocs.io (vs. hand-coding 
some janky PHP solution on your own).  I've been very happy with it.

--
Jeff Squyres
jsquy...@cisco.com

From: devel  on behalf of Paul H. Hargrove 
via devel 
Sent: Tuesday, December 20, 2022 7:09 PM
To: Open MPI Developers 
Cc: Paul H. Hargrove ; Dan Bonachea 
Subject: [OMPI devel] How do you generate your FAQ pages?

Sorry for the somewhat off-topic question:
What tool(s) are you using to generate web pages for your wonderfully organized 
FAQ?

-Paul

--
Paul H. Hargrove mailto:phhargr...@lbl.gov>>
Pronouns: he, him, his
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department
Lawrence Berkeley National Laboratory


[OMPI devel] No Tuesday meeting on 22 Nov

2022-11-25 Thread Jeff Squyres (jsquyres) via devel
Due to the US Thanksgiving holiday week, we will not have the regular weekly 
Tuesday Open MPI webex next Tuesday (22 Nov).

--
Jeff Squyres
jsquy...@cisco.com


[OMPI devel] Regularly scheduled Open MPI webex next Tuesday

2022-11-10 Thread Jeff Squyres (jsquyres) via devel
Gentle reminder: even though SC is occurring next week, we'll still have the 
regular Tuesday Open MPI webex on 15 Nov (at the regular time -- 11am US 
Eastern -- and with the regular Webex link).

--
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI devel] Fwd: --mca btl_base_verbose 30 not working in version 5.0

2022-11-07 Thread Jeff Squyres (jsquyres) via devel
Sorry; I missed that this email came in a week ago.  

The "btl_base_verbose" MCA param only works on the BTL components.  The Linux 
"hostname(1)" command is not an MPI application, and therefore does not utilize 
any of the BTL components.  Hence, you can set btl_base_verbose to whatever you 
want, but it'll be ignored by non-MPI applications (but is harmless).
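
In other words, to see the BTL verbosity you need to launch an actual MPI 
executable -- for example (hypothetical command line; substitute any MPI 
program, e.g. ring_c from Open MPI's examples/ directory):

mpirun --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 2 --machinefile hostfile ./ring_c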

--
Jeff Squyres
jsquy...@cisco.com

From: devel  on behalf of 龙龙 via devel 

Sent: Sunday, October 30, 2022 10:34 AM
To: devel@lists.open-mpi.org 
Cc: 龙龙 
Subject: [OMPI devel] Fwd: --mca btl_base_verbose 30 not working in version 5.0



-- Forwarded message -
发件人: mrlong mailto:mrlong...@gmail.com>>
Date: 2022年10月30日周日 22:03
Subject: --mca btl_base_verbose 30 not working in version 5.0
To: mailto:us...@lists.open-mpi.org>>



mpirun --mca btl self,sm,tcp --mca btl_base_verbose 30 -np 2 --machinefile 
hostfile  hostname

Why does this command not print the "IP addresses are routable" messages in openmpi 5.0.0.rc9?



[OMPI devel] Github CI change on "main"

2022-09-23 Thread Jeff Squyres (jsquyres) via devel
We made a change to the github actions CI on "main" this morning.

If your PR is stuck waiting for the "Check Commits" CI to run (i.e., it just 
shows a yellow dot and doesn't change), please rebase your PR to the HEAD of 
main, and then that CI should run.
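
A sketch of one way to do that rebase -- this assumes your fork's remote is 
named "origin", the open-mpi/ompi remote is named "upstream", and your PR 
branch is called "my-pr-branch":

git fetch upstream
git checkout my-pr-branch
git rebase upstream/main
git push --force-with-lease origin my-pr-branch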

Sorry for the hassle!

--
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI devel] Component configure change proposal

2022-09-06 Thread Jeff Squyres (jsquyres) via devel
My vote: yes, let's remove this functionality.

--
Jeff Squyres
jsquy...@cisco.com

From: devel  on behalf of Barrett, Brian via 
devel 
Sent: Tuesday, September 6, 2022 2:25 PM
To: Open MPI Developers 
Cc: Barrett, Brian 
Subject: [OMPI devel] Component configure change proposal


Hi all -



I filed https://github.com/open-mpi/ompi/pull/10769 this morning, proposing to 
remove two capabilities from OMPI's configure script:



  1.  The ability to take an OMPI tarball, and drop in a component source tree, 
then run configure to build OMPI with that component.
  2.  The ability to use full configure scripts (instead of configure.m4 stubs 
like most components today).



Note that both functionalities have been broken for at least 3 months in main, 
although possibly in a way that would only be somewhat broken and may actually 
result in a working build (but with many shell errors during configure).  Since 
we don’t have good tests and have provably broken this path recently, we should 
remove it unless there’s a clear user today.  So this is really a call for any 
users of this functionality to identify themselves.  If I don’t hear back by 
Sept 19, I will consider silence to be assent.



Brian


[OMPI devel] Open MPI Java MPI bindings

2022-08-09 Thread Jeff Squyres (jsquyres) via devel
During a planning meeting for Open MPI v5.0.0 today, the question came up: is 
anyone using the Open MPI Java bindings?

These bindings are not official MPI Forum bindings -- they are an Open 
MPI-specific extension.  They were added a few years ago as a result of a 
research project.

We ask this question because we're wondering if it's worthwhile to bring these 
bindings forward to the v5.0.x series, or whether we should remove them from 
v5.0.x, and just leave them available back in the v4.0.x and v4.1.x series.

Please reply here to this list if you are using the Open MPI Java bindings, or 
know of anyone who is using them.

Thank you!

--
Jeff Squyres
jsquy...@cisco.com


Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

2022-08-03 Thread Jeff Squyres (jsquyres) via devel
Glad you solved the first issue!

With respect to debugging, if you don't have a parallel debugger, you can do 
something like this: 
https://www.open-mpi.org/faq/?category=debugging#serial-debuggers

If you haven't done so already, I highly suggest configuring Open MPI with 
"CFLAGS=-g -O0".

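One common pattern from that FAQ is to park each process in a loop that you 
release once a debugger is attached -- roughly like this (the variable and 
environment variable names are just illustrative):

    /* needs <stdio.h>, <stdlib.h>, and <unistd.h> */
    volatile int holdit = 1;
    if (NULL != getenv("WAIT_FOR_DEBUGGER")) {
        printf("PID %d waiting for debugger attach\n", (int) getpid());
        while (holdit) {
            sleep(1);
        }
    }
    /* then: "gdb -p <pid>", "set var holdit = 0", and continue */
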
As for the modex, it does actually use TCP under the covers, but that shouldn't 
matter to you: the main point is that the BTL is not used for exchanging modex 
information.  Hence, whatever your BTL module puts into the modex and gets out 
of the modex should happen asynchronously without involving the BTL.

--
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Michele Martinelli 
via devel 
Sent: Wednesday, August 3, 2022 12:49 PM
To: devel@lists.open-mpi.org
Cc: Michele Martinelli
Subject: Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC 
under development

thank you for the answer. Actually, I think I solved that problem some
days ago: basically (if I understand correctly), MPI "adds" a header to the
data sent (please correct me if I'm wrong), which is then used by ob1 to match
the arriving data with the MPI_Recv posted by the user. The problem was a
poorly reconstructed header on the receiving side.

Unfortunately, my happiness didn't last long, because I have already found
another problem: it seems that the peers are not actually exchanging the
correct information via the modex protocol (I'm not sure which kind of
network connection they are using in that phase) -- each side receives "local"
data instead of the remote data. I just started debugging this; maybe I
should open a new thread specifically about it.

Michele

Il 03/08/22 15:43, Jeff Squyres (jsquyres) ha scritto:
> Sorry for the huge delay in replies -- it's summer / vacation season, and I 
> think we (as a community) are a little behind in answering some of these 
> emails.  :-(
>
> It's been quite a while since I have been in the depths of BTL internals; I'm 
> afraid I don't remember the details offhand.
>
> When I was writing the usnic BTL, I know I found it useful to attach a 
> debugger on the sending and/or receiving side processes, and actually step 
> through both my BTL code and the OB1 PML code to see what was happening.  I 
> frequently found that either my BTL wasn't correctly accounting for network 
> conditions, or it wasn't passing information up to OB1 that it expected 
> (e.g., it passed the wrong length, or the wrong ID number, or ...something 
> else).  You can actually follow what happens in OB1 when your BTL invokes the 
> cbfunc -- does it find a corresponding MPI_Request, and does it mark it 
> complete?  Or does it put your incoming fragment as an unexpected message for 
> some reason, and put it on the unexpected queue?  Look for that kind of stuff.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> 
> From: devel  on behalf of Michele 
> Martinelli via devel 
> Sent: Saturday, July 23, 2022 9:04 AM
> To: devel@lists.open-mpi.org
> Cc: Michele Martinelli
> Subject: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under 
> development
>
> Hi,
>
> I'm trying to develop a btl for a custom NIC. I studied the btl.h file
> to understand the flow of calls that are expected to be implemented in
> my component. I'm using a simple test (which works like a charm with the
> TCP btl) to test my development, the code is a simple MPI_Send + MPI_Recv:
>
> MPI_Init(NULL, NULL);
> int world_rank;
> MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
> int world_size;
> MPI_Comm_size(MPI_COMM_WORLD, &world_size);
> int ping_pong_count = 1;
> int partner_rank = (world_rank + 1) % 2;
> printf("MY RANK: %d PARTNER: %d\n", world_rank, partner_rank);
> if (world_rank == 0) {
>     ping_pong_count++;
>     MPI_Send(&ping_pong_count, 1, MPI_INT, partner_rank, 0, MPI_COMM_WORLD);
>     printf("%d sent and incremented ping_pong_count %d to %d\n",
>            world_rank, ping_pong_count, partner_rank);
> } else {
>     MPI_Recv(&ping_pong_count, 1, MPI_INT, partner_rank, 0, MPI_COMM_WORLD,
>              MPI_STATUS_IGNORE);
>     printf("%d received ping_pong_count %d from %d\n",
>            world_rank, ping_pong_count, partner_rank);
> }
> MPI_Finalize();
>
> I see that in my component's btl code the functions called during the
> "MPI_send" phase are:
>
>1. mca_btl_mycomp_add_procs
>2. mca_btl_mycomp_prepare_src
>3. mca_btl_mycomp_send (where I set the return to 1, so the send phase
>   should be finished)
>
> I see then the print inside the test:

Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

2022-08-03 Thread Jeff Squyres (jsquyres) via devel
Sorry for the huge delay in replies -- it's summer / vacation season, and I 
think we (as a community) are a little behind in answering some of these 
emails.  :-(

It's been quite a while since I have been in the depths of BTL internals; I'm 
afraid I don't remember the details offhand.

When I was writing the usnic BTL, I know I found it useful to attach a debugger 
on the sending and/or receiving side processes, and actually step through both 
my BTL code and the OB1 PML code to see what was happening.  I frequently found 
that either my BTL wasn't correctly accounting for network conditions, or it 
wasn't passing information up to OB1 that it expected (e.g., it passed the 
wrong length, or the wrong ID number, or ...something else).  You can actually 
follow what happens in OB1 when your BTL invokes the cbfunc -- does it find a 
corresponding MPI_Request, and does it mark it complete?  Or does it put your 
incoming fragment as an unexpected message for some reason, and put it on the 
unexpected queue?  Look for that kind of stuff.

-- 
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Michele Martinelli 
via devel 
Sent: Saturday, July 23, 2022 9:04 AM
To: devel@lists.open-mpi.org
Cc: Michele Martinelli
Subject: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under 
development

Hi,

I'm trying to develop a btl for a custom NIC. I studied the btl.h file
to understand the flow of calls that are expected to be implemented in
my component. I'm using a simple test (which works like a charm with the
TCP btl) to test my development, the code is a simple MPI_Send + MPI_Recv:

    MPI_Init(NULL, NULL);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    int ping_pong_count = 1;
    int partner_rank = (world_rank + 1) % 2;
    printf("MY RANK: %d PARTNER: %d\n", world_rank, partner_rank);
    if (world_rank == 0) {
        ping_pong_count++;
        MPI_Send(&ping_pong_count, 1, MPI_INT, partner_rank, 0, MPI_COMM_WORLD);
        printf("%d sent and incremented ping_pong_count %d to %d\n",
               world_rank, ping_pong_count, partner_rank);
    } else {
        MPI_Recv(&ping_pong_count, 1, MPI_INT, partner_rank, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("%d received ping_pong_count %d from %d\n",
               world_rank, ping_pong_count, partner_rank);
    }
    MPI_Finalize();

I see that in my component's btl code the functions called during the
"MPI_send" phase are:

  1. mca_btl_mycomp_add_procs
  2. mca_btl_mycomp_prepare_src
  3. mca_btl_mycomp_send (where I set the return to 1, so the send phase
 should be finished)

I see then the print inside the test:

 0 sent and incremented ping_pong_count 2 to 1

and this should conclude the MPI_Send phase.
Then I implemented in the btl_mycomp_component_progress function a call to:

 mca_btl_active_message_callback_t *reg =
mca_btl_base_active_message_trigger + tag;
 reg->cbfunc(_btl->super, );

I saw the same code in all the other BTLs and I thought this was enough
to "unlock" the MPI_Recv "polling". But actually I see my test hangs,
probably "waiting" for something that never happens (?).

I also took a look in the ob1 mca_pml_ob1_recv_frag_callback_match
function (which I suppose to be the reg->cbfunc), and it seems to get to
the end of the function, actually matching my frag.

So my question is: how can I say to the framework that I finished my
work and so the function can return to the user application? What am I
doing wrong?
Is there a way to understand where and what my code is waiting for?


Best



Re: [OMPI devel] Rationale behind memcpy chunk size (in smsc/xpmem)

2022-08-03 Thread Jeff Squyres (jsquyres) via devel
Sorry for the delay in replies -- it's summer / vacation season, and I think we 
(as a community) are a little behind in answering some of these emails.  :-(

It's hard to say for any given machine, but a bunch of different hardware 
factors can come into play, such as:

- L1, L2, L3 cache sizes
- Cache contention
- Memory controller connectivity and locality

I.e., exactly which hardware resources are the memcpy()'s in question using, 
and how do they interact with each other?  How much overhead is produced, 
and/or how much contention ensues when multiple requests are in flight 
simultaneously?  For example, it may be counter-intuitive, but sometimes 
injecting a small amount of delay in a software pipeline can allow hardware 
resources to not become overwhelmed, and therefore the overall execution 
becomes more efficient, and therefore consumes less wall-clock execution time.  
Hence, doing 2 x 1MB memcpy()'s (to effect a 2MB MPI_Send) may actually be 
overall more efficient, even though the individual parts of the transaction are 
less efficient.  This is a complete guess, and may have nothing to do with your 
system, but it's one of many possibilities.

Another possible factor: the specific memcpy() implementation is highly 
relevant.  It's been a few years since I've paid close attention to memcpy(), 
but at one time, there was significant variation in the quality of memcpy() 
implementations between different compilers and/or versions of libc.  I don't 
know if this is still a factor, or whether memcpy() is pretty well optimized in 
most situations these days.  Additionally, alignment can be an issue (although 
for message sizes of 2MB, I'm guessing your buffer is page-aligned, and this 
probably isn't an issue).

All that being said, I'm not intimately familiar with the internals of XPMEM, 
so I don't know what userspace/kernel space mechanisms will come into play for 
mapping the shared memory (e.g., is it lazily mapping the shared memory?).

Also, you're probably doing this already, but these kinds of things are worth 
mentioning: make sure your performance benchmarks are testing the right things: 
do warmup transfers, make sure you're not swapping, make sure all the processes 
and memory are pinned properly, make sure you're on an otherwise-quiet machine, 
... etc.  All the Usual Benchmarking Things.

--
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Giorgos Katevainis 
via devel 
Sent: Thursday, July 28, 2022 9:33 AM
To: Open MPI Developers
Cc: Giorgos Katevainis
Subject: [OMPI devel] Rationale behind memcpy chunk size (in smsc/xpmem)

Hello all,

I've come across the "memcpy_chunk_size" MCA parameter in smsc/xpmem, which 
effectively causes
memory copies to take place in chunks (used in mca_smsc_xpmem_memmove()). The 
comment reads:

"Maximum size to copy with a single call to memcpy. On some systems a smaller 
or larger number may
provide better performance (default: 256k)"
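
(For reference, the chunking that this parameter controls is conceptually just a 
loop like the following -- an illustrative sketch, not the actual 
mca_smsc_xpmem_memmove() source:)

    /* needs <stddef.h> and <string.h>; copy len bytes in chunks of at most chunk bytes */
    static void chunked_copy(void *dst, const void *src, size_t len, size_t chunk)
    {
        char *d = (char *) dst;
        const char *s = (const char *) src;
        while (len > 0) {
            size_t c = (len < chunk) ? len : chunk;
            memcpy(d, s, c);
            d += c;
            s += c;
            len -= c;
        }
    }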

And I have indeed observed a performance difference by adjusting it! E.g. in a 
simple point-to-point
test, 2 MB messages do significantly better with the parameter set to 1 MB vs 2 
MB. But... why? I
suppose I could imagine a memcpy of larger size being more efficient, but what 
would cause many
small ones to end up being quicker than a single large one? Might it have 
something to do with
memcpy intrinsics and different implementation for different sizes?

If someone knows what's going on under the hood and/or could direct me to any 
relevant resources, I
would greatly appreciate it!

George


[OMPI devel] Passing of an MPI luminary: Rusty Lusk

2022-05-23 Thread Jeff Squyres (jsquyres) via devel
In case you had not heard, Dr. Ewing "Rusty" Lusk passed away at age 78 last 
week.  Rusty was one of the founders and prime movers of the entire MPI 
ecosystem: the MPI Forum, the MPI standard, and MPICH.  Without Rusty, our 
community would not exist.  In addition to all of that, he was an all-around 
great guy: he was a thoughtful scientist and engineer, a kind mentor, and a 
genuinely nice guy.  Rusty was on my Ph.D. committee, and I was fortunate 
enough to work with him on a few projects over the years.

Thank you for everything, Rusty.

https://obituaries.neptunesociety.com/obituaries/downers-grove-il/ewing-lusk-10754811/amp

-- 
Jeff Squyres
jsquy...@cisco.com

Re: [OMPI devel] What PMIx version(s) does v5.0.0 and main support?

2022-05-04 Thread Jeff Squyres (jsquyres) via devel
We discussed this on the OMPI call yesterday, but I am not knowledgeable about 
what the consequences of only supporting building Open MPI main and v5.x against 
PMIx v4.x are (vs. also supporting building Open MPI main+v5.x against PMIx 
v3.2.x, or even PMIx v3.x).

I think we might need to talk to the PMIx community and find out what those 
consequences are.  Then we can decide what we'd like to do here for Open MPI 
main/v5.0.x.

--
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Austen W Lauria via 
devel 
Sent: Tuesday, May 3, 2022 12:07 PM
To: Jeff Squyres (jsquyres) via devel
Cc: Austen W Lauria
Subject: [OMPI devel] What PMIx version(s) does v5.0.0 and main support?

There is a question as to what PMIx versions OMPI supports, specifically
with respect to v5.0.0 and main. This came to our attention where a user
tried to build OMPI v5 with PMIx v3.2.3, and ran into compile issues:

https://github.com/open-mpi/ompi/issues/10341

This leads to some open questions:

1. Is there a need to support v3.2.3 PMIx or lower with v5.0.0?
   - If so, these compile errors need to get stamped out. Sessions in v5.0.0 
would also be impacted,
 and will need to be disabled in this case since it relies on PMIx v4 
features not available in the v3 series.
 It would also be good to either add an MTT build and/or a CI test to make 
sure this does not break in the future.

2. Could/should OMPI compiled with a v4 PMIx work with SLURM compiled with a 
v3.2 PMIx?
 - There are some cross-version-compatibility features with PMIx, so it may 
be possible.

Finally, the PMIx version(s) as well as other required OMPI dependencies need
to be documented. There is an open issue to track that here: 
https://github.com/open-mpi/ompi/issues/10345.

Thanks


Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Jeff Squyres (jsquyres) via devel
In UCX's case, the choice is almost entirely driven by the UCX library.  You'll 
need to look at the UCX code and/or ask NVIDIA.

--
Jeff Squyres
jsquy...@cisco.com


From: Masoud Hemmatpour 
Sent: Thursday, April 21, 2022 7:57 AM
To: Jeff Squyres (jsquyres)
Cc: Open MPI Developers
Subject: Re: [OMPI devel] RDMA and OMPI implementation



Thanks again for your answer, and I hope I'm not bothering you with my questions! 
If I can ask one last question here: how can I see a complete list of such factors 
(message size, memory map, etc.)?  Is there any reading on this, or should I look 
at the code?  If the latter, could you please give me a starting point?  In the 
case of UCX and UCX-enabled network interfaces (such as IB), is it a UCX decision 
or an Open MPI decision whether or not to use RDMA?

Sorry for my long question, and thank you again!






On Thu, Apr 21, 2022 at 1:09 PM Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
It means that your underlying network transport supports RDMA.

To be clear, if you built Open MPI with UCX support, and you run on a system 
with UCX-enabled network interfaces (such as IB), Open MPI should automatically 
default to using those UCX interfaces.  This means you'll get all the benefits 
of an HPC-class networking transport (low latency, hardware offload, ... etc.).

For any given send/receive in your MPI application, in the right circumstances 
(message size, memory map, ... etc.), Open MPI will use RDMA to effect a 
network transfer.  There are many different run-time issues that will drive the 
choice of whether any individual network transfer actually uses RDMA or not.

--
Jeff Squyres
jsquy...@cisco.com


From: Masoud Hemmatpour <mashe...@gmail.com>
Sent: Thursday, April 21, 2022 2:38 AM
To: Open MPI Developers
Cc: Jeff Squyres (jsquyres)
Subject: Re: [OMPI devel] RDMA and OMPI implementation


Thank you very much for your description! Actually, I read this issue on github:

Is OpenMPI supporting RDMA?<https://github.com/open-mpi/ompi/issues/5789>

If I have IB and I install and use UCX, does this guarantee that I am using 
RDMA, or is it still not guaranteed?


Thanks again,









On Thu, Apr 21, 2022 at 12:34 AM Jeff Squyres (jsquyres) via devel <devel@lists.open-mpi.org> wrote:
Let me add a little more color to William's response.  The general theme is: it 
depends on what the underlying network provides.

Some underlying networks natively support one-sided operations like PUT / WRITE 
and GET / READ (e.g., IB/RDMA, RoCE/RDMA, ... etc.).  Some don't (like TCP).

Open MPI will adapt to use whatever transports the underlying network supports.

Additionally, the determination of whether Open MPI uses a "two sided" or "one 
sided" type of network transport operation depends on a bunch of other factors. 
 The most efficient method to get a message from sender to receive may depend 
on issues such as the size of the message, the memory map of the message, the 
current network resource utilization, the specific MPI operation, ... etc.

Also, be aware that "RDMA" commonly refers to InfiniBand-style one-sided 
operations.  So if you want to use "RDMA", you may need to use an NVIDIA-based 
network (e.g., IB or RoCE).  That's not the only type of network one-sided 
operations available, but it's common.

--
Jeff Squyres
jsquy...@cisco.com


From: devel <devel-boun...@lists.open-mpi.org> on behalf of Zhang, William via devel <devel@lists.open-mpi.org>
Sent: Wednesday, April 20, 2022 6:12 PM
To: Open MPI Developers
Cc: Zhang, William
Subject: Re: [OMPI devel] RDMA and OMPI implementation

Hello Masoud,

Responded inline

Thanks,
William

From: devel <devel-boun...@lists.open-mpi.org> on behalf of Masoud Hemmatpour via devel <devel@lists.open-mpi.org>
Reply-To: Open MPI Developers <devel@lists.open-mpi.org>
Date: Wednesday, April 20, 2022 at 5:29 AM
To: Open MPI Developers <devel@lists.open-mpi.org>
Cc: Masoud Hemmatpour <mashe...@gmail.com>

Re: [OMPI devel] RDMA and OMPI implementation

2022-04-21 Thread Jeff Squyres (jsquyres) via devel
It means that your underlying network transport supports RDMA.

To be clear, if you built Open MPI with UCX support, and you run on a system 
with UCX-enabled network interfaces (such as IB), Open MPI should automatically 
default to using those UCX interfaces.  This means you'll get all the benefits 
of an HPC-class networking transport (low latency, hardware offload, ... etc.).

For any given send/receive in your MPI application, in the right circumstances 
(message size, memory map, ... etc.), Open MPI will use RDMA to effect a 
network transfer.  There are many different run-time issues that will drive the 
choice of whether any individual network transfer actually uses RDMA or not.

--
Jeff Squyres
jsquy...@cisco.com


From: Masoud Hemmatpour 
Sent: Thursday, April 21, 2022 2:38 AM
To: Open MPI Developers
Cc: Jeff Squyres (jsquyres)
Subject: Re: [OMPI devel] RDMA and OMPI implementation


Thank you very much for your description! Actually, I read this issue on github:

Is OpenMPI supporting RDMA?<https://github.com/open-mpi/ompi/issues/5789>

If I have IB and I install and use UCX, does this guarantee that I am using 
RDMA, or is it still not guaranteed?


Thanks again,









On Thu, Apr 21, 2022 at 12:34 AM Jeff Squyres (jsquyres) via devel <devel@lists.open-mpi.org> wrote:
Let me add a little more color to William's response.  The general theme is: it 
depends on what the underlying network provides.

Some underlying networks natively support one-sided operations like PUT / WRITE 
and GET / READ (e.g., IB/RDMA, RoCE/RDMA, ... etc.).  Some don't (like TCP).

Open MPI will adapt to use whatever transports the underlying network supports.

Additionally, the determination of whether Open MPI uses a "two sided" or "one 
sided" type of network transport operation depends on a bunch of other factors. 
 The most efficient method to get a message from sender to receive may depend 
on issues such as the size of the message, the memory map of the message, the 
current network resource utilization, the specific MPI operation, ... etc.

Also, be aware that "RDMA" commonly refers to InfiniBand-style one-sided 
operations.  So if you want to use "RDMA", you may need to use an NVIDIA-based 
network (e.g., IB or RoCE).  That's not the only type of network one-sided 
operations available, but it's common.

--
Jeff Squyres
jsquy...@cisco.com


From: devel <devel-boun...@lists.open-mpi.org> on behalf of Zhang, William via devel <devel@lists.open-mpi.org>
Sent: Wednesday, April 20, 2022 6:12 PM
To: Open MPI Developers
Cc: Zhang, William
Subject: Re: [OMPI devel] RDMA and OMPI implementation

Hello Masoud,

Responded inline

Thanks,
William

From: devel <devel-boun...@lists.open-mpi.org> on behalf of Masoud Hemmatpour via devel <devel@lists.open-mpi.org>
Reply-To: Open MPI Developers <devel@lists.open-mpi.org>
Date: Wednesday, April 20, 2022 at 5:29 AM
To: Open MPI Developers <devel@lists.open-mpi.org>
Cc: Masoud Hemmatpour <mashe...@gmail.com>
Subject: [EXTERNAL] [OMPI devel] RDMA and OMPI implementation



Hello Everyone,

Sorry, MPI is quite new for me, in particular the implementation. If you don't 
mind, I have some very basic questions regarding the OMPI implementation.

If I use one-sided MPI operations (Get and Put), am I necessarily using RDMA? – It 
depends, but it’s not guaranteed. For example, in Open MPI 4.0.x, there was the 
osc/pt2pt component that implemented osc operations using send/receive. Or, for 
example, with calls to libfabric’s osc API, it depends on the implementation of 
the underlying provider.
Is it possible to have one-sided without RDMA? - Yes

In general, are other types of MPI operations, like Send/Receive or collective 
operations, implemented using RDMA? – Not necessarily. 
For example, using TCP won’t use RDMA. The underlying communication protocol 
could very well implement send/receive using RDMA, though.

How can I be sure that I am using RDMA for a specific operation? – I’m not sure 
there’s an easy way to do this; I think you have to have some understanding of 
what communication protocol you’re using and what that protocol is doing.
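
(One crude but practical way to see which components actually get selected at run 
time is to raise the verbosity of the relevant frameworks -- for example, something 
along these lines; the exact output differs between releases:)

    mpirun --mca pml_base_verbose 10 --mca btl_base_verbose 30 -np 2 ./my_mpi_app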

Thank you very much in advance for your help!
Best Regards,



--
Best Regards,
Masoud Hemmatpour, PhD



Re: [OMPI devel] RDMA and OMPI implementation

2022-04-20 Thread Jeff Squyres (jsquyres) via devel
Let me add a little more color to William's response.  The general theme is: it 
depends on what the underlying network provides.

Some underlying networks natively support one-sided operations like PUT / WRITE 
and GET / READ (e.g., IB/RDMA, RoCE/RDMA, ... etc.).  Some don't (like TCP).

Open MPI will adapt to use whatever transports the underlying network supports.

Additionally, the determination of whether Open MPI uses a "two sided" or "one 
sided" type of network transport operation depends on a bunch of other factors. 
 The most efficient method to get a message from sender to receive may depend 
on issues such as the size of the message, the memory map of the message, the 
current network resource utilization, the specific MPI operation, ... etc.

Also, be aware that "RDMA" commonly refers to InfiniBand-style one-sided 
operations.  So if you want to use "RDMA", you may need to use an NVIDIA-based 
network (e.g., IB or RoCE).  That's not the only type of network one-sided 
operations available, but it's common.

--
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Zhang, William via 
devel 
Sent: Wednesday, April 20, 2022 6:12 PM
To: Open MPI Developers
Cc: Zhang, William
Subject: Re: [OMPI devel] RDMA and OMPI implementation

Hello Masoud,

Responded inline

Thanks,
William

From: devel  on behalf of Masoud Hemmatpour 
via devel 
Reply-To: Open MPI Developers 
Date: Wednesday, April 20, 2022 at 5:29 AM
To: Open MPI Developers 
Cc: Masoud Hemmatpour 
Subject: [EXTERNAL] [OMPI devel] RDMA and OMPI implementation



Hello Everyone,

Sorry, MPI is quite new for me, in particular the implementation. If you don't 
mind, I have some very basic questions regarding the OMPI implementation.

If I use one-sided MPI operations (Get and Put), am I necessarily using RDMA? – It 
depends, but it’s not guaranteed. For example, in Open MPI 4.0.x, there was the 
osc/pt2pt component that implemented osc operations using send/receive. Or, for 
example, with calls to libfabric’s osc API, it depends on the implementation of 
the underlying provider.
Is it possible to have one-sided without RDMA? - Yes

In general, are other types of MPI operations, like Send/Receive or collective 
operations, implemented using RDMA? – Not necessarily. 
For example, using TCP won’t use RDMA. The underlying communication protocol 
could very well implement send/receive using RDMA, though.

How can I be sure that I am using RDMA for a specific operation? – I’m not sure 
there’s an easy way to do this; I think you have to have some understanding of 
what communication protocol you’re using and what that protocol is doing.

Thank you very much in advance for your help!
Best Regards,



Re: [OMPI devel] Open MPI source RPM / specfile

2022-04-11 Thread Jeff Squyres (jsquyres) via devel
A minor clarification: I am not the primary maintainer of the RPM specfile.

Patches would be welcome from other community members -- especially those 
organizations who are using it.

--
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Zhang, Wei via 
devel 
Sent: Monday, April 4, 2022 4:51 PM
To: Open MPI Developers
Cc: Zhang, Wei
Subject: Re: [OMPI devel] Open MPI source RPM / specfile

Hi Jeff,

The AWS EFA team also uses the RPM specfile to build openmpi RPM, and 
distribute the RPM. We would also prefer that you continue to maintain it.

Sincerely,

Wei Zhang


On 4/4/22, 1:47 PM, "devel on behalf of Goldman, Adam via devel" 
 wrote:




Hi Jeff,

We (Intel IEFS Team) do not use the SRPM, but we do use the RPM specfile 
and scripts to build our own RPMs.
We would prefer if you can continue to maintain this going forward.

Regards,
Adam Goldman
Intel Corporation

-Original Message-
From: devel  On Behalf Of Jeff Squyres 
(jsquyres) via devel
Sent: Friday, April 1, 2022 11:29 AM
To: Open MPI Developers 
    Cc: Jeff Squyres (jsquyres) 
Subject: [OMPI devel] Open MPI source RPM / specfile

Open question to the developer community: is anyone using the Open MPI SRPM 
that we release?

Related question: is anyone using the RPM specfile and/or scripts in 
contrib for building Open MPI RPMs?

I ask for a specific reason: we just realized we broke the RPM stuff when 
updating the v5.0.x man pages/documentation.  We can go fix the RPM stuff, 
...but is it worth it?  We made the specfile/SRPM a long time ago when Open MPI 
wasn't yet widely distributed by third parties.  It seems like downstream 
packagers have now taken over this role, and have all their own packaging 
infrastructure.  As such, I don't know if anyone is actively using / cares 
about the Open MPI specfile / SRPM.

Should we port it forward into the v5.0.x series?  Or should we delete it 
from v5.0.0 and leave it for <= v5.0.0?

--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Open MPI source RPM / specfile

2022-04-01 Thread Jeff Squyres (jsquyres) via devel
Open question to the developer community: is anyone using the Open MPI SRPM 
that we release?

Related question: is anyone using the RPM specfile and/or scripts in contrib 
for building Open MPI RPMs?

I ask for a specific reason: we just realized we broke the RPM stuff when 
updating the v5.0.x man pages/documentation.  We can go fix the RPM stuff, 
...but is it worth it?  We made the specfile/SRPM a long time ago when Open MPI 
wasn't yet widely distributed by third parties.  It seems like downstream 
packagers have now taken over this role, and have all their own packaging 
infrastructure.  As such, I don't know if anyone is actively using / cares 
about the Open MPI specfile / SRPM.
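
(For context, typical consumption of those artifacts looks something like the 
following -- the version number and working directory are illustrative:)

    # rebuild binary RPMs from the published source RPM
    rpmbuild --rebuild openmpi-4.1.2-1.src.rpm

    # or build directly from the specfile shipped under contrib/
    rpmbuild -ba openmpi.spec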

Should we port it forward into the v5.0.x series?  Or should we delete it from 
v5.0.0 and leave it for <= v5.0.0?

-- 
Jeff Squyres
jsquy...@cisco.com

Re: [OMPI devel] Renaming Open MPI's git "master" branch

2022-03-25 Thread Jeff Squyres (jsquyres) via devel
The Github branch rename has been completed.  Actions developers should take:

  1.  Update your Github fork: rename "master" to "main".  Failure to do this 
will likely lead to wailing and gnashing of teeth at some point.
 *   Navigate to your Github fork --> Settings --> Branches (e.g., 
https://github.com/jsquyres/ompi/settings/branches).
 *   Click on the pencil button to the right of the default "master" branch.
 *   Change "master" to "main" and click "Rename branch".

  2.  Update your local Git clones of the OMPI tree: for example:

git branch --move master main
git fetch origin
git branch --set-upstream-to origin/main main
git remote set-head origin -a

  3.  Update your MTT config:
 *   The simplest thing to do is to s/master/main/g in all your MTT config 
files (e.g., with the one-liner sketched just after this list).
 *   The old "master" nightly snapshot download URL will work for the next 
7 days (it will stop working Friday, 1 April 2022), where "work" means:
*   The URL will exist and HTTP GETs will not generate 404s
*   No new tarballs will be uploaded

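A sketch of that config update (the path/glob is illustrative -- point it at 
wherever your MTT .ini files live):

    sed -i 's/master/main/g' /path/to/mtt/configs/*.ini
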
Stuff developers probably want to know:

  1.  All pull requests in https://github.com/open-mpi/ompi/pulls that targeted 
"master" have been automatically re-targeted to "main"
 *   There was one exception (PR #3262).  The owner of that PR was notified.
  2.  The first snapshot tarball from the "main" branch has been created: 
https://www.open-mpi.org/nightly/main/

Please let us know if you have any questions or problems.

--
Jeff Squyres
jsquy...@cisco.com


From: Jeff Squyres (jsquyres) 
Sent: Friday, March 25, 2022 10:46 AM
To: Open MPI Developers
Subject: Re: Renaming Open MPI's git "master" branch

Gentle reminder that Brian and I will start this change in about 15 minutes.

--
Jeff Squyres
jsquy...@cisco.com


From: Jeff Squyres (jsquyres) 
Sent: Tuesday, March 22, 2022 12:30 PM
To: Open MPI Developers
Subject: Re: Renaming Open MPI's git "master" branch

This was discussed both off-list and on the weekly Open MPI dev call.  Here's 
the points that were discussed, in no particular order:

  *   We definitely need to give a heads-up to build-from-source projects like 
Spack and EasyBuild.
 *   It's not common, but it is possible to build Open MPI's master branch 
from Spack / EasyBuild.  Hence, we should notify these communities to update 
the git branch name from "master" to "main".
 *   Howard offered to open a Spack PR to update this.
 *   We'll also ping Kenneth from EasyBuild.
 *   There was some concern that we should give such communities a little 
time to adjust to the new name.  After some discussion, the consensus is that 
this is likely not necessary a) because it's not common for these tools to 
build from master, and b) it should be a quick/easy fix for them to update the 
branch name.  Let's just get it done.
  *   "git symbolic-ref refs/heads/master refs/heads/main" may be helpful for 
developers to transition themselves to the new name locally.
  *   When a branch is renamed at GitHub, some degree of automatic renaming 
will occur (e.g., surfing to 
https://github.com/open-mpi/ompi/blob/master/README.md will automatically get 
HTTP redirected).
 *   See https://github.com/github/renaming for details about what does -- 
and does not -- happen.
  *   There were some questions as to whether "main" is a good choice of branch 
name.  There's been a bunch of debate about this over the past year.
 *   Git 2.35.1 (the most recent) version still defaults to "master".
 *   Github.com defaults to "main".
 *   Other names like "dev" and "devel" were also proposed for Open MPI.
 *   My $0.02: Github defaults to "main", and that effectively represents 
what many people use.  Let's just go with that.
  *   The "ompi" repo is the first one to get this branch name change.
  *   This change is mainly being driven by Open MPI's ReadTheDocs 
initiative that publishes docs by branch name -- we didn't want URLs 
containing "/master/" getting bookmarked, linked to, ... etc.  Meaning: we have 
to change the name now, before v5.0.0 is released, and the new RTD OMPI docs 
web site goes mainstream.
 *   We will eventually also update the other Open MPI git repos for master 
--> main, but those should likely be less painful.  Stay tuned for further info.

Unless we hear any major objections, Brian and I will be making this change 
this upcoming Friday, 25 March, 2022, from 11am-1pm US Eastern time.  In 
general, the changes will be:

  *   Rename the master branch at github.
 *   Delete the "master" branch out of our personal forks (this is STRONGLY 
advised for everyone else, too!)

Re: [OMPI devel] Renaming Open MPI's git "master" branch

2022-03-25 Thread Jeff Squyres (jsquyres) via devel
Gentle reminder that Brian and I will start this change in about 15 minutes.

--
Jeff Squyres
jsquy...@cisco.com


From: Jeff Squyres (jsquyres) 
Sent: Tuesday, March 22, 2022 12:30 PM
To: Open MPI Developers
Subject: Re: Renaming Open MPI's git "master" branch

This was discussed both off-list and on the weekly Open MPI dev call.  Here's 
the points that were discussed, in no particular order:

  *   We definitely need to give a heads-up to build-from-source projects like 
Spack and EasyBuild.
 *   It's not common, but it is possible to build Open MPI's master branch 
from Spack / EasyBuild.  Hence, we should notify these communities to update 
the git branch name from "master" to "main".
 *   Howard offered to open a Spack PR to update this.
 *   We'll also ping Kenneth from EasyBuild.
 *   There was some concern that we should give such communities a little 
time to adjust to the new name.  After some discussion, the consensus is that 
this is likely not necessary a) because it's not common for these tools to 
build from master, and b) it should be a quick/easy fix for them to update the 
branch name.  Let's just get it done.
  *   "git symbolic-ref refs/heads/master refs/heads/main" may be helpful for 
developers to transition themselves to the new name locally.
  *   When a branch is renamed at GitHub, some degree of automatic renaming 
will occur (e.g., surfing to 
https://github.com/open-mpi/ompi/blob/master/README.md will automatically get 
HTTP redirected).
 *   See https://github.com/github/renaming for details about what does -- 
and does not -- happen.
  *   There were some questions as to whether "main" is a good choice of branch 
name.  There's been a bunch of debate about this over the past year.
 *   Git 2.35.1 (the most recent) version still defaults to "master".
 *   Github.com defaults to "main".
 *   Other names like "dev" and "devel" were also proposed for Open MPI.
 *   My $0.02: Github defaults to "main", and that effectively represents 
what many people use.  Let's just go with that.
  *   The "ompi" repo is the first one to get this branch name change.
  *   This change is mainly being driven by Open MPI's ReadTheDocs 
initiative that publishes docs by branch name -- we didn't want URLs 
containing "/master/" getting bookmarked, linked to, ... etc.  Meaning: we have 
to change the name now, before v5.0.0 is released, and the new RTD OMPI docs 
web site goes mainstream.
 *   We will eventually also update the other Open MPI git repos for master 
--> main, but those should likely be less painful.  Stay tuned for further info.

Unless we hear any major objections, Brian and I will be making this change 
this upcoming Friday, 25 March, 2022, from 11am-1pm US Eastern time.  In 
general, the changes will be:

  *   Rename the master branch at github.
 *   Delete the "master" branch out of our personal forks (this is STRONGLY 
advised for everyone else, too!)
  *   Update / rename nightly snapshot tarballs.
  *   Update our MTT configurations to match the new nightly snapshot tarball 
name / location.
  *   Update docs.open-mpi.org to build the new branch and remove the old branch
 *   Update any internal docs references to "master" on main, v5.0.x, 
v4.0.x, v4.1.x
  *   Update any references to "master" on www.open-mpi.org
  *   Ask Howard to make his Spack PR
  *   Notify the EasyBuild community
  *   Email this devel list to confirm that it's all done.

For MTT, we will hopefully be able to leave the old "master" S3 folder around 
for a week or two.  This will give people time to update their MTT configs.  
There will definitely be an EOL date on that folder, however -- we're not going 
to maintain the old name for very long.

In general, we don't plan to support many behind-the-scenes master --> main 
remappings to allow people to continue to use the old "master" name.  We'd 
rather people actually updated to the new name.

Please let me know if you have any questions, suggestions, or concerns.  Thanks!

--
Jeff Squyres
jsquy...@cisco.com


From: Jeff Squyres (jsquyres)
Sent: Friday, March 18, 2022 11:29 AM
To: Open MPI Developers
Subject: Renaming Open MPI's git "master" branch

This is a bit overdue, but we're finally getting to this.

We are planning to rename Open MPI's git "master" branch to "main" one week 
from today: Friday, 25 Mar, 2022.

Brian and I will handle most of the logistics of this change, but there will 
likely be a little disruption for developers.  For example, next Friday, I 
would strongly suggest deleting the "master" branch out of your fork so that 
your fingers don't accidentally type "git checkout master" and you end up on 
some very-outdated git reference.

Re: [OMPI devel] Script-based wrapper compilers

2022-03-24 Thread Jeff Squyres (jsquyres) via devel
Gilles --

Do you know if anyone is actually cross compiling?  I agree that this is in the 
"nice to have" category, but it is costing Brian time -- if no one is using 
this functionality, it's not worth the time.  If people are using this 
functionality, then it's potentially worth the time.

--
Jeff Squyres
jsquy...@cisco.com


From: devel  on behalf of Gilles Gouaillardet 
via devel 
Sent: Wednesday, March 23, 2022 10:28 PM
To: Open MPI Developers
Cc: Gilles Gouaillardet
Subject: Re: [OMPI devel] Script-based wrapper compilers

Brian,

My 0.02 US$

Script based wrapper compilers are very useful when cross compiling,
so ideally, they should be maintained.

Cheers,

Gilles

On Thu, Mar 24, 2022 at 11:18 AM Barrett, Brian via devel <devel@lists.open-mpi.org> wrote:
Does anyone still use the script based wrapper compilers?  I have been working 
on fixing a number of static library compile issues caused by us historically 
not having been great about tracking library dependencies and the 
OMPI/PMIx/PRRTE split.  Part of this is some fairly significant modifications 
to the wrapper compilers (here's the PMIx version: 
https://github.com/openpmix/openpmix/commit/e15de4b52f2d331297bbca31beb54b5a377557bc).
  It would be easiest to just remove the script based wrapper compilers, but 
I'll update them if someone uses them.

Thanks,

Brian



Re: [OMPI devel] Renaming Open MPI's git "master" branch

2022-03-22 Thread Jeff Squyres (jsquyres) via devel
This was discussed both off-list and on the weekly Open MPI dev call.  Here's 
the points that were discussed, in no particular order:

  *   We definitely need to give a heads-up to build-from-source projects like 
Spack and EasyBuild.
 *   It's not common, but it is possible to build Open MPI's master branch 
from Spack / EasyBuild.  Hence, we should notify these communities to update 
the git branch name from "master" to "main".
 *   Howard offered to open a Spack PR to update this.
 *   We'll also ping Kenneth from EasyBuild.
 *   There was some concern that we should give such communities a little 
time to adjust to the new name.  After some discussion, the consensus is that 
this is likely not necessary a) because it's not common for these tools to 
build from master, and b) it should be a quick/easy fix for them to update the 
branch name.  Let's just get it done.
  *   "git symbolic-ref refs/heads/master refs/heads/main" may be helpful for 
developers to transition themselves to the new name locally.
  *   When a branch is renamed at GitHub, some degree of automatic renaming 
will occur (e.g., surfing to 
https://github.com/open-mpi/ompi/blob/master/README.md will automatically get 
HTTP redirected).
 *   See https://github.com/github/renaming for details about what does -- 
and does not -- happen.
  *   There were some questions as to whether "main" is a good choice of branch 
name.  There's been a bunch of debate about this over the past year.
 *   Git 2.35.1 (the most recent) version still defaults to "master".
 *   Github.com defaults to "main".
 *   Other names like "dev" and "devel" were also proposed for Open MPI.
 *   My $0.02: Github defaults to "main", and that effectively represents 
what many people use.  Let's just go with that.
  *   The "ompi" repo is the first one to get this branch name change.
  *   This change is mainly being driven by Open MPI's ReadTheDocs 
initiative that publishes docs by branch name -- we didn't want URLs 
containing "/master/" getting bookmarked, linked to, ... etc.  Meaning: we have 
to change the name now, before v5.0.0 is released, and the new RTD OMPI docs 
web site goes mainstream.
 *   We will eventually also update the other Open MPI git repos for master 
--> main, but those should likely be less painful.  Stay tuned for further info.

Unless we hear any major objections, Brian and I will be making this change 
this upcoming Friday, 25 March, 2022, from 11am-1pm US Eastern time.  In 
general, the changes will be:

  *   Rename the master branch at github.
 *   Delete the "master" branch out of our personal forks (this is STRONGLY 
advised for everyone else, too!)
  *   Update / rename nightly snapshot tarballs.
  *   Update our MTT configurations to match the new nightly snapshot tarball 
name / location.
  *   Update docs.open-mpi.org to build the new branch and remove the old branch
 *   Update any internal docs references to "master" on main, v5.0.x, 
v4.0.x, v4.1.x
  *   Update any references to "master" on www.open-mpi.org
  *   Ask Howard to make his Spack PR
  *   Notify the EasyBuild community
  *   Email this devel list to confirm that it's all done.

For MTT, we will hopefully be able to leave the old "master" S3 folder around 
for a week or two.  This will give people time to update their MTT configs.  
There will definitely be an EOL date on that folder, however -- we're not going 
to maintain the old name for very long.

In general, we don't plan to support many behind-the-scenes master --> main 
remappings to allow people to continue to use the old "master" name.  We'd 
rather people actually updated to the new name.

Please let me know if you have any questions, suggestions, or concerns.  Thanks!

--
Jeff Squyres
jsquy...@cisco.com


From: Jeff Squyres (jsquyres)
Sent: Friday, March 18, 2022 11:29 AM
To: Open MPI Developers
Subject: Renaming Open MPI's git "master" branch

This is a bit overdue, but we're finally getting to this.

We are planning to rename Open MPI's git "master" branch to "main" one week 
from today: Friday, 25 Mar, 2022.

Brian and I will handle most of the logistics of this change, but there will 
likely be a little disruption for developers.  For example, next Friday, I 
would strongly suggest deleting the "master" branch out of your fork so that 
your fingers don't accidentally type "git checkout master" and you end up on 
some very-outdated git reference.

We'll send out more information soon, but wanted to give everyone a heads-up 
that this is coming, and to prepare for a little disruption.

--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Renaming Open MPI's git "master" branch

2022-03-18 Thread Jeff Squyres (jsquyres) via devel
This is a bit overdue, but we're finally getting to this.

We are planning to rename Open MPI's git "master" branch to "main" one week 
from today: Friday, 25 Mar, 2022.

Brian and I will handle most of the logistics of this change, but there will 
likely be a little disruption for developers.  For example, next Friday, I 
would strongly suggest deleting the "master" branch out of your fork so that 
your fingers don't accidentally type "git checkout master" and you end up on 
some very-outdated git reference.

We'll send out more information soon, but wanted to give everyone a heads-up 
that this is coming, and to prepare for a little disruption.

-- 
Jeff Squyres
jsquy...@cisco.com

[OMPI devel] First Open MPI v5.0.x docs have been merged

2022-03-07 Thread Jeff Squyres (jsquyres) via devel
Our first cut of the ReStructured Text (RST)/Sphinx HTML docs and man pages 
have been committed to master -- woo hoo!

There's a few things the developer community needs to know.

Web site

The docs are now live on https://docs.open-mpi.org (hosted by readthedocs.io).

Right now, there's just "master" docs available (i.e., the docs from the git 
branch named "master").  If you merge a change to the docs on master, the docs 
on the web site will automatically update.

Since it's just master so far, we're not advertising these docs yet.  Indeed, 
there's a lot of 5.0.0-specific content on these docs -- we don't really want 
to send 4.1.x users to these docs yet.

Eventually, the web site will have the 5.0.0 (and 5.0.1 and 5.0.2 and ...) docs 
up there, too.  But for now: just master.

Git sources

The source for the docs live in the "docs/" directory on master.  None of the 
generated HTML or nroff files are in the repository; only the RST source.

If you get a git clone, you need to have Sphinx (the tool we use to render the 
docs into HTML and nroff) installed and in your environment before running 
configure if you want to build the HTML docs and man pages.  See 
https://docs.open-mpi.org/en/master/developers/sphinx.html for details.

We're going to let the whole new docs system soak for a little while before 
cherry picking to the v5.0.x branch.

Updating the docs: pull requests

PLEASE MAKE PULL REQUESTS TO UPDATE THE DOCS!! 

You can just edit the RST files under docs/, just like any other files.

In the docs/ directory, you can just "make".  Assuming you had Sphinx in your 
environment before you ran configure, it will build the docs right there:

  *   The HTML pages can be found under "docs/_build/html/index.html".  You can 
just open that locally in a web browser to see your changes.
  *   The man pages are in "docs/_build/man".  You can just use "man 
docs/_build/man/MPI_Send.3" (for example) to view them.
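
Concretely, that workflow looks something like this (a sketch; the viewer commands 
are just examples):

    cd docs
    make                                # renders HTML and man pages via Sphinx
    xdg-open _build/html/index.html     # or "open" on macOS
    man ./_build/man/MPI_Send.3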

When you create a pull request, the docs on your PR will be rendered to a 
temporary/PR-specific location on readthedocs.io.  Go down to the bottom of the 
PR and you'll see a CI item for "docs/readthedocs.org:ompi".  If you click the 
"Details" link, it'll take you to this PR's build of the docs.

If you introduce any RST/Sphinx warnings or errors in your PR, CI will fail, 
and you will not be able to merge your PR (pro tip: fix the warnings / errors, 
and then you'll be able to merge your PR).

Distribution tarballs

Similar to other tools that we Open MPI developers use, end users do NOT need 
to have Sphinx installed.  Official Open MPI tarballs will include the 
pre-built HTML docs and man pages.

When users install from an Open MPI tarball, the man pages will be installed as 
usual.

The HTML docs are not installed, but users can view them locally in 
docs/_build/html/index.html with a local web browser (e.g., if they're at a facility 
that has restricted access to the internet, and they can't reach 
https://docs.open-mpi.org/).

ReStructured Text? What the heck is that?

All the new docs -- HTML pages and man pages -- are written in ReStructured 
Text (RST).  Think of RST as Markdown on steroids.  It's a little more 
complicated than Markdown, but not much.  It's much more powerful than 
Markdown, though.

See https://docs.open-mpi.org/en/master/developers/rst-for-markdown-expats.html 
for a quick intro to Open MPI's use of RST.
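
For a quick taste of the syntax, a generic RST fragment (not taken from the Open 
MPI docs) looks like this:

    Building Open MPI
    =================

    A short paragraph, with ``inline literals`` and *emphasis*.

    .. code-block:: sh

       ./configure --prefix=$HOME/ompi-install
       make -j 8 install

    See :ref:`some-label` for how cross-references work.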

New configury

I tested all the configury and build stuff as best as I could, but now that 
this is on master and open to a wider audience, we'll likely shake out a few 
more bugs.  Please bear with us; we'll get the bugs fixed as soon as possible.

As mentioned above, you basically need to have the sphinx-build(1) executable 
in your environment before running configure if you want to build the HTML docs 
and man pages.  If you don't have Sphinx, "make dist" will (intentionally) fail.

Building the docs from a fresh git clone may take 3-5 minutes (there's hundreds 
of man pages -- this is what takes the majority of time).  There's stuff to 
watch as the build progresses.

Sphinx is stateful; subsequent builds only rebuild files that have changed.  If 
nothing has changed, a no-op build only takes a few seconds.

"make clean" will not remove the generated docs.  "make maintainer-clean" will 
remove the generated docs (i.e., rm -rf docs/_build).

Pandoc

Pandoc is dead; long live Sphinx!

(we were going to use Pandoc for v5.0.0 docs, but that effort is now dead and 
fully replaced with all the RST/Sphinx stuff.  There's no more requirement for 
Pandoc)

What about Open MPI versions prior to v5.0.0?

The existing docs for all prior versions of Open MPI will be preserved.  We'll 
clearly distinguish between the <v5.0 and >=v5.0 docs.

Stay tuned for more info on this subject.

HTML docs content is not yet complete

The docs have been merged to master intentionally before all the content was 
complete.  There's a LOT of docs there, but you'll also see a "to do" page, and 
a 

Re: [OMPI devel] install from source

2021-11-22 Thread Jeff Squyres (jsquyres) via devel
I'm a little confused -- I don't see any failures in the config.out or make.out 
that you sent...?


On Nov 21, 2021, at 7:50 AM, Masoud Hemmatpour <mashe...@gmail.com> wrote:

Thank you Jeff for your reply! Sorry for that. Please, find attached the new 
tarball:

ompi-output.log<https://drive.google.com/file/d/13893ncYRx7j0jxrchkXnpWzF-VinuFkF/view?usp=sharing>

Thanks!




On Mon, Nov 15, 2021 at 5:26 PM Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
Sorry for the delay -- I was out late last week.

Unfortunately, this tarball contains a "make.out" file that is actually the 
output from configure.

Can you grab the output from "make V=1" again?

Also, you don't need to compress the files inside the tarball -- the compressed 
tarball should be sufficient.

Thanks!


On Nov 11, 2021, at 3:30 AM, Masoud Hemmatpour <mashe...@gmail.com> wrote:


Thanks for the information. Please, find the logs in the following link:

ompi-output.log<https://drive.google.com/file/d/13893ncYRx7j0jxrchkXnpWzF-VinuFkF/view?usp=sharing>


Thank you again,





On Fri, Nov 5, 2021 at 5:59 PM Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
Can you send all the information listed here:

https://www.open-mpi.org/community/help/


On Nov 5, 2021, at 12:33 PM, Masoud Hemmatpour via devel <devel@lists.open-mpi.org> wrote:


Sorry,

This is my configure:

./configure --prefix=/home/me/src_cx/ompi/install-debug/ 
--with-ucx=/home/me/src_cx/ucx/install-debug --disable-man-pages 
--disable-shared --enable-static

However, both cases do not work for me.

Thanks,

On Fri, Nov 5, 2021 at 5:11 PM Masoud Hemmatpour <mashe...@gmail.com> wrote:

Hello,

Actually, I am trying to install openmpi from source. I face the following 
error in make:


/usr/bin/ld: cannot find -lOpenCL
collect2: error: ld returned 1 exit status
Makefile:1879: recipe for target 'opal_wrapper' failed
make[2]: *** [opal_wrapper] Error 1
make[2]: Leaving directory 
'/home/mashemat/src_cx/ompi/build/opal/tools/wrappers'
Makefile:2383: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/mashemat/src_cx/ompi/build/opal'
Makefile:1901: recipe for target 'all-recursive' failed

make: *** [all-recursive] Error 1


here is my configure:

./configure --prefix=/home/me/src_cx/ompi/install-debug/ 
--with-ucx=/home/me/src_cx/ucx/install-debug --disable-man-pages

I would appreciate it if you could let me know what I'm missing.

Thanks,

















--
Jeff Squyres
jsquy...@cisco.com





--
Jeff Squyres
jsquy...@cisco.com





--
Jeff Squyres
jsquy...@cisco.com





Re: [OMPI devel] install from source

2021-11-15 Thread Jeff Squyres (jsquyres) via devel
Sorry for the delay -- I was out late last week.

Unfortunately, this tarball contains a "make.out" file that is actually the 
output from configure.

Can you grab the output from "make V=1" again?

Also, you don't need to compress the files inside the tarball -- the compressed 
tarball should be sufficient.

Thanks!


On Nov 11, 2021, at 3:30 AM, Masoud Hemmatpour 
mailto:mashe...@gmail.com>> wrote:


Thanks for the information. Please, find the logs in the following link:

ompi-output.log<https://drive.google.com/file/d/13893ncYRx7j0jxrchkXnpWzF-VinuFkF/view?usp=sharing>


Thank you again,





On Fri, Nov 5, 2021 at 5:59 PM Jeff Squyres (jsquyres) 
mailto:jsquy...@cisco.com>> wrote:
Can you send all the information listed here:

https://www.open-mpi.org/community/help/


On Nov 5, 2021, at 12:33 PM, Masoud Hemmatpour via devel 
mailto:devel@lists.open-mpi.org>> wrote:


Sorry,

This is my configure:

./configure --prefix=/home/me/src_cx/ompi/install-debug/ 
--with-ucx=/home/me/src_cx/ucx/install-debug --disable-man-pages 
--disable-shared --enable-static

However, both cases do not work for me.

Thanks,

On Fri, Nov 5, 2021 at 5:11 PM Masoud Hemmatpour 
mailto:mashe...@gmail.com>> wrote:

Hello,

Actually, I am trying to install openmpi from source. I face the following 
error in make:


/usr/bin/ld: cannot find -lOpenCL
collect2: error: ld returned 1 exit status
Makefile:1879: recipe for target 'opal_wrapper' failed
make[2]: *** [opal_wrapper] Error 1
make[2]: Leaving directory 
'/home/mashemat/src_cx/ompi/build/opal/tools/wrappers'
Makefile:2383: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/mashemat/src_cx/ompi/build/opal'
Makefile:1901: recipe for target 'all-recursive' failed

make: *** [all-recursive] Error 1


here is my configure:

./configure --prefix=/home/me/src_cx/ompi/install-debug/ 
--with-ucx=/home/me/src_cx/ucx/install-debug --disable-man-pages

I appreciate if you let me know the missing point.

Thanks,

















--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>





--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>





Re: [OMPI devel] install from source

2021-11-05 Thread Jeff Squyres (jsquyres) via devel
Can you send all the information listed here:

https://www.open-mpi.org/community/help/


On Nov 5, 2021, at 12:33 PM, Masoud Hemmatpour via devel 
mailto:devel@lists.open-mpi.org>> wrote:


Sorry,

This is my configure:

./configure --prefix=/home/me/src_cx/ompi/install-debug/ 
--with-ucx=/home/me/src_cx/ucx/install-debug --disable-man-pages 
--disable-shared --enable-static

However, both cases do not work for me.

Thanks,

On Fri, Nov 5, 2021 at 5:11 PM Masoud Hemmatpour 
mailto:mashe...@gmail.com>> wrote:

Hello,

Actually, I am trying to install openmpi from source. I face the following 
error in make:


/usr/bin/ld: cannot find -lOpenCL
collect2: error: ld returned 1 exit status
Makefile:1879: recipe for target 'opal_wrapper' failed
make[2]: *** [opal_wrapper] Error 1
make[2]: Leaving directory 
'/home/mashemat/src_cx/ompi/build/opal/tools/wrappers'
Makefile:2383: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/mashemat/src_cx/ompi/build/opal'
Makefile:1901: recipe for target 'all-recursive' failed

make: *** [all-recursive] Error 1


here is my configure:

./configure --prefix=/home/me/src_cx/ompi/install-debug/ 
--with-ucx=/home/me/src_cx/ucx/install-debug --disable-man-pages

I appreciate if you let me know the missing point.
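
A quick check would be whether an OpenCL library is visible to the linker at all, 
e.g.:

ldconfig -p | grep -i opencl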

Thanks,

















--
Jeff Squyres
jsquy...@cisco.com





[OMPI devel] Coverity run from last night

2021-10-13 Thread Jeff Squyres (jsquyres) via devel
We finally fixed the Coverity run yesterday, and it ran last night for the 
first time in a little while.

As usual, a bunch of useless items came up, but some legit-looking issues also 
came up.  The following component owners should probably look at the Coverity 
report from last night:

- common/ofi
- pstrg/lustre
- osc/sm
- osc/rdma
- grequest 
- prte (bipartite graph, odls base, schizo/ompi, plm/slurm)
- pmix (util/pif.c)

-- 
Jeff Squyres
jsquy...@cisco.com





[OMPI devel] Open MPI 5.0.0: drop support for gcc 4.4.7?

2021-09-21 Thread Jeff Squyres (jsquyres) via devel
All --

Unless someone has a strong reason for keeping support for GCC 4.4.7 (i.e., the 
default GCC compiler that shipped in RHEL 6), Open MPI is going to drop support 
for it in v5.0.0.

The reason for this is that PRTE and PMIX no longer compile successfully with 
GCC 4.4.7 (and Open MPI gets a lot of compiler warnings with GCC 4.4.7).  These 
packages ***could be updated to support GCC 4.4.7 if someone cares***.  But 
we'll need someone to contribute pull requests to do so.

If no one plans to contribute pull requests for this in the near future, we're 
going to drop support for GCC 4.4.7 in Open MPI v5.0.0.  See 
https://github.com/open-mpi/ompi/pull/9398.

* Note that RHEL 7 ships with GCC v4.8.5.
* The Open MPI community regularly tests with GCC v4.8.1.

Hence, Open MPI >= v5.0.0 will support GCC >= v4.8.1.  Specifically: Open MPI's 
configure script will abort with helpful error message for versions of GCC < 
v4.8.1.

Finally, note that this announcement is solely about Open MPI >= v5.0.0.  The 
Open MPI v4.0.x and v4.1.x series will continue to support GCC 4.4.7.

-- 
Jeff Squyres
jsquy...@cisco.com





[OMPI devel] Annual Open MPI repo access audit

2021-08-17 Thread Jeff Squyres (jsquyres) via devel
The following organizations need to respond to the annual Open MPI access audit 
ASAP:

• AWS
• HPE
• Intel
• NVIDIA
• RIST

Please see Slack or contact me for the URL where to complete your section of 
the audit.

Thanks.

-- 
Jeff Squyres
jsquy...@cisco.com





Re: [OMPI devel] Help us find good times for design discussions.

2021-07-19 Thread Jeff Squyres (jsquyres) via devel
To close the loop on this for the mail archives: we settled in dates/times.

Please see the wiki for more details, and to add your own items to the agenda:

https://github.com/open-mpi/ompi/wiki/Meeting-2021-07


On Jul 5, 2021, at 11:58 AM, Geoffrey Paulsen via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Open MPI developers,

   We have not been able to meet together face to face for quite some time.  
We'd like to schedule a few 2-hour blocks for detailed discussions on topics of 
interest.

  Please fill out 
https://doodle.com/poll/rd7szze3agmyq4m5?utm_source=poll_medium=link, 
include your name and time blocks that might work for you.

  Also please add any agenda items to discuss on our wiki page here: 
https://github.com/open-mpi/ompi/wiki/Meeting-2021-07

  Thanks,
  Geoff Paulsen




--
Jeff Squyres
jsquy...@cisco.com





Re: [OMPI devel] C style rules / reformatting

2021-05-18 Thread Jeff Squyres (jsquyres) via devel
That sounds awesome; many thanks, Joseph.


On May 18, 2021, at 10:53 AM, Joseph Schuchart via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Jeff, all,

I am looking into uncrustify [1], which is highly configurable and has the 
ability to ignore many of its >700 available rules. Some of the ignored rules 
seem broken, for which I have filed a bug already [2].

Right now it seems that there is no way to ignore the indentation of continued 
statements, meaning that it will always apply some indentation rule to 
continuing lines [3]. Since we don't have rules for continued statements I want 
to get uncrustify to just ignore them (I'm sure we have plenty of different 
formattings there). This might require a patch eventually.

There may be other things such as the yoda comparisons which are not yet 
available but could probably be added.

Overall, I think it can do what we want: enable a small number of rules and let 
us know whether it would change any formatting on a given file, which we can 
use in CI to test for compliance. The Score-P project has done that in the past 
(not sure what the status is right now).
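
(The eventual CI check could then be as simple as something like this -- the 
config file name here is hypothetical:)

git ls-files '*.c' '*.h' | xargs uncrustify -c open-mpi.cfg --check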

The project seems actively maintained and receptive to issues and feature 
requests. I will continue looking into it, although it might take some time. If 
anyone has some cycles (or a student) to spare we might get it to where we want 
it fairly quickly.


Cheers
Joseph

[1] https://github.com/uncrustify/uncrustify
[2] https://github.com/uncrustify/uncrustify/issues/3173
[3] https://github.com/uncrustify/uncrustify/issues/3174

On 5/17/21 9:59 PM, Jeff Squyres (jsquyres) via devel wrote:
FYI: It was decided last week that we will abandon the current effort to 
reformat master / v5.0.x according to style rules.
SHORT VERSION
We have already reformatted opal/ and tests/.  But the two PRs for reformatting 
ompi/ brought up a whole bunch of issues that do not seem resolvable via 
clang-format.  As such, we're just walking away: we're not going to revert the 
reformatting that was done to opal/ and tests/ on master and v5.0.x, but we're 
just going to close the ompi/ reformatting PRs without merging.
Many thanks to Nathan who invested a lot of time in this; I'm sorry it didn't 
fully work out.  :-(
MORE DETAIL
It turns out that clang-format actually parses the C code into internal 
language primitives and then re-renders the code according to all the style 
choices that you configure.  Meaning: you have to make decisions about every 
single style choice (e.g., whether to put "&&" at the beginning or end of the 
line, when expressions span multiple lines).
This is absolutely not what we want to do.  
https://github.com/open-mpi/ompi/wiki/CodingStyle is intentionally very "light 
touch": it only specifies a bare minimum of style rules -- the small number of 
things that we could all agree on.  Everything outside of those rules is not 
regulated.
Clang-format simply doesn't work that way: you have to make a decision for 
every single style choice.
So we'll close https://github.com/open-mpi/ompi/pull/8816 and 
https://github.com/open-mpi/ompi/pull/8923 without merging them.
If someone would like to find a better tool that can:
a) re-format the ompi/ and oshmem/ trees according to our "light touch" rules
b) fail a CI test when a PR introduces a delta that results in code breaking 
the "light touch" rules
Then great: let's have that conversation.  But clang-format is not going to 
work for us, unfortunately.  :-(


--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>





Re: [OMPI devel] [jonathan...@systex.com: A question about Open MPI logo]

2021-05-18 Thread Jeff Squyres (jsquyres) via devel
Greetings Chris!

Thanks for bringing this to our attention.

The short version is that we're pretty permissive with our logo -- e.g., follow 
the same terms as our code license (BSD), and everything should be fine.

I'll reply directly to the original author.

I believe that Brian Barrett and I are the designated contacts for SPI; you can 
feel free to reach out to us directly in the future.

Thanks!


On May 18, 2021, at 7:22 AM, Chris Lamb via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Hi OpenMPI devs,

My name is Chris Lamb and I'm one of the directors of SPI Inc [0]. As
you likely already know, OpenMPI is an affiliated project of SPI.

We received the email (see below my message) asking for permission to
use the OpenMPI logo. Please could you either (a) go ahead and rely
directly to Jonathan co/ Systex, or (b) let me know how you would like
SPI to reply.

Alternatively, if this is the wrong mailing list for this question,
please let me know where I should direct it instead.

Best wishes,

Chris
c/o SPI Inc.

[0] https://www.spi-inc.org/


- Original message -
From: Stephen Frost <sfr...@snowman.net>
To: bo...@spi-inc.org
Subject: (forw) [jonathan...@systex.com: A question about Open MPI logo]
Date: Monday, 17 May 2021 7:23 PM

- Forwarded message from Jonathan Lin-林志宗-精誠-AI & Cloud研發處 
<jonathan...@systex.com> -

Date: Tue, 16 Mar 2021 02:10:14 +
From: Jonathan Lin-林志宗-精誠-AI & Cloud研發處 <jonathan...@systex.com>
To: offic...@spi-inc.org
CC: Jay Hsueh <jayhs...@garaotus.com>
Message-ID: <7b66fed4c937dc47bc3d1fa7f5a2e3e33761e...@mbx05.systex.tw>
Subject: A question about Open MPI logo

Hi,



I am here from Systex corporation in Taiwan.

We have a software product that integrates and uses your Open MPI solution.

In order to let our users know more clearly that our products adopt your 
solution, we want to display your Open MPI logo on our products.

[attached image: Open MPI logo]

So, I would like to ask, can we use your Open MPI logo for free and display it 
on our products? Should we apply for a logo authorization file from you to use 
your Open MPI logo for free?

Look forward for your response.


Jonathan

[SYSTEX]
This message contains confidential information and is intended only for the 
individual named. If you are not the named addressee you should not 
disseminate, distribute or copy this e-mail. Please notify the sender 
immediately by e-mail if you have received this e-mail by mistake and delete 
this e-mail from your system. E-mail transmission cannot be guaranteed to be 
secure or error-free or virus-free as information could be intercepted, 
corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The 
sender therefore does not accept liability for any errors or omissions in the 
contents of this message, which arise as a result of e-mail transmission. If 
verification is required please request a hard-copy version. Systex 
Corporation, No.318, Ruiguang Rd., Neihu Dist., Taipei City 114, Taiwan, R.O.C.,



- End forwarded message -
- Forwarded message from jonathan <jonathan...@systex.com> -

Date: Tue, 23 Mar 2021 16:38:41 +0800
From: jonathan <jonathan...@systex.com>
To: presid...@spi-inc.org, vicepresid...@spi-inc.org, secret...@spi-inc.org
CC: Jay Hsueh <jayhs...@garaotus.com>
Message-ID: <f8f41830-504d-437f-a688-d5e28c26a...@systex.com>
Subject: About openMPI logo

Hi,

I am here 

Re: [OMPI devel] C style rules / reformatting

2021-05-18 Thread Jeff Squyres (jsquyres) via devel
Thanks for the datapoint, Wesley!

On May 18, 2021, at 10:20 AM, Wesley Bland 
mailto:w...@wesbland.com>> wrote:

Hey folks,

As a datapoint, in MPICH, we use GNU 
indent<https://www.gnu.org/software/indent/> for this.

Script to format code (either one file or the entire code tree) - 
https://github.com/pmodels/mpich/blob/main/maint/code-cleanup.bash
Pre-commit hook that doubles as a CI checker - 
https://github.com/pmodels/mpich/blob/3e6c3ae11df33d6c02a025aaa5b145bd262d1c32/maint/hooks/pre-commit#L52-L94

There are (as always) a few gotchas, but overall it works pretty well. The main 
annoyance we discovered was that at least between version 2.2.11 and 2.2.12, 
there were changes made that changed the output formatting with the same set of 
input arguments, which is obviously a problem if people use different versions. 
So we lock in version 2.2.11. Unfortunately, that version is not the easiest to 
find and by default didn’t work on Mac OS IIRC (patch to fix Homebrew formula 
here<https://gist.github.com/wesbland/501063f151c1eb815d8001abf2285cbe>). You 
might be better served by moving on to the most recent version, but you 
probably want to pick a version and stick with it until you verify that it 
won’t cause problems otherwise. Maybe we should figure out a way to have 
multiple versions live alongside each other so people that work on both 
projects could have both versions. The project doesn’t seem to be particularly 
active so this probably isn’t a problem that will come up often.

We picked K&R formatting as a starting point and then customized a few 
things<https://github.com/pmodels/mpich/blob/3e6c3ae11df33d6c02a025aaa5b145bd262d1c32/maint/code-cleanup.bash#L26-L72>
 based on our tastes. There are relatively sane defaults and a few different 
“sets” of defaults depending on what you prefer.
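
(For example, a K&R-ish starting point with GNU indent would look something like 
the following -- the flag set is illustrative, not our actual one:)

indent -kr -nut -l100 file.c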

Let me know if you have any questions. I’d be happy to provide any pointers.

Thanks,
Wes

On May 17, 2021, at 2:59 PM, Jeff Squyres (jsquyres) via devel 
mailto:devel@lists.open-mpi.org>> wrote:

FYI: It was decided last week that we will abandon the current effort to 
reformat master / v5.0.x according to style rules.

SHORT VERSION

We have already reformatted opal/ and tests/.  But the two PRs for reformatting 
ompi/ brought up a whole bunch of issues that do not seem resolvable via 
clang-format.  As such, we're just walking away: we're not going to revert the 
reformatting that was done to opal/ and tests/ on master and v5.0.x, but we're 
just going to close the ompi/ reformatting PRs without merging.

Many thanks to Nathan who invested a lot of time in this; I'm sorry it didn't 
fully work out.  :-(

MORE DETAIL

It turns out that clang-format actually parses the C code into internal 
language primitives and then re-renders the code according to all the style 
choices that you configure.  Meaning: you have to make decisions about every 
single style choice (e.g., whether to put "&&" at the beginning or end of the 
line, when expressions span multiple lines).

This is absolutely not what we want to do.  
https://github.com/open-mpi/ompi/wiki/CodingStyle is intentionally very "light 
touch": it only specifies a bare minimum of style rules -- the small number of 
things that we could all agree on.  Everything outside of those rules is not 
regulated.

Clang-format simply doesn't work that way: you have to make a decision for 
every single style choice.

So we'll close https://github.com/open-mpi/ompi/pull/8816 and 
https://github.com/open-mpi/ompi/pull/8923 without merging them.

If someone would like to find a better tool that can:

a) re-format the ompi/ and oshmem/ trees according to our "light touch" rules
b) fail a CI test when a PR introduces a delta that results in code breaking 
the "light touch" rules

Then great: let's have that conversation.  But clang-format is not going to 
work for us, unfortunately.  :-(

--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>






--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>





[OMPI devel] C style rules / reformatting

2021-05-17 Thread Jeff Squyres (jsquyres) via devel
FYI: It was decided last week that we will abandon the current effort to 
reformat master / v5.0.x according to style rules.

SHORT VERSION

We have already reformatted opal/ and tests/.  But the two PRs for reformatting 
ompi/ brought up a whole bunch of issues that do not seem resolvable via 
clang-format.  As such, we're just walking away: we're not going to revert the 
reformatting that was done to opal/ and tests/ on master and v5.0.x, but we're 
just going to close the ompi/ reformatting PRs without merging.

Many thanks to Nathan who invested a lot of time in this; I'm sorry it didn't 
fully work out.  :-(

MORE DETAIL

It turns out that clang-format actually parses the C code into internal 
language primitives and then re-renders the code according to all the style 
choices that you configure.  Meaning: you have to make decisions about every 
single style choice (e.g., whether to put "&&" at the beginning or end of the 
line, when expressions span multiple lines).

This is absolutely not what we want to do.  
https://github.com/open-mpi/ompi/wiki/CodingStyle is intentionally very "light 
touch": it only specifies a bare minimum of style rules -- the small number of 
things that we could all agree on.  Everything outside of those rules is not 
regulated.

Clang-format simply doesn't work that way: you have to make a decision for 
every single style choice.

So we'll close https://github.com/open-mpi/ompi/pull/8816 and 
https://github.com/open-mpi/ompi/pull/8923 without merging them.

If someone would like to find a better tool that can:

a) re-format the ompi/ and oshmem/ trees according to our "light touch" rules
b) fail a CI test when a PR introduces a delta that results in code breaking 
the "light touch" rules

Then great: let's have that conversation.  But clang-format is not going to 
work for us, unfortunately.  :-(

-- 
Jeff Squyres
jsquy...@cisco.com





Re: [OMPI devel] NVIDIA 'nvfortran' cannot link libmpi_usempif08.la

2021-05-04 Thread Jeff Squyres (jsquyres) via devel
Filed https://github.com/open-mpi/ompi/issues/8919.

FWIW, Open MPI is on GitHub, not GitLab.


On May 4, 2021, at 9:55 AM, Paul Kapinos via devel 
mailto:devel@lists.open-mpi.org>> wrote:

JFYI: the same issue is also in Open MPI 4.1.1.
I cannot open a GitLab issue due to lack of an account(*) so I would kindly ask 
somebody to open one, if possible.

Have a nice day
Paul Kapinos

(*  too many accounts in my life. )




On 4/16/21 6:02 PM, Paul Kapinos wrote:
Dear Open MPI developers,
trying to build OpenMPI/4.1.0 using the NVIDIA compilers [1] (version 21.1.xx, 
'rebranded' PGI compilers), we ran into the error below when linking 
libmpi_usempif08.la.
It seems something goes wrong at the configure stage (detection of PIC 
flags?!). Note that the last 'true' PGI compiler (tried: pgi_20.4) did not produce 
that issue.
A known workaround is to add '-fPIC' to the CFLAGS, CXXFLAGS, and FCFLAGS (maybe 
not all of those are needed).
(I did not attach config.log this time to avoid a lockout from the mailing list; of 
course I can provide this and any other kind of information.)
Have a nice day,
Paul Kapinos
[1] https://developer.nvidia.com/hpc-compilers
  FCLD libmpi_usempif08.la
/usr/bin/ld: .libs/comm_spawn_multiple_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/startall_f08.o: relocation R_X86_64_32S against `.rodata' 
can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/testall_f08.o: relocation R_X86_64_32S against `.rodata' can 
not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/testany_f08.o: relocation R_X86_64_32S against `.rodata' can 
not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/testsome_f08.o: relocation R_X86_64_32S against `.rodata' 
can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/type_create_struct_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/type_get_contents_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/waitall_f08.o: relocation R_X86_64_32S against `.rodata' can 
not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/waitany_f08.o: relocation R_X86_64_32S against `.rodata' can 
not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/waitsome_f08.o: relocation R_X86_64_32S against `.rodata' 
can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/pcomm_spawn_multiple_f08.o: relocation R_X86_64_32S 
against `.rodata' can not be used when making a shared object; recompile with 
-fPIC
/usr/bin/ld: profile/.libs/pstartall_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/ptestall_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/ptestany_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/ptestsome_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/ptype_create_struct_f08.o: relocation R_X86_64_32S 
against `.rodata' can not be used when making a shared object; recompile with 
-fPIC
/usr/bin/ld: profile/.libs/ptype_get_contents_f08.o: relocation R_X86_64_32S 
against `.rodata' can not be used when making a shared object; recompile with 
-fPIC
/usr/bin/ld: profile/.libs/pwaitall_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/pwaitany_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: profile/.libs/pwaitsome_f08.o: relocation R_X86_64_32S against 
`.rodata' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: .libs/abort_f08.o: relocation R_X86_64_PC32 against symbol 
`ompi_abort_f' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
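
(For reference, the workaround mentioned above amounts to something like this -- 
exactly which variables are needed may vary:)

./configure CFLAGS=-fPIC CXXFLAGS=-fPIC FCFLAGS=-fPIC ...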


--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915



--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] inconsistency of behaviour and description of "openmpi-4.1.0$ ./configure --with-pmix=external ... "

2021-04-19 Thread Jeff Squyres (jsquyres) via devel
Yes, that does feel weird.  I've filed 
https://github.com/open-mpi/ompi/issues/8829 to track the issue.

On Apr 19, 2021, at 6:42 AM, Paul Kapinos via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Dear Open MPI developer,

The help for ./configure in openmpi-4.1.0 says (see also below):
 > external" forces Open MPI to use an external installation of PMIx.

If I understood correctly, this (also) means that if there is *no* external 
PMIx available on the system, the ./configure run should fail and exit with 
non-zero exit code.

However, when trying this on a node with no PMIx library installed, we just 
found out that configure finishes with no error message and also selects the 
internal PMIx:
 > checking if user requested internal PMIx support(external)... no
 > ...
 > PMIx support: Internal

This is surprising and feels like an error. Could you have a look at this? 
Thank you!

Have a nice day,
Paul Kapinos

P.S. grep for 'PMIx' in config-log
https://rwth-aachen.sciebo.de/s/xtNIx2dJlTy2Ams
(pastebin and gist both need accounts and I hate accounts).


./configure --help
  .
   --with-pmix(=DIR)   Build PMIx support. DIR can take one of three
   values: "internal", "external", or a valid directory
   name. "internal" (or no DIR value) forces Open MPI
   to use its internal copy of PMIx. "external" forces
   Open MPI to use an external installation of PMIx.
   Supplying a valid directory name also forces Open
   MPI to use an external installation of PMIx, and
   adds DIR/include, DIR/lib, and DIR/lib64 to the
   search path for headers and libraries. Note that
   Open MPI does not support --without-pmix.
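
A quick way to catch this today is to check the configure summary explicitly, e.g.:

./configure --with-pmix=external ... 2>&1 | tee configure.out
grep -i "PMIx support" configure.out   # "Internal" means the external request silently fell back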

--
Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
RWTH Aachen University, IT Center
Seffenter Weg 23,  D 52074  Aachen (Germany)
Tel: +49 241/80-24915





--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Remove stale OPAL dss and opal_tree code

2021-02-18 Thread Jeff Squyres (jsquyres) via devel
On Feb 18, 2021, at 10:55 AM, Ralph Castain via devel 
mailto:devel@lists.open-mpi.org>> wrote:

I'm planning on removing the OPAL dss (pack/unpack) code as part of my work to 
reduce the code base I historically supported. The pack/unpack functionality is 
now in PMIx (has been since v3.0 was released), and so we had duplicate 
capabilities spread across OPAL and PRRTE. I have already removed the PRRTE 
code in favor of unifying on PMIx as the lowest common denominator.

The following PR removes OPAL dss and an "opal_tree" class that called it but 
wasn't used anywhere in the code: https://github.com/open-mpi/ompi/pull/8492

I have updated the very few places in OMPI/OSHMEM that touched the dss to use 
the PMIx equivalents. Please take a look and note any concerns on the PR. Minus 
any objections, I'll plan on committing this after next Tuesday's telecon.

Duplicate / dead code removal: excellent!

--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Support for AMD M100?

2021-02-11 Thread Jeff Squyres (jsquyres) via devel
There's not really any generic "accelerator" infrastructure in Open MPI itself 
-- there's a bunch of explicit CUDA support.

But even some of that moved downward into both Libfabric and UCX and (at least 
somewhat) out of OMPI.

That being said, we just added the AVX MPI_Op component -- equivalent 
components could be added for CUDA and/or AMD's GPU (what API does it use -- 
OpenCL?).  That being said, I would imagine that the data inputs would need to 
be very large to make it worthwhile (wall-clock execution-wise) to offload 
MPI_Op operations to a discrete GPU on the other side of the PCI bus.



On Feb 11, 2021, at 1:02 PM, Heinz, Michael William 
mailto:michael.william.he...@cornelisnetworks.com>>
 wrote:

Pretty much, yeah.

From: Jeff Squyres (jsquyres) mailto:jsquy...@cisco.com>>
Sent: Thursday, February 11, 2021 12:58 PM
To: Open MPI Developers List 
mailto:devel@lists.open-mpi.org>>
Cc: Heinz, Michael William 
mailto:michael.william.he...@cornelisnetworks.com>>
Subject: Re: [OMPI devel] Support for AMD M100?

On Feb 11, 2021, at 12:23 PM, Heinz, Michael William via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Has the subject of supporting AMD’s new GPUs come up?

We’re discussing supporting it in PSM2 but it occurred to me that that won’t 
help much if higher-level APIs don’t support it, too…

You mean supporting the AMD GPU in the same way that we have CUDA support for 
NVIDIA GPUs?

--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>


--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>



Re: [OMPI devel] Support for AMD M100?

2021-02-11 Thread Jeff Squyres (jsquyres) via devel
On Feb 11, 2021, at 12:23 PM, Heinz, Michael William via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Has the subject of supporting AMD’s new GPUs come up?

We’re discussing supporting it in PSM2 but it occurred to me that that won’t 
help much if higher-level APIs don’t support it, too…

You mean supporting the AMD GPU in the same way that we have CUDA support for 
NVIDIA GPUs?

--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Github Actions: cherry-pick commit message checker

2021-02-09 Thread Jeff Squyres (jsquyres) via devel
I talked about this PR on the webex today:

https://github.com/open-mpi/ompi/pull/8431

Comments / feedback would be welcome.

Unless I hear objections, I plan to merge this on master COB tomorrow (10 Feb 
2021).

Thanks!

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] configure problem on master

2021-02-05 Thread Jeff Squyres (jsquyres) via devel
Which library?

On Feb 5, 2021, at 4:46 PM, Gabriel, Edgar via devel 
mailto:devel@lists.open-mpi.org>> wrote:

I just noticed another ‘new’ aspect of configure on master. The sharedfp/sm 
component is no longer compiled; I double checked that it is still 
correctly detected and handled on 4.1, so it must be a recent change.

On master:
--snip--

--- MCA component sharedfp:sm (m4 configuration macro)
checking for MCA component sharedfp:sm compile mode... dso
checking semaphore.h usability... yes
checking semaphore.h presence... yes
checking for semaphore.h... yes
checking for sem_open... no
checking for semaphore.h... (cached) yes
checking for sem_init... no
checking if MCA component sharedfp:sm can compile... no

--snip --

while e.g. on 4.1 on exactly the same platform:

--snip--
--- MCA component sharedfp:sm (m4 configuration macro)
checking for MCA component sharedfp:sm compile mode... dso
checking semaphore.h usability... yes
checking semaphore.h presence... yes
checking for semaphore.h... yes
checking for sem_open... yes
checking for semaphore.h... (cached) yes
checking for sem_init... yes
checking if MCA component sharedfp:sm can compile... yes
---snip---

it looks like a library that is required for the semaphore operations was 
included by default previously, and is not anymore?
Thanks
Edgar

From: devel 
mailto:devel-boun...@lists.open-mpi.org>> On 
Behalf Of Gabriel, Edgar via devel
Sent: Thursday, February 4, 2021 2:15 PM
To: Open MPI Developers 
mailto:devel@lists.open-mpi.org>>
Cc: Gabriel, Edgar mailto:egabr...@central.uh.edu>>
Subject: Re: [OMPI devel] configure problem on master

excellent, thanks! Meanwhile, I have a more detailed suspicion:
--
looking for library without search path
checking for library containing llapi_file_create... no
looking for library in /opt/lustre/2.12.2/lib64
checking for library containing llapi_file_create... (cached) no
--

the liblustreapi library is in /opt/lustre/2.12.2/lib64, so the configure 
script should not be using the cached value (no) but really check again. This 
seems to be the key difference between the ompi and openpmix scripts.

(for comparison the ompi output is

looking for library in lib
checking for library containing llapi_file_create... no
looking for library in lib64
checking for library containing llapi_file_create... -llustreapi )


Thanks
Edgar

From: devel 
mailto:devel-boun...@lists.open-mpi.org>> On 
Behalf Of Ralph Castain via devel
Sent: Thursday, February 4, 2021 2:02 PM
To: OpenMPI Devel mailto:devel@lists.open-mpi.org>>
Cc: Ralph Castain mailto:r...@open-mpi.org>>
Subject: Re: [OMPI devel] configure problem on master

Sounds like I need to resync the PMIx lustre configury with the OMPI one - I'll 
do that.


On Feb 4, 2021, at 11:56 AM, Gabriel, Edgar via devel 
mailto:devel@lists.open-mpi.org>> wrote:

I have a weird problem running configure on master on our cluster. Basically, 
configure fails when I request lustre support -- not in ompio, but in openpmix.

What makes our cluster setup maybe a bit special is that the lustre libraries 
are not installed in the standard path, but in /opt, and thus we provide the 
--with-lustre=/opt/lustre/2.12.2 as an option.
If I remove the 3rd-party/openpmix/src/mca/pstrg/lustre component, the 
configure script finishes correctly.

I looked at the ompi vs. openpmix check_lustre configure scripts; I could not 
detect on a quick glance any difference that would explain why the script is 
failing in one instance but not the other, but the openpmix version does seem 
to go through some additional hoops (checking separately for the include 
directory, the lib and lib64 directories etc), so it might be a difference  in 
the PMIX_ macros vs. the OPAL_ macros.

--snip--

--- MCA component fs:lustre (m4 configuration macro)
checking for MCA component fs:lustre compile mode... dso
checking --with-lustre value... sanity check ok (/opt/lustre/2.12.2)
checking looking for lustre libraries and header files in... 
(/opt/lustre/2.12.2)
checking lustre/lustreapi.h usability... yes
checking lustre/lustreapi.h presence... yes
checking for lustre/lustreapi.h... yes
looking for library in lib
checking for library containing llapi_file_create... no
looking for library in lib64
checking for library containing llapi_file_create... -llustreapi
checking if liblustreapi requires libnl v1 or v3...
checking for required lustre data structures... yes
checking if MCA component fs:lustre can compile... yes

--snip --

--- MCA component pstrg:lustre (m4 configuration macro)
checking for MCA component pstrg:lustre compile mode... dso
checking --with-lustre value... sanity check ok (/opt/lustre/2.12.2)
checking looking for lustre libraries and header files in... 
(/opt/lustre/2.12.2)
looking for header in /opt/lustre/2.12.2
checking lustre/lustreapi.h usability... no
checking lustre/lustreapi.h presence... no
checking for lustre/lustreapi.h... no
looking for header in /opt/lustre/2.12.2/include
checking 

Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-02-01 Thread Jeff Squyres (jsquyres) via devel
On Jan 27, 2021, at 7:19 PM, Gilles Gouaillardet  wrote:
> 
> What I meant is the default Linux behavior is to first lookup dependencies in 
> the rpath, and then fallback to LD_LIBRARY_PATH
> *unless* -Wl,--enable-new-dtags was used at link time.
> 
> In the case of Open MPI, -Wl,--enable-new-dtags is added to the MPI wrappers,
> but Open MPI is *not* built with this option.

Oh, I see where I got confused: Open MPI (core and DSO components) is built 
with -rpath, but not --enable-new-dtags.

Hmm.  ...trying to remember why we would have made that choice...

I don't see any obvious reason cited in the git history.  Do you remember?

> That means, that by default, mca_pml_ucx.so and friends will get libuc?.so 
> libraries at runtime from rpath
> (and that cannot be overridden by LD_LIBRARY_PATH).

Gotcha.
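
(For anyone checking what a given build actually recorded, readelf shows which tag 
is present -- the library path below is just an example:)

readelf -d /path/to/lib/libmpi.so | grep -E 'RPATH|RUNPATH'
# DT_RPATH   (old dtags): searched *before* LD_LIBRARY_PATH
# DT_RUNPATH (new dtags, i.e., --enable-new-dtags): searched *after* LD_LIBRARY_PATH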

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] How to build Open MPI so the UCX used can be changed at runtime?

2021-01-27 Thread Jeff Squyres (jsquyres) via devel
On Jan 27, 2021, at 2:00 AM, Gilles Gouaillardet via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Tim,

a simple option is to

configure ... LDFLAGS="-Wl,--enable-new-dtags"


If Open MPI is built with this option, then LD_LIBRARY_PATH takes precedence 
over rpath

(the default is the opposite as correctly pointed by Yossi in an earlier 
message)

Are you sure about the default?  I just did a default Open MPI v4.1.0 build on 
Linux with gcc 8.x:

$ mpicc --showme
gcc -I/home/jsquyres/bogus/include -pthread -Wl,-rpath 
-Wl,/home/jsquyres/bogus/lib -Wl,--enable-new-dtags -L/home/jsquyres/bogus/lib 
-lmpi

--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Open MPI 4.1.0rc4

2020-11-24 Thread Jeff Squyres (jsquyres) via devel
We're getting close: 4.1.0rc4 is now available.

https://www.open-mpi.org/software/ompi/v4.1/

The list of things left before v4.1.0 is final is getting very, very short.

Changes since rc3:

- Configury fixes for macOS Big Sur
- Minor one-sided RDMA performance improvements
- Fix OSHMEM compile error with some compilers
- hcoll: Scatterv MPI_IN_PLACE fixes
- mtl/ofi: Check cq_data_size without querying providers again
- Fix computation of process relative locality

We made an rc3 a short while ago, but I neglected to send the email about it.  
So here's the list of differences since rc2:

- HAN and ADAPT coll modules
- UCX zero-size datatype transfer fixes
- Take more steps towards "reproducible" builds
- AVX fixes
- Updates for mpirun(1) man page about "slots" and PE=x values
- Fix buffer allocation for large environment variables
- Fix cpu-list for non-uniform nodes
- Update Internal PMIx to OpenPMIx v3.2.1
- Disable man pages for internal OpenPMIx
- Fix some symbol pollution
- Make Type_create_resized set FLAG_USER_UB
- Fix OFI configury CPPFLAGS to find fabric.h
- mtl/ofi: Do not fail if error CQ is empty
- mtl/ofi: Fix erroneous FI_PEEK/FI_CLAIM usage
- Update coll tuned thresholds for modern HPC environments

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Open MPI documentation

2020-11-16 Thread Jeff Squyres (jsquyres) via devel
Over the past few months, I've been musing about Open MPI's documentation.

Short version
=

For v5.0.0, I propose:

1. Move the README to reStructured Text ("RST"), and split it up into multiple 
pages.
   * Use Sphinx (https://www.sphinx-doc.org/en/master/) to render this RST into 
HTML.
   * Host the HTML on readthedocs.io.
2. Also (eventually) move the Open MPI man pages into this RST tree.
3. Also (eventually) move the FAQ into this RST tree.

This would bring all 3 sources of our docs together into a single, cohesive set.

I took a swipe at #1 -- see 
https://aws.open-mpi.org/~jsquyres/rst-docs-unofficial/.
--> This link will likely only work for some amount of time in Nov 2020.

I also propose that we RST-ize the v4.1 README into RST and also host it on 
readthedocs.io.

More detail
===

Our current docs are kind of a mess.

1. The README has a *ton* of information in it, but it's a giant wall of text, 
and I'll bet very few people actually read it.
2. The FAQ also has a ton of information, but it suffers from version skew, and 
is difficult to maintain.
3. The man pages have also bit rotted a bit, but there's a ton of good info in 
there, too.

The goal would be to bring all 3 of these together into a cohesive set of docs 
in an aesthetically pleasing, web-friendly, Google-friendly, and 
maintainer-friendly way.

The Sphinx package seems to do this pretty well:

- Everything is written in RST and/or Markdown (although RST is more 
feature-full than Markdown).
- You can render to static HTML to a local directory and then browse it with 
your local web browser
- Free hosting of open source Sphinx-generated documentation is provided on 
readthedocs.io
- readthedocs.io also offers GitHub webhooks to automatically re-render if the 
docs ever change in your GitHub repo (sweet!)

We can either include the Sphinx-generated HTML on the main Open MPI web site 
or just use the free hosting on readthedocs.io (I'm leaning towards using their 
hosting, but still need to understand that a bit more).  One of the advantages 
of readthedocs.io-hosted docs is that they are versioned.  This would seem to 
solve the version-skew problems with our current FAQ.

The FAQ will require hand-conversion to RST.  I'll do that, but it'll take me 
some time to get it done.

As some of you may know, some students have taken a swipe at Markdown-izing the 
man pages (we opted for Markdown before we started looking at Sphinx / 
readthedocs.io).  By the end of their semester (i.e., within a week or three), 
they'll have a few dozen of the man pages converted to Markdown.  They also 
have a Python script to do most of the conversion nroff->markdown.  So even if 
they don't finish Markdown-izing all the man pages, it's possible for someone 
in the community to finish that effort.

These man pages could be imported into a Sphinx project as Markdown, or they 
could be converted to RST (MD to RST is a fairly straightforward conversion).
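
(For example -- just a sketch, not a decision -- pandoc can do the bulk of an 
MD-to-RST conversion; the file name here is hypothetical:)

pandoc -f markdown -t rst MPI_Send.3.md -o MPI_Send.3.rst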

I think that if we actually manage to convert 3 sources into a single, cohesive 
RST tree, that will be a huge win for making the docs much more:

- readable by end users (smaller, individual pages with docs content)
- Google friendly
- maintainable by the community

However, all of this is on master (i.e., what will become v5.0.0).  Given that 
v5.0.0 may still take some time to get released, I think it might be worthwhile 
to:

1. Basically repeat the RST-ization of the README on the v4.1.x branch (which 
is relatively straightforward)
2. Publish the this RST-ized tree out to readthedocs.io
3. NOT convert the v4.1.x man pages or FAQ (which would be a ton of work) -- 
i.e., leave the FAQ and man pages as they are for v4.1.x

This will at least improve the README content for v4.1.x, and increase the 
Google-ability of our documentation content.  Over time -- i.e., when v5.0.0 is 
released -- we'll basically add the v5.0.0 version at readthedocs.io and 
include the FAQ and man pages in the content.

Thoughts?

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Continued Open MPI mailing lists instability

2020-11-05 Thread Jeff Squyres (jsquyres) via devel
Over the past month or so, you may have noticed that sometimes mails you sent 
to the Open MPI mailing lists were delayed, sometimes by multiple days.

Our mailing list provider has been experiencing technical difficulties in 
keeping the Open MPI mailing lists flowing.  They tell us that they had a 
"Eureka!" moment yesterday and they think that they have fixed the underlying 
problem.

Our apologies for any confusion stemming from the delays in getting mails 
delivered.  Hopefully our lists will now be much more responsive!

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Change in OMPI component build behavior on master (i.e., what will become v5.0)

2020-10-28 Thread Jeff Squyres (jsquyres) via devel
Folks --

It's worth pointing out https://github.com/open-mpi/ompi/pull/8132.

This PR will change Open MPI's behavior from building all components as DSOs to 
including them in the core Open MPI libraries.

The motivation for this change is to decrease filesystem activity upon job 
start.

Dlopen is still enabled by default, so if there are 3rd party components in the 
filesystem, Open MPI will still find / open / use them.

For example: by default, the TCP BTL will no longer be built as mca_btl_tcp.so 
-- it will be part of libopal.  That being said, you can use --enable-mca-dso 
to override this behavior for some/all of Open MPI's components. This can be 
quite useful in a developer scenario; for example, if you're writing/debugging 
exclusively in a component, you might want to:

./configure --enable-mca-dso=btl-tcp ...

So that the TCP BTL is still built as a standalone component for ease of 
iterating over compiling/installing just that component.
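
(You can see the effect of these options in the install tree; components built as 
DSOs show up as individual plugins -- e.g.:)

ls $prefix/lib/openmpi/mca_btl_*.so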

See the README changes on this PR to more fully explain --enable-mca-dso and 
--enable-mca-static.  The behavior of --enable|disable-dlopen was also 
clarified.

If no one objects to this change, I'll merge it next Tuesday after the weekly 
Open MPI webex.

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Delays in Open MPI mailing list

2020-10-21 Thread Jeff Squyres (jsquyres) via devel
FYI: We've been having some problems with our mailing list provider over the 
past few weeks.

No mails have been lost, but sometimes mails queue up endlessly at our mailing 
list provider until a human IT staffer goes in, fixes a problem, and 
effectively releases all the mails that have queued up.  This can result in 
large delays between when you send an email and when it is actually delivered 
to the list.

Sorry about this!

Our mailing list provider is working on it, and hopes to have it resolved soon 
(e.g., today they're rebuilding our mailman server from scratch in the hopes 
that it will be more reliable than the previous setup).

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] UCX and older hardwre

2020-10-21 Thread Jeff Squyres (jsquyres) via devel
Looks like Yossi replied: 
https://github.com/open-mpi/ompi/issues/7968#issuecomment-713738841

He said the fix has been included in the UCX 1.9.0 release.


On Oct 21, 2020, at 10:38 AM, Barrett, Brian via devel 
mailto:devel@lists.open-mpi.org>> wrote:

UCX folks -

As part of https://github.com/open-mpi/ompi/issues/7968, a README item was 
added about the segfault in UCX for older IB hardware.  That note said the 
issue would be fixed in UCX 1.10.  Aurelien later added a note saying it was 
fixed in UCX 1.9.0-rc3.  Which version should be referenced in the README: 1.9 
or 1.10?  We are trying to get the documentation set for Open MPI 4.1 and 
master.

Thanks,

Brian and Jeff



--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Hwloc debug output in default mpirun on master

2020-10-12 Thread Jeff Squyres (jsquyres) via devel
On master, it looks like some hwloc debugging output is somehow enabled by 
default.

For example:

-
$ hostname
mpi002
$ mpirun hostname
hwloc verbose debug enabled, may be disabled with HWLOC_DEBUG_VERBOSE=0 in the 
environment.
CPU phase discovery...
CPU phase discovery in component linux...
hwloc verbose debug enabled, may be disabled with HWLOC_DEBUG_VERBOSE=0 in the 
environment.
...lots of hwloc output snipped...
PCI(busid=:89:07.6 id=1137:00cf class=0200(Ethernet) 
link=8.00GB/s)
PCI(busid=:89:07.7 id=1137:00cf class=0200(Ethernet) 
link=8.00GB/s)
PCI(busid=:89:08.0 id=1137:00cf class=0200(Ethernet) 
link=8.00GB/s)

Propagate total memory up
mpi002
$
-
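
(As the output itself notes, this can be silenced in the meantime with, e.g.:

HWLOC_DEBUG_VERBOSE=0 mpirun hostname

but it clearly shouldn't be on by default in the first place.)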

I don't know exactly where to go look for this in the code base.

Can someone point me where to fix this?

Thanks!

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] [EXTERNAL] Which compiler versions to test?

2020-10-09 Thread Jeff Squyres (jsquyres) via devel
On Oct 8, 2020, at 9:59 PM, Baker, Lawrence M  wrote:
> 
> The PGI/nVidia compiler suite is free now and could become more significant 
> in the ARM world, now that nVidia has acquired ARM.  We use PGI on our 
> cluster, along with the others you support.

Yes, quite probably true.

NVIDIA is actually a member of the Open MPI community.

Akshay: are you testing Open MPI with the PGI compiler in MTT?

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Which compiler versions to test?

2020-10-09 Thread Jeff Squyres (jsquyres) via devel
On Oct 8, 2020, at 9:40 PM, Gilles Gouaillardet via devel 
 wrote:
> 
> On RHEL 8.x, the default gcc compiler is 8.3.1, so I think it is worth 
> testing.

Excellent point.

> Containers could be used to setup a RHEL 8.x environment (so not only
> gcc but also third party libs such as libevent and hwloc can be used)
> if the MTT cluster will not shrink bigger.


I doubt I have enough time to set that up, but I can certainly get gcc 8.3.1 
installed and added to my list.  It's probably "close enough" to RHEL's 8.3.1 
for MTT purposes.

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Which compiler versions to test?

2020-10-08 Thread Jeff Squyres (jsquyres) via devel
Open question to the Open MPI dev community...

Over time, the size of my MTT cluster has been growing smaller (due to hardware 
failure, power budget restrictions, etc.).  This means that I have far fewer 
CPU cycles available for testing various compilers and configure CLI options 
than I used to.

What compilers does the community think are worthwhile to test these days?  I 
generally have access to gcc/gfortran, clang, and some versions of the Intel 
compiler suite.

master, 4.0.x, and 4.1.x branches
- gcc 4.8.5 (i.e., the default gcc on RHEL 7.x)
- gcc 9.latest
- gcc 10.latest
- clang 9.0.latest
- clang 10.0.latest
- Intel 2017
- Intel 2019

(I don't have Intel 2018 or Intel 2020)

Is this sufficient?  Or is it worthwhile to test other versions of these 
compilers?

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] New public Open MPI test repo

2020-08-10 Thread Jeff Squyres (jsquyres) via devel
One of the topics we discussed today was having a public git repo for Open MPI 
community tests.  A few community members wanted to have tests that were 
written with public funding to be public.  A very fair point.

(Recall: we have a private repo that is only available to Open MPI core 
developers, not because it contains secret stuff, but rather because back at 
the beginning of the project, we wanted a single Subversion repo where we could 
stash all of our tests, and although all the tests were open source, they 
weren't all *ours* -- they were from other groups/organizations, and we didn't 
know if we had redistribution rights.  Hence, the repository stayed closed)

I have therefore created https://github.com/open-mpi/ompi-tests-public.

This repo can be used for new tests that will be BSD licensed (just like the 
main Open MPI package).  This repo is not intended to be a production-quality, 
distributable set of tests, but rather a loose-collection of tests that are 
useful to the Open MPI community.  That being said, there may be parts of it 
that are applicable to wider audiences.

Feel free to start populating this repository.

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] August virtual developer meeting

2020-07-21 Thread Jeff Squyres (jsquyres) via devel
The dates/times for the August 2020 developer virtual meeting have been decided:

Monday-Tuesday, August 10-11, 2020
• 8am-11am US Pacific time
• 11am-2pm US Eastern time
• 3-6pm UTC
• 5-8pm CEST

More information is on the wiki.  Please start adding agenda items:

https://github.com/open-mpi/ompi/wiki/Meeting-2020-08

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] The ABCs of Open MPI (parts 1 and 2): slides + videos posted

2020-07-14 Thread Jeff Squyres (jsquyres) via devel
The slides and videos for parts 1 and 2 of the online seminar presentation "The 
ABCs of Open MPI" have been posted on both the Open MPI web site and the 
EasyBuild wiki:

https://www.open-mpi.org/video/?category=general

https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI

The last part of the seminar (part 3) will be held on Wednesday, August 5, 2020 
at:

- 11am US Eastern time
- 8am US Pacific time
- 3pm UTC
- 5pm CEST

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] [Open MPI Announce] Online presentation: the ABCs of Open MPI

2020-07-06 Thread Jeff Squyres (jsquyres) via devel
Gentle reminder that part 2 of "The ABCs of Open MPI" will be this Wednesday, 8 
July, 2020 at:

- 8am US Pacific time
- 11am US Eastern time
- 3pm UTC
- 5pm CEST

Ralph and I will be continuing our discussion and explanations of the Open MPI 
ecosystem.  The Webex link to join is on the event wiki page:


https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI
The wiki page also has links to the slides and video from the first session.
We've also linked the slides and video on the main Open MPI web 
site<https://www.open-mpi.org/video/?category=general#abcs-of-open-mpi-part-1>.

Additionally, Ralph and I decided that we have so much material that we're 
actually extending to have a *third* session on Wednesday August 5th, 2020 (in 
the same time slot).

Please share this information with others who may be interested in attending 
the 2nd and/or 3rd sessions.



On Jun 22, 2020, at 12:10 PM, Jeff Squyres 
mailto:jsquy...@cisco.com>> wrote:

After assembling the content for this online presentation (based on questions 
and comments from the user community), we have so much material to cover that 
we're going to split it into two sessions.

The first part will be **this Wednesday (24 June 2020)*** at:

- 8am US Pacific time
- 11am US Eastern time
- 3pm UTC
- 5pm CEST

The second part will be two weeks later, on Wednesday, 8 July, 2020, in the 
same time slot.

   
https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI

Anyone is free to join either / both parts.

Hope to see you this Wednesday!




On Jun 14, 2020, at 2:05 PM, Jeff Squyres (jsquyres) via announce 
mailto:annou...@lists.open-mpi.org>> wrote:

In conjunction with the EasyBuild community, Ralph Castain (Intel, Open MPI, 
PMIx) and Jeff Squyres (Cisco, Open MPI) will host an online presentation about 
Open MPI on **Wednesday June 24th 2020** at:

- 11am US Eastern time
- 8am US Pacific time
- 3pm UTC
- 5pm CEST

The general scope of the presentation will be to demystify the alphabet soup of 
the Open MPI ecosystem: the user-facing frameworks and components, the 3rd 
party dependencies, etc.  More information, including topics to be covered and 
WebEx connection details, is available at:

https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI

The presentation is open for anyone to join.  There is no need to register up 
front, just show up!

The session will be recorded and will be available after the fact.

Please share this information with others who may be interested in attending.


--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>



Re: [OMPI devel] [Open MPI Announce] Online presentation: the ABCs of Open MPI

2020-06-22 Thread Jeff Squyres (jsquyres) via devel
After assembling the content for this online presentation (based on questions 
and comments from the user community), we have so much material to cover that 
we're going to split it into two sessions.

The first part will be **this Wednesday (24 June 2020)*** at:

- 8am US Pacific time
- 11am US Eastern time
- 3pm UTC
- 5pm CEST

The second part will be two weeks later, on Wednesday, 8 July, 2020, in the 
same time slot.


https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI

Anyone is free to join either / both parts.

Hope to see you this Wednesday!




> On Jun 14, 2020, at 2:05 PM, Jeff Squyres (jsquyres) via announce 
>  wrote:
> 
> In conjunction with the EasyBuild community, Ralph Castain (Intel, Open MPI, 
> PMIx) and Jeff Squyres (Cisco, Open MPI) will host an online presentation 
> about Open MPI on **Wednesday June 24th 2020** at:
> 
> - 11am US Eastern time
> - 8am US Pacific time
> - 3pm UTC
> - 5pm CEST
> 
> The general scope of the presentation will be to demystify the alphabet soup 
> of the Open MPI ecosystem: the user-facing frameworks and components, the 3rd 
> party dependencies, etc.  More information, including topics to be covered 
> and WebEx connection details, is available at:
> 
>  
> https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI
> 
> The presentation is open for anyone to join.  There is no need to register up 
> front, just show up!
> 
> The session will be recorded and will be available after the fact.
> 
> Please share this information with others who may be interested in attending.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> 
> ___
> announce mailing list
> annou...@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/announce


-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Online presentation: the ABCs of Open MPI

2020-06-14 Thread Jeff Squyres (jsquyres) via devel
In conjunction with the EasyBuild community, Ralph Castain (Intel, Open MPI, 
PMIx) and Jeff Squyres (Cisco, Open MPI) will host an online presentation about 
Open MPI on **Wednesday June 24th 2020** at:

- 11am US Eastern time
- 8am US Pacific time
- 3pm UTC
- 5pm CEST

The general scope of the presentation will be to demystify the alphabet soup of 
the Open MPI ecosystem: the user-facing frameworks and components, the 3rd 
party dependencies, etc.  More information, including topics to be covered and 
WebEx connection details, is available at:

  
https://github.com/easybuilders/easybuild/wiki/EasyBuild-Tech-Talks-I:-Open-MPI

The presentation is open for anyone to join.  There is no need to register up 
front, just show up!

The session will be recorded and will be available after the fact.

Please share this information with others who may be interested in attending.

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Running data collection for collectives tuning (slurm option included)

2020-04-16 Thread Jeff Squyres (jsquyres) via devel
I tweaked your scripts quite a bit so that I could run a number of different 
variations on my cluster.

I have lots of jobs queued up (I have 29 nodes in my cluster -- 3 have died 
over time); they'll take a while to execute.
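
For context, each job below comes from a per-collective batch script along 
these lines -- just a rough sketch with placeholder names, not the actual 
scripts:

    #!/bin/bash
    #SBATCH --job-name=alltoall
    #SBATCH --nodes=8
    #SBATCH --time=04:00:00

    # placeholder benchmark invocation; the real scripts sweep message sizes
    # and process counts for each collective
    mpirun ./osu_alltoall

Here's the current queue state: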

 JOBID PARTITION NAME USER ST   TIME  NODES NODELIST(REASON)
   3204131   jenkins alltoall jsquyres PD   0:00  8 (Resources)
   3204132   jenkins alltoall jsquyres PD   0:00  8 (Resources)
   3204133   jenkins  barrier jsquyres PD   0:00  8 (Resources)
   3204134   jenkins    bcast jsquyres PD   0:00  8 (Resources)
   3204135   jenkins   gather jsquyres PD   0:00  8 (Resources)
   3204136   jenkins   reduce jsquyres PD   0:00  8 (Resources)
   3204137   jenkins reduce_s jsquyres PD   0:00  8 (Resources)
   3204138   jenkins reduce_s jsquyres PD   0:00  8 (Resources)
   3204139   jenkins  scatter jsquyres PD   0:00  8 (Resources)
   3204140   jenkins allgathe jsquyres PD   0:00  8 (Resources)
   3204141   jenkins allgathe jsquyres PD   0:00  8 (Resources)
   3204142   jenkins allreduc jsquyres PD   0:00  8 (Resources)
   3204143   jenkins alltoall jsquyres PD   0:00  8 (Resources)
   3204144   jenkins alltoall jsquyres PD   0:00  8 (Resources)
   3204145   jenkins  barrier jsquyres PD   0:00  8 (Resources)
   3204146   jenkins    bcast jsquyres PD   0:00  8 (Resources)
   3204147   jenkins   gather jsquyres PD   0:00  8 (Resources)
   3204148   jenkins   reduce jsquyres PD   0:00  8 (Resources)
   3204149   jenkins reduce_s jsquyres PD   0:00  8 (Resources)
   3204150   jenkins reduce_s jsquyres PD   0:00  8 (Resources)
   3204151   jenkins  scatter jsquyres PD   0:00  8 (Resources)
   3204152   jenkins allgathe jsquyres PD   0:00 16 (Resources)
   3204153   jenkins allgathe jsquyres PD   0:00 16 (Resources)
   3204154   jenkins allreduc jsquyres PD   0:00 16 (Resources)
   3204155   jenkins alltoall jsquyres PD   0:00 16 (Resources)
   3204156   jenkins alltoall jsquyres PD   0:00 16 (Resources)
   3204157   jenkins  barrier jsquyres PD   0:00 16 (Resources)
   3204158   jenkins    bcast jsquyres PD   0:00 16 (Resources)
   3204159   jenkins   gather jsquyres PD   0:00 16 (Resources)
   3204160   jenkins   reduce jsquyres PD   0:00 16 (Resources)
   3204161   jenkins reduce_s jsquyres PD   0:00 16 (Resources)
   3204162   jenkins reduce_s jsquyres PD   0:00 16 (Resources)
   3204163   jenkins  scatter jsquyres PD   0:00 16 (Resources)
   3204164   jenkins allgathe jsquyres PD   0:00 16 (Resources)
   3204165   jenkins allgathe jsquyres PD   0:00 16 (Resources)
   3204166   jenkins allreduc jsquyres PD   0:00 16 (Resources)
   3204167   jenkins alltoall jsquyres PD   0:00 16 (Resources)
   3204168   jenkins alltoall jsquyres PD   0:00 16 (Resources)
   3204169   jenkins  barrier jsquyres PD   0:00 16 (Resources)
   3204170   jenkins    bcast jsquyres PD   0:00 16 (Resources)
   3204171   jenkins   gather jsquyres PD   0:00 16 (Resources)
   3204172   jenkins   reduce jsquyres PD   0:00 16 (Resources)
   3204173   jenkins reduce_s jsquyres PD   0:00 16 (Resources)
   3204174   jenkins reduce_s jsquyres PD   0:00 16 (Resources)
   3204175   jenkins  scatter jsquyres PD   0:00 16 (Resources)
   3204176   jenkins allgathe jsquyres PD   0:00 16 (Resources)
   3204177   jenkins allgathe jsquyres PD   0:00 16 (Resources)
   3204178   jenkins allreduc jsquyres PD   0:00 16 (Resources)
   3204179   jenkins alltoall jsquyres PD   0:00 16 (Resources)
   3204180   jenkins alltoall jsquyres PD   0:00 16 (Resources)
   3204181   jenkins  barrier jsquyres PD   0:00 16 (Resources)
   3204182   jenkins    bcast jsquyres PD   0:00 16 (Resources)
   3204183   jenkins   gather jsquyres PD   0:00 16 (Resources)
   3204184   jenkins   reduce jsquyres PD   0:00 16 (Resources)
   3204185   jenkins reduce_s jsquyres PD   0:00 16 (Resources)
   3204186   jenkins reduce_s jsquyres PD   0:00 16 (Resources)
   3204187   jenkins  scatter jsquyres PD   0:00 16 (Resources)
   3204188   jenkins allgathe jsquyres PD   0:00 29 (Resources)
   3204189   jenkins allgathe 

Re: [OMPI devel] MPI_Info args to spawn - resolving deprecated values?

2020-04-08 Thread Jeff Squyres (jsquyres) via devel
On Apr 8, 2020, at 9:51 AM, Ralph Castain via devel  
wrote:
> 
> We have deprecated a number of cmd line options (e.g., bynode, npernode, 
> npersocket) - what do we want to do about their MPI_Info equivalents when 
> calling comm_spawn?
> 
> Do I silently convert them? Should we output a deprecation warning? Return an 
> error?


We should probably do something similar to what happens on the command line 
(i.e., warn and convert).

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Threads MCA framework

2020-03-19 Thread Jeff Squyres (jsquyres) via devel
If anyone wants to comment on the new threads MCA framework, now's the time:

https://github.com/open-mpi/ompi/pull/6578

I.e., functionality-wise, it looks ready to merge (to me).

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Adding a new RAS module

2020-02-29 Thread Jeff Squyres (jsquyres) via devel
Some logistics for you:

- As Ralph mentioned, the ORTE code effectively moved to 
GitHub.com/openpmix/prrte
- That code is now referenced in the Open MPI code base via a git submodule in 
the top-level "prrte" directory
- This is on master only (which is the best place to develop for Open MPI).  
The existing release series (v4.x and earlier) all still have ORTE.

Ralph: can one specify an external PRRTE installation for Open MPI to use 
instead of the embedded version?  I don't see a --with-prrte configure CLI 
option -- did I miss it?
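
In the meantime, Davide, here's a very rough sketch of where to start (the 
component directory and the choice of "simulator" as a template are 
assumptions on my part -- check the PRRTE tree for the exact layout):

    git clone --recursive https://github.com/openpmix/prrte.git
    cd prrte/src/mca/ras
    cp -r simulator myras        # copy an existing RAS component as a template
    # rename the files/symbols from "simulator" to "myras", adjust Makefile.am
    # (and configure.m4, if any), then re-run ./autogen.pl at the top level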


On Feb 29, 2020, at 10:07 AM, Davide Giuseppe Siciliano via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Thank you Ralph,
I noticed the change from ORTE to PRRTE, but I haven't looked for the 
corresponding github project. I'll look into it

Thanks again,
Davide

From: devel <devel-boun...@lists.open-mpi.org> on behalf of Ralph Castain via 
devel <devel@lists.open-mpi.org>
Sent: Saturday, February 29, 2020 15:22
To: OpenMPI Devel <devel@lists.open-mpi.org>
Cc: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI devel] Adding a new RAS module

You'll have to do it in the PRRTE project: https://github.com/openpmix/prrte

OMPI has removed the ORTE code base and replaced it with PRRTE, which is 
effectively the same code but taken from a repo that multiple projects support. 
You can use any of the components in there as a template - I don't believe we 
have a formal guide. Feel free to open an issue on the PRRTE repo to track any 
questions.

Ralph


On Feb 29, 2020, at 6:10 AM, Davide Giuseppe Siciliano via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Hello everyone,

I'm trying to add a new RAS module to integrate a framework developed at 
university with Open MPI.

Can you please tell me whether there is a template (or a guide) to follow to 
develop it the right way?

Thanks,
Davide


--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Github reviews now required on master

2020-02-25 Thread Jeff Squyres (jsquyres) via devel
As we discussed last week in the face-to-face Portland meeting, and we 
re-affirmed this morning on the Webex, reviews are now required on master pull 
requests.

We used to only require reviews on release branches, but in an effort to spread 
some knowledge about the code base around, we're now also requiring reviews on 
master PR's too.

The review can be from anyone in the Open MPI community who has write access.  
This can be someone in your organization or someone at a different 
organization.  There's no strict rule on who has to review what; use common 
sense.  When possible, try to invite others to review the code so that more 
people learn about more parts of the code base.

Thanks.

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Deprecated configure options in OMPI v5

2020-02-20 Thread Jeff Squyres (jsquyres) via devel
On Feb 19, 2020, at 1:31 PM, Ralph Castain via devel  
wrote:
> 
> What do we want to do with the following options? These have either been 
> renamed (changing from "orte..." to a "prrte" equivalent) or are no longer 
> valid:
> 
> --enable-orterun-prefix-by-default
> --enable-mpirun-prefix-by-default
> These are now --enable-prte-prefix-by-default. Should I error out via the 
> deprecation mechanism? Or should we silently translate to the new option?

I know a large number of people who use these functions.

We should probably not-silently translate to the new option.  I.e., issue a 
warning if they use an old option, and translate it to the new functionality.

It might be a while before we can actually delete these CLI params; they've 
become embedded in quite a few build scripts... :-\
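
Roughly speaking, the behavior I have in mind is the shell equivalent of this 
(just a sketch, not the actual m4):

    # hypothetical sketch: map a deprecated flag to its replacement and warn
    for arg in "$@"; do
      case "$arg" in
        --enable-orterun-prefix-by-default|--enable-mpirun-prefix-by-default)
          echo "configure: WARNING: $arg is deprecated; using --enable-prte-prefix-by-default" >&2
          new_args="$new_args --enable-prte-prefix-by-default"
          ;;
        *)
          new_args="$new_args $arg"
          ;;
      esac
    done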

>  --enable-per-user-config-files
> This is no longer valid if launching either via mpirun or on a system that 
> has adequate PMIx support. Still, it does apply to direct launch on systems 
> that lack the requisite support. My only concern here is that we ARE going to 
> use user-level config files with mpirun and supported systems, and it is now 
> a runtime decision (not a configure option). So do we remove this and explain 
> another method for doing it on systems lacking support? Or leave it and just 
> "do the right thing" under the covers?

This might be worth a discussion on a Tuesday call -- it sounds like a somewhat 
complex issue and could probably be explained better in person.

> --enable-mpi-cxx
> --enable-mpi-cxx-seek
> --enable-cxx-exceptions
> I assume these should be added to the "deprecation" m4?

Actually, I think we should error out if someone specifies these -- we should 
add this to https://github.com/open-mpi/ompi/pull/7428.  I will do so.

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] ORTE has been removed!

2020-02-10 Thread Jeff Squyres (jsquyres) via devel
On Feb 8, 2020, at 3:30 PM, Ralph Castain via devel  
wrote:
> 
> FYI: pursuant to the objectives outlined last year, I have committed PR #7202 
> and removed ORTE from the OMPI repository. It has been replaced with a PRRTE 
> submodule pointed at the PRRTE master branch. At the same time, we replaced 
> the embedded PMIx code tree with a submodule pointed to the PMIx master 
> branch.
> 
> The mpirun command hasn't changed. It simply starts PRRTE under the covers 
> and then launches your job (using "prun") against it. So everything behaves 
> the same in that regard.

Woo hoo!

This was a really long road with a lot of hard work from a lot of people.  It's 
probably safe to say that a disproportionate amount of it was done by Ralph.  
:-)

Thank you, everyone!

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Fix your MTT scripts!

2020-02-10 Thread Jeff Squyres (jsquyres) via devel
On Feb 9, 2020, at 10:14 AM, Ralph Castain via devel  
wrote:
> 
> We are seeing many failures on MTT because of errors on the cmd line. Note 
> that by request of the OMPI community, PRRTE is strictly enforcing the Posix 
> "dash" syntax:
> 
>  * a single-dash must be used only for single-character options. You can 
> combine the single-character options like "-abc" as shorthand for "-a -b -c"
>  * two-dashes must precede ALL multi-character options. For example, "--mca" 
> as opposed to "-mca". The latter will be rejected with an error


Woo hoo!  I'm all for these changes -- even though I may have been the one to 
write some of the original command-line parsing code, I grew to dislike the 
ambiguity of single-dash token options (e.g., "-mca" and the like).
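
To illustrate with a couple of throwaway examples (the option names here are 
just for illustration):

    # accepted: two dashes for multi-character options, one dash for single-character ones
    mpirun --mca btl tcp,self -n 4 ./a.out

    # rejected with an error: "-mca" is a multi-character option with a single dash
    mpirun -mca btl tcp,self -n 4 ./a.out

    # single-character options can still be combined, e.g. "-a -b -c" as "-abc"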

This is definitely something we're going to have to prominently mention in the 
OMPI v5 release announcements, though.

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Git submodules are coming

2020-02-07 Thread Jeff Squyres (jsquyres) via devel
On Feb 7, 2020, at 4:27 AM, Brice Goglin via devel  
wrote:
> 
> PR#7367 was initially on top of PR #7366. When Jeff merged PR#7366, I rebased 
> my #7367 with git prrs and got this error:
> 
> $ git prrs origin master
> From 
> https://github.com/open-mpi/ompi
> 
>  * branch  master -> FETCH_HEAD
> Fetching submodule opal/mca/hwloc/hwloc2/hwloc
> fatal: cannot rebase with locally recorded submodule modifications
> 
> I didn't touch the hwloc submodule as far as I can see. The hwloc submodule 
> also didn't change in origin/master between before and after the rebasing.

Huh.  I can't see from this what happened; I have no insight to offer here, 
sorry...

> $ git submodule status
>  38433c0f5fae0b761bd20e7b928c77f3ff2e76dc opal/mca/hwloc/hwloc2/hwloc 
> (hwloc-2.1.0rc2-33-g38433c0f)

I see this in my ompi clone as well (i.e., it's where the master/HEAD hwloc 
submodule is pointing).

> opal/mca/hwloc/hwloc2/hwloc $ git status
> HEAD detached from f1a2e22a
> nothing to commit, working tree clean
> 
> I am not sure what's this "HEAD detached ..." is doing here.

If you look at the graph log in the opal/mca/hwloc/hwloc2/hwloc tree, you'll 
see:

* 03d42600 (origin/v2.1) doxy: add a ref to envvar from the XML section
...a bunch more commits...
* 38433c0f (HEAD) .gitignore: add config/ltmain.sh.orig
...a bunch more commits...
* f1a2e22a (tag: hwloc-2.1.0rc2, tag: hwloc-2.1.0) contrib/windows: update 
README

Meaning:
- 03d42600 is the head of the "v2.1" branch in the hwloc repo
- 38433c0f is where the submodule is pointing (i.e., local HEAD)
- f1a2e22a is the last tag before that

So I think the "HEAD detached" means that the HEAD is not pointing to a named 
commit (i.e., there are no tags or branches pointing to 38433c0f).

> I seem to be able to reproduce the issue in my master branch by doing "git 
> reset --hard HEAD^". git prrs will then fail the same.
> 
> I worked around the issue by manually reapplying all commits from my PR on 
> top of master with git cherry-pick, but I'd like to understand what's going 
> on. It looks like my submodule is clean but not clean enough for a rebase?

I haven't had problems with rebasing and submodules; I'm not sure what I'm 
doing different than you.
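
For what it's worth, one thing that might be worth trying before the rebase 
(just a sketch -- I haven't reproduced your exact state, so no guarantees):

    # re-sync the submodule working trees with what the superproject records
    git submodule update --init --recursive
    git status        # should report a clean tree, including the submodules

    # then retry the fetch + rebase
    git fetch origin
    git rebase origin/master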

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] 3.1.6rc2: Cygwin fifo warning (was: v3.0.6rc2 and v3.1.6rc2 available for testing)

2020-02-02 Thread Jeff Squyres (jsquyres) via devel
On Feb 2, 2020, at 2:17 AM, Marco Atzeri via devel  
wrote:
> 
> Not a new issue, as it was also in 3.1.5. What is causing the
> last warning line?
> And why should a simple run try to run a debugger?
> 
> $ mpirun -n 4 ./hello_c
> ...
> Hello, world, I am 3 of 4, (Open MPI v3.1.6rc2, package: Open MPI 
> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.6rc2, repo rev: v3.1.6rc2, Jan 
> 30, 2020, 125)
> [LAPTOP-82F08ILC:00154] [[18244,0],0] unable to open debugger attach fifo
> 
> this is a Cygwin 64 bit.


Can you get a stack trace for that, perchance?  The function in question to 
trap is open_fifo() in orted_submit.c.  This function can be called from 3 
different places; it would be good to know in which of the 3 it is happening.

Does Cygwin support mkfifo()?
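
A trivial sanity check from a Cygwin shell (just to see whether FIFOs work at 
all there):

    mkfifo /tmp/ompi_fifo_test
    ls -l /tmp/ompi_fifo_test    # first character of the mode should be "p"
    rm /tmp/ompi_fifo_test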

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] v3.0.6rc2 and v3.1.6rc2 available for testing

2020-01-30 Thread Jeff Squyres (jsquyres) via devel
Minor updates since rc1:

3.0.6rc2 and 3.1.6rc2:
- Fix run-time linker issues with OMPIO on newer Linux distros.

3.1.6rc2 only:
- Fix issue with zero-length blockLength in MPI_TYPE_INDEXED.

Please test:

   https://www.open-mpi.org/software/ompi/v3.0/
   https://www.open-mpi.org/software/ompi/v3.1/
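
If you only have a few minutes, a quick smoke test looks something like this 
(the tarball name and prefix are just examples; grab the tarball from the 
pages above):

    tar xf openmpi-3.1.6rc2.tar.bz2
    cd openmpi-3.1.6rc2
    ./configure --prefix=$HOME/ompi-3.1.6rc2 && make -j 8 install
    $HOME/ompi-3.1.6rc2/bin/mpicc examples/hello_c.c -o hello_c
    $HOME/ompi-3.1.6rc2/bin/mpirun -n 4 ./hello_c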

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Open MPI v3.1.6rc1 released

2020-01-16 Thread Jeff Squyres (jsquyres) via devel
(I know you just got a very similar email -- but that was for ***3.0.6rc1*** -- 
this email is for ***3.1.6rc1***)

This may well be the end of the line for the v3.1.x series.

Please test v3.1.6rc1:

https://www.open-mpi.org/software/ompi/v3.1/

Changes since v3.1.5:

- Fix PMIX dstore locking compilation issue.  Thanks to Marco Atzeri
  for reporting the issue.
- Allow the user to override modulefile_path in the Open MPI SRPM,
  even if install_in_opt is set to 1.
- Properly detect ConnectX-6 HCAs in the openib BTL.
- Fix segfault in the MTL/OFI initialization for large jobs.
- Fix issue to guarantee to properly release MPI one-sided lock when
  using UCX transports to avoid a deadlock.
- Fix potential deadlock when processing outstanding transfers with
  uGNI transports.
- Fix various portals4 control flow bugs.
- Fix communications ordering for alltoall and Cartesian neighborhood
  collectives.
- Fix an infinite recursion crash in the memory patcher on systems
  with glibc v2.26 or later (e.g., Ubuntu 18.04) when using certain
  OS-bypass interconnects.

-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Open MPI v3.0.6rc1 released

2020-01-16 Thread Jeff Squyres (jsquyres) via devel
We keep threatening to have a last release in the v3.0.x series.  v3.0.6 may 
well be it!

Please test v3.0.6rc1:

https://www.open-mpi.org/software/ompi/v3.0/

Changes since v3.0.5:

- Allow the user to override modulefile_path in the Open MPI SRPM,
  even if install_in_opt is set to 1.
- Properly detect ConnectX-6 HCAs in the openib BTL.
- Fix segfault in the MTL/OFI initialization for large jobs.
- Fix various portals4 control flow bugs.
- Fix communications ordering for alltoall and Cartesian neighborhood
  collectives.
- Fix an infinite recursion crash in the memory patcher on systems
  with glibc v2.26 or later (e.g., Ubuntu 18.04) when using certain
  OS-bypass interconnects.

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] [EXTERNAL] Git submodules are coming

2020-01-08 Thread Jeff Squyres (jsquyres) via devel
Good idea -- see https://github.com/open-mpi/ompi/pull/7286.


On Jan 8, 2020, at 10:10 AM, Thomas Naughton 
mailto:naught...@ornl.gov>> wrote:

Hi Jeff,

I'm not sure where the issue templates reside, but it might be useful to
add `git submodule status` to the list of commands when reporting issues. (Once 
the first submodule PR is merged)


beaker:$ git submodule status
b94e2617df3fd9a3e83c388fa1c691c0057a77e9 opal/mca/pmix/pmix4x/openpmix 
(v1.1.3-2128-gb94e261)
52d498811f19be5306bd55b8433024733d3b589a prrte (dev-30165-g52d4988)
beaker:$


--tjn


_
 Thomas Naughton  
naught...@ornl.gov
 Research Associate   (865) 576-4184


On Tue, 7 Jan 2020, Jeff Squyres (jsquyres) via devel wrote:

We now have two PRs pending that will introduce the use of Git submodules (and 
there are probably more such PRs on the way).  At least one of these first two 
PRs will likely be merged "Real Soon Now".

We've been talking about using Git submodules forever.  Now we're just about 
ready.

**
*** DEVELOPERS: THIS AFFECTS YOU!! ***
**

You cannot just "clone and build" any more:

-
git clone g...@github.com:open-mpi/ompi.git
cd ompi && ./autogen.pl && ./configure ...
-

You will *have* to initialize the Git submodule(s) -- either during or after 
the clone.  *THEN* you can build Open MPI.

Go read this wiki: https://github.com/open-mpi/ompi/wiki/GitSubmodules

May the force be with us!

--
Jeff Squyres
jsquy...@cisco.com




--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Git submodules are coming

2020-01-08 Thread Jeff Squyres (jsquyres) via devel
The hwloc git submodule just got merged 
(https://github.com/open-mpi/ompi/pull/6821).

A new age is upon us!

Be sure you read the wiki before you complain here about your builds being 
broken.  :-)

https://github.com/open-mpi/ompi/wiki/GitSubmodules




On Jan 8, 2020, at 9:04 AM, Ralph Castain via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Actually, I take that back - making a separate PR to change the opal/pmix 
embedded component to a submodule was way too painful. I simply added it to the 
existing #7202.


On Jan 7, 2020, at 1:33 PM, Ralph Castain via devel 
mailto:devel@lists.open-mpi.org>> wrote:

Just an FYI: there will soon be THREE PRs introducing submodules - I am 
breaking #7202 into two pieces. The first will replace opal/pmix with direct 
use of PMIx everywhere and replace the embedded pmix component with a submodule 
pointing to PMIx master, and the second will replace ORTE with PRRTE.


On Jan 7, 2020, at 9:02 AM, Jeff Squyres (jsquyres) via devel 
mailto:devel@lists.open-mpi.org>> wrote:

We now have two PRs pending that will introduce the use of Git submodules (and 
there are probably more such PRs on the way).  At least one of these first two 
PRs will likely be merged "Real Soon Now".

We've been talking about using Git submodules forever.  Now we're just about 
ready.

**
*** DEVELOPERS: THIS AFFECTS YOU!! ***
**

You cannot just "clone and build" any more:

-
git clone g...@github.com:open-mpi/ompi.git
cd ompi && ./autogen.pl && ./configure ...
-

You will *have* to initialize the Git submodule(s) -- either during or after 
the clone.  *THEN* you can build Open MPI.

Go read this wiki: https://github.com/open-mpi/ompi/wiki/GitSubmodules

May the force be with us!

--
Jeff Squyres
jsquy...@cisco.com







--
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] Git submodules are coming

2020-01-07 Thread Jeff Squyres (jsquyres) via devel
Good catches -- I updated the hwloc (to "hwloc2") bit, but I'll leave the 
"bar50" edits to Brian/Ralph, since they wrote that section.

Meaning: I *think* you're right, but I'm going to let them do it...  :-0


On Jan 7, 2020, at 4:47 PM, Brice Goglin via devel 
mailto:devel@lists.open-mpi.org>> wrote:


Thanks a lot for writing all this.


At the end 
https://github.com/open-mpi/ompi/wiki/GitSubmodules#adding-a-new-submodule-pointing-to-a-specific-commit
should "bar" be "bar50x" in line "$ git add bar" ?

It seems to me that you are in opal/mca/foo and the new submodule is in 
"bar50x" (according to "cd opal/mca/foo/bar50x" at the beginning).

There's also a "bar-50x" instead of "bar50x" in line "git submodule add --name 
bar-50x ...". Should the submodule name match the directory name?


By the way, in 
https://github.com/open-mpi/ompi/wiki/GitSubmodules#updating-the-commit-that-a-submodule-refers-to
you may want to rename hwloc201 into hwloc2 to avoid confusion and match the 
current PR.

Brice (who cannot edit the wiki :))



On 07/01/2020 at 18:02, Jeff Squyres (jsquyres) via devel wrote:

We now have two PRs pending that will introduce the use of Git submodules (and 
there are probably more such PRs on the way).  At least one of these first two 
PRs will likely be merged "Real Soon Now".

We've been talking about using Git submodules forever.  Now we're just about 
ready.

**
*** DEVELOPERS: THIS AFFECTS YOU!! ***
**

You cannot just "clone and build" any more:

-
git clone g...@github.com:open-mpi/ompi.git
cd ompi && ./autogen.pl && ./configure ...
-

You will *have* to initialize the Git submodule(s) -- either during or after 
the clone.  *THEN* you can build Open MPI.

Go read this wiki: https://github.com/open-mpi/ompi/wiki/GitSubmodules

May the force be with us!




--
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Git submodules are coming

2020-01-07 Thread Jeff Squyres (jsquyres) via devel
We now have two PRs pending that will introduce the use of Git submodules (and 
there are probably more such PRs on the way).  At least one of these first two 
PRs will likely be merged "Real Soon Now".

We've been talking about using Git submodules forever.  Now we're just about 
ready.

**
*** DEVELOPERS: THIS AFFECTS YOU!! ***
**

You cannot just "clone and build" any more:

-
git clone g...@github.com:open-mpi/ompi.git
cd ompi && ./autogen.pl && ./configure ...
-

You will *have* to initialize the Git submodule(s) -- either during or after 
the clone.  *THEN* you can build Open MPI.
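
For example, something like this should do it (sketch):

    # Option 1: initialize submodules while cloning
    git clone --recurse-submodules https://github.com/open-mpi/ompi.git

    # Option 2: initialize submodules in an existing clone
    cd ompi
    git submodule update --init --recursive

    # then build as usual
    ./autogen.pl && ./configure ...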

Go read this wiki: https://github.com/open-mpi/ompi/wiki/GitSubmodules

May the force be with us!

-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] PMIX ERROR: INIT spurious message on 3.1.5

2020-01-03 Thread Jeff Squyres (jsquyres) via devel
Is there a configure test we can add to make this kind of behavior the 
default?
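
(For reference, a sketch of applying the workaround Ralph describes below -- 
the param file lives under your install prefix, so the path here is just an 
example:)

    # persistent: add the setting to the PMIx MCA parameter file
    echo "gds = ^ds21" >> /usr/local/etc/pmix-mca-params.conf

    # or per-run, via the environment
    export PMIX_MCA_gds=^ds21
    mpirun -n 4 ./hello_c.exe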


> On Jan 1, 2020, at 11:50 PM, Marco Atzeri via devel 
>  wrote:
> 
> thanks Ralph
> 
> gds = ^ds21
> works as expected
> 
> Am 31.12.2019 um 19:27 schrieb Ralph Castain via devel:
>> PMIx likely defaults to the ds12 component - which will work fine but a tad 
>> slower than ds21. It is likely something to do with the way cygwin handles 
>> memory locks. You can avoid the error message by simply adding "gds = ^ds21" 
>> to your default MCA param file (the pmix one - should be named 
>> pmix-mca-params.conf).
>> Artem - any advice here?
>>> On Dec 25, 2019, at 9:56 AM, Marco Atzeri via devel 
>>>  wrote:
>>> 
>>> I have no multinode around for testing
>>> 
>>> I will need to setup one for testing after the holidays
>>> 
>>> Am 24.12.2019 um 23:27 schrieb Jeff Squyres (jsquyres):
>>>> That actually looks like a legit error -- it's failing to initialize a 
>>>> shared mutex.
>>>> I'm not sure what the consequence is of this failure, though, since the 
>>>> job seemed to run ok.
>>>> Are you able to run multi-node jobs ok?
>>>>> On Dec 22, 2019, at 1:20 AM, Marco Atzeri via devel 
>>>>>  wrote:
>>>>> 
>>>>> Hi Developers,
>>>>> 
>>>>> Cygwin 64bit, openmpi-3.1.5-1
>>>>> testing the cygwin package before releasing it
>>>>> I see never-before-seen spurious error messages that do not seem
>>>>> to be about errors at all:
>>>>> 
>>>>> $ mpirun -n 4 ./hello_c.exe
>>>>> [LAPTOP-82F08ILC:02395] PMIX ERROR: INIT in file 
>>>>> /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
>>>>>  at line 188
>>>>> [LAPTOP-82F08ILC:02395] PMIX ERROR: SUCCESS in file 
>>>>> /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/common/dstore/dstore_base.c
>>>>>  at line 2432
>>>>> Hello, world, I am 0 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>> 15, 2019, 116)
>>>>> Hello, world, I am 1 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>> 15, 2019, 116)
>>>>> Hello, world, I am 2 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>> 15, 2019, 116)
>>>>> Hello, world, I am 3 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>> 15, 2019, 116)
>>>>> [LAPTOP-82F08ILC:02395] [[20101,0],0] unable to open debugger attach fifo
>>>>> 
>>>>> Is there a known workaround?
>>>>> I have not found anything on the issue list.
>>>>> 
>>>>> Regards
>>>>> Marco


-- 
Jeff Squyres
jsquy...@cisco.com



Re: [OMPI devel] PMIX ERROR: INIT spurious message on 3.1.5

2019-12-24 Thread Jeff Squyres (jsquyres) via devel
That actually looks like a legit error -- it's failing to initialize a shared 
mutex.

I'm not sure what the consequence is of this failure, though, since the job 
seemed to run ok.

Are you able to run multi-node jobs ok?
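
(Something along these lines, run across two hosts, would confirm that -- the 
hostnames are placeholders:)

    mpirun -n 2 --host node1,node2 ./hello_c.exe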


> On Dec 22, 2019, at 1:20 AM, Marco Atzeri via devel 
>  wrote:
> 
> Hi Developers,
> 
> Cygwin 64bit, openmpi-3.1.5-1
> testing the cygwin package before releasing it
> I see never-before-seen spurious error messages that do not seem
> to be about errors at all:
> 
> $ mpirun -n 4 ./hello_c.exe
> [LAPTOP-82F08ILC:02395] PMIX ERROR: INIT in file 
> /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
>  at line 188
> [LAPTOP-82F08ILC:02395] PMIX ERROR: SUCCESS in file 
> /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/common/dstore/dstore_base.c
>  at line 2432
> Hello, world, I am 0 of 4, (Open MPI v3.1.5, package: Open MPI 
> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 15, 
> 2019, 116)
> Hello, world, I am 1 of 4, (Open MPI v3.1.5, package: Open MPI 
> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 15, 
> 2019, 116)
> Hello, world, I am 2 of 4, (Open MPI v3.1.5, package: Open MPI 
> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 15, 
> 2019, 116)
> Hello, world, I am 3 of 4, (Open MPI v3.1.5, package: Open MPI 
> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 15, 
> 2019, 116)
> [LAPTOP-82F08ILC:02395] [[20101,0],0] unable to open debugger attach fifo
> 
> Is there a known workaround?
> I have not found anything on the issue list.
> 
> Regards
> Marco


-- 
Jeff Squyres
jsquy...@cisco.com



[OMPI devel] Cisco's MTT testing to be shut down for the holidays

2019-12-20 Thread Jeff Squyres (jsquyres) via devel
Cisco will be disabling its MTT testing this evening for our holiday 
company-wide shutdown.

I'll resume our MTT testing in January.

-- 
Jeff Squyres
jsquy...@cisco.com



  1   2   3   4   5   6   7   8   9   10   >