[OMPI devel] Tomorrow (7/4/2023) Open-MPI developer call cancelled.

2023-07-03 Thread Geoffrey Paulsen via devel
Tomorrow’s Open MPI call (7/4/2023) is cancelled due to the U.S. Holiday.


[OMPI devel] Open MPI v5.0.0 release timeline delay

2022-11-07 Thread Geoffrey Paulsen via devel
Open MPI developers,

I’ve got some bad news regarding the OMPI v5.0.0 release timeframe.  IBM has asked 
Austen and me (and our team) to focus 100% on another project for the next two 
full weeks.

  Open MPI v5.0.x still has a few remaining blocking items, including 
documentation, the PRRTE 3.0 release, some collective performance 
data/marketing/messaging, and a few platform-specific bugs (see: 
https://github.com/open-mpi/ompi/projects/3).

  For these reasons (along with Supercomputing and the holidays), the Open 
MPI v5.0 RMs feel that January 2023 is a more realistic timeframe for 
release.

Thank you for your understanding.

The Open MPI v5.0.x Release Managers:
   - Tomislav Janjusic, nVidia
   - Austen Lauria, IBM
   - Geoff Paulsen, IBM


[OMPI devel] Discuss HAN / Adapt performance (part 2)

2022-08-18 Thread Geoffrey Paulsen via devel
Good meeting today on HAN / Adapt performance.
Joseph is going to run some more numbers.  We will finalize our discussion 
next week.

Same time, different Webex Info:

Agenda:
o Discuss Joseph's new results of HAN run with newly-tuned coll_tune.
o Come to consensus about plans for v5.0.0 release.

Discuss HAN / Adapt performance (part 2)
Hosted by Geoff Paulsen

https://ibm.webex.com/ibm/j.php?MTID=m69e2f10a6293a0b3a6eb81084779294c
Thursday, Aug 25, 2022 2:00 pm | 1 hour | (UTC-05:00) Central Time (US & Canada)
Meeting number: 145 719 0018
Password: 7VEiuBMMM63 (78348266 from phones and video systems)
Agenda:
o Discuss Joseph's new results of HAN run with newly-tuned coll_tune.
o Come to consensus about plans for v5.0.0 release.

Join by video system
Dial 1457190...@ibm.webex.com
You can also dial 173.243.2.68 and enter your meeting number.

Join by phone
1-844-531-0958 United States Toll Free
1-669-234-1178 United States Toll

Access code: 145 719 0018

--
Geoff Paulsen
IBM Spectrum MPI Engineer
He/His


Re: [OMPI devel] Find a time in next few days to discuss Han/Adapt.

2022-08-16 Thread Geoffrey Paulsen via devel
August 18th, from 2-3pm Central US time, is the winner.  Here’s the Webex info:

Open MPI - Discuss HAN / Adapt performance
Hosted by Geoff Paulsen

https://ibm.webex.com/ibm/j.php?MTID=m330b3d97ef8828d8b7dae0fa7105d47e
Thursday, Aug 18, 2022 2:00 pm | 1 hour | (UTC-05:00) Central Time (US & Canada)
Meeting number: 145 206 0471
Password: puZSuEmc768 (78978362 from phones and video systems)
Agenda:  Discuss performance results of various platforms for HAN and Adapt 
collective components, with the aim of deciding if we can make them default on 
and at what priority.

Required Reading:
- Discussion of Making coll HAN and Adapt Default for v5.0.0: Issue #10347
- Improvements for coll HAN: Issue #10438
- older large message perf numbers: Issue #9062
- Discussed in Weekly Telcon last month, see OMPI Wiki for WeeklyTelcon_20220614 
  and WeeklyTelcon_2022062

Join by video system
Dial 1452060...@ibm.webex.com
You can also dial 173.243.2.68 and enter your meeting number.

Join by phone
1-844-531-0958 United States Toll Free
1-669-234-1178 United States Toll

Access code: 145 206 0471

--
Geoff Paulsen
IBM Spectrum MPI Engineer
He/His


Re: [OMPI devel] Find a time in next few days to discuss Han/Adapt.

2022-08-10 Thread Geoffrey Paulsen via devel
Resending new Doodle (https://doodle.com/meeting/participate/id/e76ENl1d) with 
times next week.
Sorry for the inconvenience.

Please add yourself, and select times you’re available.  We’ll decide on
Tuesday’s Web-ex and reply to this email with final time / dial-in info.

Description:
Discuss performance results of various platforms for HAN and Adapt collective
components, with the aim of deciding if we can make them default on and at what
priority.

Required Reading:
- Discussion of Making coll HAN and Adapt Default for v5.0.0: Issue #10347
- Improvements for coll HAN: Issue #10438
- older large message perf numbers: Issue #9062
- Discussed in Weekly Telcon last month, see OMPI Wiki for
WeeklyTelcon_20220614 and WeeklyTelcon_2022062
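
For anyone who wants to gather numbers ahead of the discussion, here is a hedged
sketch of forcing the HAN and Adapt components on for a test run (MCA parameter
names follow the usual coll framework naming; verify them with ompi_info on your
build, and osu_allreduce is just an illustrative benchmark binary):

    $ mpirun --mca coll_han_priority 100 --mca coll_adapt_priority 100 \
             -np 64 ./osu_allreduce
    # Inspect the available HAN / Adapt parameters and their current values:
    $ ompi_info --param coll han --level 9
    $ ompi_info --param coll adapt --level 9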


--
Geoff Paulsen
IBM Spectrum MPI Engineer
He/His


[OMPI devel] Find a time in next few days to discuss Han/Adapt.

2022-08-10 Thread Geoffrey Paulsen via devel
I’ve created a doodle here: https://doodle.com/meeting/participate/id/axGpPgre

Please add yourself, and select times you’re available.  We’ll decide at Noon 
Eastern time on Thursday and reply to this email with the final time.  (I think 
we can also update the Doodle at that time to reflect the best time, so you can 
check back there as well.)

Description:
Discuss performance results of various platforms for HAN and Adapt collective 
components, with the aim of deciding if we can make them default on and at what 
priority.

Required Reading:
- Discussion of Making coll HAN and Adapt Default for v5.0.0: Issue #10347
- Improvements for coll HAN: Issue #10438
- older large message perf numbers: Issue #9062
- Discussed in Weekly Telcon last month, see OMPI Wiki for 
WeeklyTelcon_20220614 and WeeklyTelcon_2022062

--
Geoff Paulsen
IBM Spectrum MPI Engineer
He/His


[OMPI devel] Open MPI v5.0.0rc3 available for testing

2022-03-08 Thread Geoffrey Paulsen via devel
Open MPI v5.0.0rc3 is now available for testing 
(https://www.open-mpi.org/software/ompi/v5.0/).

Please test and send feedback either to users@lists.open-mpi.org 
or create an issue at https://github.com/open-mpi/ompi/issues/.

See https://github.com/open-mpi/ompi/blob/v5.0.x/NEWS for changes in rc3.

Thank you,
v5.0 Release Managers



[OMPI devel] OMPI -> PRRTE MCA parameter changes in Open MPI v5.0

2021-11-09 Thread Geoffrey Paulsen via devel
Open MPI developers,
  The v5.0 RMs would like to solicit opinions to come to consensus around our three independent MCA frameworks in Open MPI v5.0.x.
  As you know, Open MPI v5.0.x has abstracted the runtime away to use the Open PMIx Reference Runtime Environment (PRRTE) implementation.
  In doing so, many MCA parameters are now different, as they're read in by PRRTE.
  The problem is that this affects some very common options, for example `--oversubscribe`, which is now "--map-by :OVERSUBSCRIBE".  prterun prints a nice warning message, but that is not true for many MCA parameters that may never get read in at ALL in Open MPI v5.0.x, where they were accepted and used in earlier versions.
  This means that users will need to use both ompi_info and prte_info to understand how to translate their MCA parameters, and furthermore there is no safety net if an old parameter is not read in with the new Open MPI v5.0.x releases.
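  As a concrete illustration (a hedged sketch; the exact spellings should be verified against the v5.0 man pages and prte_info output, and <param> is just a placeholder):

    # Open MPI v4.x:
    $ mpirun --oversubscribe -np 8 ./a.out
    # Open MPI v5.0.x (PRRTE syntax):
    $ mpirun --map-by :OVERSUBSCRIBE -np 8 ./a.out

    # Finding where a parameter now lives requires checking both projects
    # (prte_info's option syntax is assumed here to mirror ompi_info's):
    $ ompi_info --param all all --level 9 | grep <param>
    $ prte_info --param all all | grep <param>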
  If someone is interested in writing an aliasing system that might help users adopt Open MPI, that would be FANTASTIC.  However, it doesn't help educate users for the long term.
  For this reason, the Open MPI v5.0 Release Managers are recommending that we document this loudly on the website for Open MPI v5.0.0.
  What are others' thoughts?  Let's discuss on the devel mailing list.
  Let's try to come to consensus by Nov 15th.

Thanks,
The Open MPI v5.0 Release Managers,
Tommy, Austen, Geoff



[OMPI devel] Help us find good times for design discussions.

2021-07-05 Thread Geoffrey Paulsen via devel
Open MPI developers,
 
   We have not been able to meet together face to face for quite some time.  We'd like to schedule a few 2-hour blocks for detailed discussions on topics of interest.
  Please fill out https://doodle.com/poll/rd7szze3agmyq4m5?utm_source=poll&utm_medium=link, include your name and time blocks that might work for you.  Also please add any agenda items to discuss on our wiki page here: https://github.com/open-mpi/ompi/wiki/Meeting-2021-07
 
  Thanks,
  Geoff Paulsen



[OMPI devel] Open MPI virtual Design Sessions

2021-07-05 Thread Geoffrey Paulsen (via Doodle) via devel
Hi,

Geoffrey Paulsen (gpaul...@us.ibm.com) removed you from the Doodle
poll Open MPI virtual Design Sessions.


Best wishes,
The Doodle Team


Doodle AG, Werdstrasse 21, 8021 Zürich



[OMPI devel] Invitation: Open MPI virtual Design Sessions

2021-07-05 Thread Geoffrey Paulsen (via Doodle) via devel
Hi,

Geoffrey Paulsen (gpaul...@us.ibm.com) invites you to participate in
the Doodle poll Open MPI virtual Design Sessions.

Please indicate the 2 hour blocks that work for a few virtual design
discussions.

Participate now
https://link.e.doodle.com/ls/click?upn=tj06F74K67F9jS-2B7bRMCnoBLLqs9GAYYN-2FoLvL7mXuIdJPlLIvAdt43Sa6HLCVNfNGK516I7FpQb195KRhxIjCbTZNbZ6p81O2MaJTMJfKy0Tx0WykPwfhI0Ez1ZpPQ3fLtrnXJOQtZ4DopSPxVS5O7Yc5-2BhJov6AiTg8aECvex0I06NRkH8peHvi037VpubKIpIimuGhJDXifl6N-2BzvmnYY0azRQs2ZoXCl4rf40n-2F-2FuaIayrC1Ox9SGpgGYOsW1BXQUiRMhYaf1pRV3BkmgN-2BrnmPpLzpLRNGcPLEEz1A-3DRABV_3yVA0AJkLK9RgkvZQZCJdIDWRSX-2Fy3t1wIW8fn9raVuASVuvt9jqLSramijMme693OlHibsOcVVFhpEKR-2B4HqVwomBPiX9LlenriCslAvKTvCU0bZEGaaj6ux-2FlrGpJTtRW2b7Fgsx1iHewpWWiMqGr0uo8fpfze5BVFUPt43LG6gcrlvalYT4hdQUqnbGi3-2Bh191QCIB8KR0x5qQ1Jz1ztz-2BkoczzACMLg6XtA3BQV8J7kiv7GC2Ed-2Fj-2BwjpeM-2FqStRKStYeOaD5Dif5WsC1Yn-2FJ022Qxcs0ZczDAPmciY5IfE457ninZ2UGnFlVLAsZ8rS87UCi9ogbLFnywW-2FxGu1zcgphwrl5weU6E42szCBXTOBr2WaZmv-2BQbgQGlPxq6EofaAZAfE4oqGQgay2xXl4D27su7fr8SPg8PWWw02xrTxAMfiV1otxX1PTBwmG7dRuS1DT2tjYkTlvOS839w-3D-3D


--


Best wishes,
The Doodle Team


Doodle AG, Werdstrasse 21, 8021 Zürich


[OMPI devel] Open MPI v5.0.x branch created

2021-03-11 Thread Geoffrey Paulsen via devel
Open MPI developers,

We've created the Open MPI v5.0.x branch today, and are receiving bugfixes.  Please cherry-pick any master PRs to v5.0.x once they've been merged to master.
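A hedged sketch of that cherry-pick workflow (branch and SHA names are illustrative only):

    $ git fetch origin
    $ git checkout -b my-fix-v5.0.x origin/v5.0.x
    $ git cherry-pick <sha-of-the-commit-merged-to-master>
    $ git push <your-fork> my-fix-v5.0.x     # then open a PR against v5.0.x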
We're targeting an aggressive but achievable release date of May 15th.  If you're in charge of your organization's CI tests, please enable them for v5.0.x PRs.  It may be a few days until all of our CI is enabled on v5.0.x.

Thanks everyone for your continued commitment to Open MPI's success.

Josh Ladd, Austen Lauria, and Geoff Paulsen - v5.0 RMs
 



[OMPI devel] Open MPI v4.0.6 rc1 available for testing.

2020-12-14 Thread Geoffrey Paulsen via devel
The first release candidate for Open-MPI v4.0.6 (rc1) is now available for testing:
https://www.open-mpi.org/software/ompi/v4.0/
 
Some fixes include:
    - Update embedded PMIx to 3.2.2.  This update addresses several MPI_COMM_SPAWN problems.
    - Fix a symbol name collision when using the Cray compiler to build Open SHMEM.  Thanks to Pak Lui for reporting and fixing.
    - Correct an issue encountered when building Open MPI under OSX Big Sur.  Thanks to FX Coudert for reporting.
    - Various fixes to the OFI MTL.
    - Fix an issue with allocation of sufficient memory for parsing long environment variable values.  Thanks to @zrss for reporting.
    - Improve reproducibility of builds to assist Open MPI packages.  Thanks to Bernhard Wiedmann for bringing this to our attention.
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com



[OMPI devel] Announcing Open-MPI v4.0.5rc2 available for testing

2020-08-20 Thread Geoffrey Paulsen via devel
Open MPI v4.0.5rc2 is now available for download and test at: https://www.open-mpi.org/software/ompi/v4.0/
 
Please test and give feedback soon.
 
Thanks!
The Open-MPI Team



[OMPI devel] Open MPI v4.0.5rc1 available for test.

2020-07-15 Thread Geoffrey Paulsen via devel
Announcing Open-MPI v4.0.5rc1 available for download and test at https://www.open-mpi.org/software/ompi/v4.0/
Please test and send feedback to devel@lists.open-mpi.org
Fixed in v4.0.5: When launching under SLURM's srun, Open MPI will honor SLURM's binding policy even if that would leave the processes unbound.
4.0.5 -- July, 2020
- Disable binding of MPI processes to system resources by Open MPI if an application is launched using SLURM's srun command.
- Disable building of the Fortran mpi_f08 module when configuring Open MPI with default 8 byte Fortran integer size.  Thanks to @ahcien for reporting.
- Fix a problem with mpirun when the --map-by option is used.  Thanks to Wenbin Lyu for reporting.
- Fix some issues with MPI one-sided operations uncovered using Global Arrays regression test-suite.  Thanks to @bjpalmer for reporting.
- Fix a problem with make check when using the PGI compiler.  Thanks to Carl Ponder for reporting.
- Fix a problem with MPI_FILE_READ_AT_ALL that could lead to application hangs under certain circumstances.  Thanks to Scot Breitenfeld for reporting.
- Fix a problem building C++ applications with newer versions of GCC.  Thanks to Constantine Khrulev for reporting.



[OMPI devel] Announcing Open MPI v4.0.4rc2

2020-06-01 Thread Geoffrey Paulsen via devel
Open MPI v4.0.4rc2 is now available for download and test at: https://www.open-mpi.org/software/ompi/v4.0/

Changes from v4.0.4rc1 include:
- OPAL/UCX: enabling new API provided by UCX
- event/external: Fix typo in LDFLAGS vs LIBS var before check
- Updating README to include WARNING about ABI break
- Add checks for libevent.so conflict with LSF
- Move from legacy -levent to recommended -levent_core
- Correct typo in mapping-too-low* help messages
- A slightly stronger check for LSF's libevent
- Fix LSF configure check for libevent conflict
- VERSION -> v4.0.4rc2
- update NEWS for 4.0.4rc2
- sys limits: fixed soft limit setting if it is less than hard limit
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com



[OMPI devel] Open-MPI v5.0 branch date pushed back to May 14th

2020-04-28 Thread Geoffrey Paulsen via devel
Open MPI Developers,

At today's web-ex we've decided to push back the date for branching Open-MPI v5.0 from master to May 14th.  We're still targeting June 30th as the release date (see the v5.0.0 milestone: https://github.com/open-mpi/ompi/milestone/37).  If possible, we're still interested in having all new feature pull requests posted to https://github.com/open-mpi/ompi by this Thursday, to allow ample time to review, discuss, and possibly iterate on them.
If you find yourself with a few free cycles, please pick up an open pull request, chime in, and review it!  If you find yourself with some extra computer cycles, please head over to https://github.com/open-mpi/ompi-collectives-tuning and gather some collective tuning data for your system.  Amazon AWS is collating that data to update our collective tuning values, and your input is necessary.

Thanks,
Geoff Paulsen



[OMPI devel] GitHub v4.0.2 tag is broken

2020-04-02 Thread Geoffrey Paulsen via devel
Ben,

Oops, looks like I may have pushed a v4.0.2 branch around March 10th.  Fortunately the v4.0.2 tag is fine and unaltered.  I've deleted the v4.0.2 branch.  Thanks for bringing this to our attention.

Geoff Paulsen



[OMPI devel] Please test Open MPI v4.0.3rc4

2020-02-27 Thread Geoffrey Paulsen via devel
Open MPI v4.0.3rc4 has been posted to https://www.open-mpi.org/software/ompi/v4.0/.
Please test this on your systems, as it's likely to become v4.0.3.
4.0.3 -- March, 2020
---
- Update embedded PMIx to 3.1.5
- Add support for Mellanox ConnectX-6.
- Fix an issue in OpenMPI IO when using shared file pointers.
  Thanks to Romain Hild for reporting.
- Fix a problem with Open MPI using a previously installed
  Fortran mpi module during compilation.  Thanks to Marcin
  Mielniczuk for reporting.
- Fix a problem with Fortran compiler wrappers ignoring use of
  disable-wrapper-runpath configure option.  Thanks to David
  Shrader for reporting.
- Fixed an issue with trying to use mpirun on systems where neither
  ssh nor rsh is installed.
- Address some problems found when using XPMEM for intra-node message
  transport.
- Improve dimensions returned by MPI_Dims_create for certain
  cases.  Thanks to @aw32 for reporting.
- Fix an issue when sending messages larger than 4GB.  Thanks to
  Philip Salzmann for reporting this issue.
- Add ability to specify alternative module file path using
  Open MPI's RPM spec file.  Thanks to @jschwartz-cray for reporting.
- Clarify use of --with-hwloc configuration option in the README.
  Thanks to Marcin Mielniczuk for raising this documentation issue.
- Fix an issue with shmem_atomic_set.  Thanks to Sameh Sharkawi for reporting.
- Fix a problem with MPI_Neighbor_alltoall(v,w) for cartesian communicators
  with cyclic boundary conditions.  Thanks to Ralph Rabenseifner and
  Tony Skjellum for reporting.
- Fix an issue using Open MPIO on 32 bit systems.  Thanks to
  Orion Poplawski for reporting.
- Fix an issue with NetCDF test deadlocking when using the vulcan
  Open MPIO component.  Thanks to O

Re: [OMPI devel] v4.0.3rc3 ready for testing

2020-01-31 Thread Geoffrey Paulsen via devel
Thanks so much for testing.  If further testing reveals anything, please create an issue at https://github.com/open-mpi/ompi/.
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com
 
 
- Original message -
From: "Heinz, Michael William"
To: Open MPI Developers
Cc: Geoffrey Paulsen
Subject: [EXTERNAL] RE: [OMPI devel] v4.0.3rc3 ready for testing
Date: Fri, Jan 31, 2020 11:36 AM
I’ve run the 3.1.6rc2 and 4.0.3rc3 src rpms through some smoke tests and they both built and ran properly on RHEL 8.
 
From: devel On Behalf Of Geoffrey Paulsen via devel
Sent: Wednesday, January 29, 2020 7:03 PM
To: devel@lists.open-mpi.org
Cc: Geoffrey Paulsen
Subject: [OMPI devel] v4.0.3rc3 ready for testing
 
Please test v4.0.3rc3:
   https://www.open-mpi.org/software/ompi/v4.0/
 
Changes since v4.0.2 include:
 
  4.0.3 -- January, 2020   
- Add support for Mellanox Connectx6.
- Fix a problem with Fortran compiler wrappers ignoring use of disable-wrapper-runpath configure option.  Thanks to David Shrader for reporting.
- Fixed an issue with trying to use mpirun on systems where neither ssh nor rsh is installed.
- Address some problems found when using XPMEM for intra-node message transport.
- Improve dimensions returned by MPI_Dims_create for certain cases.  Thanks to @aw32 for reporting.
- Fix an issue when sending messages larger than 4GB.  Thanks to Philip Salzmann for reporting this issue.
- Add ability to specify alternative module file path using Open MPI's RPM spec file.  Thanks to @jschwartz-cray for reporting.
- Clarify use of --with-hwloc configuration option in the README.  Thanks to Marcin Mielniczuk for raising this documentation issue.
- Fix an issue with shmem_atomic_set.  Thanks to Sameh Sharkawi for reporting.
- Fix a problem with MPI_Neighbor_alltoall(v,w) for cartesian communicators with cyclic boundary conditions.  Thanks to Ralph Rabenseifner and Tony Skjellum for reporting.
- Fix an issue using Open MPIO on 32 bit systems.  Thanks to Orion Poplawski for reporting.
- Fix an issue with NetCDF test deadlocking when using the vulcan Open MPIO component.  Thanks to Orion Poplawski for reporting.
- Fix an issue with the mpi_yield_when_idle parameter being ignored when set in the Open MPI MCA parameter configuration file.  Thanks to @iassiour for reporting.
- Address an issue with Open MPIO when writing/reading more than 2GB in an operation.  Thanks to Richard Warren for reporting.
 
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com
 
 



[OMPI devel] v4.0.3rc3 ready for testing

2020-01-29 Thread Geoffrey Paulsen via devel
Please test v4.0.3rc3:
   https://www.open-mpi.org/software/ompi/v4.0/
 
Changes since v4.0.2 include:
 
  4.0.3 -- January, 2020  
- Add support for Mellanox Connectx6.
- Fix a problem with Fortran compiler wrappers ignoring use of disable-wrapper-runpath configure option.  Thanks to David Shrader for reporting.
- Fixed an issue with trying to use mpirun on systems where neither ssh nor rsh is installed.
- Address some problems found when using XPMEM for intra-node message transport.
- Improve dimensions returned by MPI_Dims_create for certain cases.  Thanks to @aw32 for reporting.
- Fix an issue when sending messages larger than 4GB.  Thanks to Philip Salzmann for reporting this issue.
- Add ability to specify alternative module file path using Open MPI's RPM spec file.  Thanks to @jschwartz-cray for reporting.
- Clarify use of --with-hwloc configuration option in the README.  Thanks to Marcin Mielniczuk for raising this documentation issue.
- Fix an issue with shmem_atomic_set.  Thanks to Sameh Sharkawi for reporting.
- Fix a problem with MPI_Neighbor_alltoall(v,w) for cartesian communicators with cyclic boundary conditions.  Thanks to Ralph Rabenseifner and Tony Skjellum for reporting.
- Fix an issue using Open MPIO on 32 bit systems.  Thanks to Orion Poplawski for reporting.
- Fix an issue with NetCDF test deadlocking when using the vulcan Open MPIO component.  Thanks to Orion Poplawski for reporting.
- Fix an issue with the mpi_yield_when_idle parameter being ignored when set in the Open MPI MCA parameter configuration file.  Thanks to @iassiour for reporting.
- Address an issue with Open MPIO when writing/reading more than 2GB in an operation.  Thanks to Richard Warren for reporting.
 
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com



[OMPI devel] Open MPI v4.0.3rc1 ready for testing.

2020-01-19 Thread Geoffrey Paulsen via devel
Please test v4.0.3rc1:
   https://www.open-mpi.org/software/ompi/v4.0/
 
Changes since v4.0.2 include:
 
  4.0.3 -- January, 2020
  - Add support for Mellanox Connectx6.
  - Improve dimensions returned by MPI_Dims_create for certain cases.  Thanks to @aw32 for reporting.
  - Fix an issue when sending messages larger than 4GB.  Thanks to Philip Salzmann for reporting this issue.
  - Add ability to specify alternative module file path using Open MPI's RPM spec file.  Thanks to @jschwartz-cray for reporting.
  - Clarify use of --with-hwloc configuration option in the README.  Thanks to Marcin Mielniczuk for raising this documentation issue.
  - Fix an issue with shmem_atomic_set.  Thanks to Sameh Sharkawi for reporting.
  - Fix a problem with MPI_Neighbor_alltoall(v,w) for cartesian communicators with cyclic boundary conditions.  Thanks to Ralph Rabenseifner and Tony Skjellum for reporting.
  - Fix an issue using Open MPIO on 32 bit systems.  Thanks to Orion Poplawski for reporting.
  - Fix an issue with NetCDF test deadlocking when using the vulcan Open MPIO component.  Thanks to Orion Poplawski for reporting.
  - Fix an issue with the mpi_yield_when_idle parameter being ignored when set in the Open MPI MCA parameter configuration file.  Thanks to @iassiour for reporting.
  - Address an issue with Open MPIO when writing/reading more than 2GB in an operation.  Thanks to Richard Warren for reporting.
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com



[OMPI devel] Open MPI 4.0.2rc3 available for testing

2019-09-30 Thread Geoffrey Paulsen via devel
The third (and possibly final) release candidate for the Open MPI v4.0.2 release is posted at
https://www.open-mpi.org/software/ompi/v4.0/
Fixes since 4.0.2rc2 include:

- Silent failure of OMPI over OFI with large message sizes.
- Conform MPIR_Breakpoint to MPIR standard.
- btl/vader: when using single-copy emulation, fragment large rdma.
- restore compilation of spml/ikrit.


Our goal is to release 4.0.2 in the next week.  All testing and feedback is appreciated.


Thanks,

your Open MPI release team




[OMPI devel] Anyone have any thoughts about cache-alignment issue in osc/sm?

2019-09-12 Thread Geoffrey Paulsen via devel
Does anyone have any thoughts about the cache-alignment issue in osc/sm, reported in https://github.com/open-mpi/ompi/issues/6950?


[OMPI devel] Incorrect minimum versions specified in Open-MPI's autogen.pl

2019-07-23 Thread Geoffrey Paulsen via devel
Open MPI's autogen.pl has not had its minimum tool versions updated since sometime between v1.10 and v2.0.0.  My colleague and I ran into this today when investigating incorrect CFLAGS being used.  Jeff Squyres was able to root-cause this as our use of an old automake.  Open MPI publishes its minimum tool version requirements here: https://www.open-mpi.org/source/building.php

To help give this issue more visibility (i.e., a fatal error at autogen.pl time, rather than incorrect CFLAGS and who knows what other symptoms), I've created PR 6837 (https://github.com/open-mpi/ompi/pull/6837/).
 
All of the active release branches (v3.0.x, v3.1.x, and v4.0.x) will also need this PR applied, if we take it.

Since some of these tools' versions are still newer than what ships in common Linux distros, I wanted to communicate that this PR may cause some developers a level of pain.  Please review the (very short) PR and discuss whether this is the direction the Open MPI community would like to go.
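For developers who want to sanity-check their toolchain before running autogen.pl, a minimal sketch (compare the reported versions against the minimums published on the building.php page above; the commands below only print versions and are not specific to Open MPI):

    $ m4 --version       | head -1
    $ autoconf --version | head -1
    $ automake --version | head -1
    $ libtool --version  | head -1
    $ flex --version     | head -1
    $ ./autogen.pl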
Thanks,
Geoff Paulsen
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com


[OMPI devel] Announcing Open MPI v4.0.0rc5

2018-10-23 Thread Geoffrey Paulsen
Announcing (hopefully our last) RC5 of Open MPI v4.0.0.
Available at: https://www.open-mpi.org/software/ompi/v4.0/
Differences in v4.0.0rc5 from v4.0.0rc4:
* Fix race condition in btl/vader when writing header
* Fix a double free error when using hostfile
* Fix configury for internal PMIx
* Ignore --with-foo=external arguments in subdirs
* Remove C99-style comments in mpi.h
* Fix race condition in opal_free_list_get.  Fixes #2921
* Fix hang/timeout during finalize in osc/ucx
* Fixed zero-size window processing in osc/ucx
* Fix return code from mca_pml_ucx_init()
* Add worker flush before osc/ucx module free
* Btl/uct bugfixes and code cleanup.  Fixes Issues #5820, #5821
* Fix javadoc build failure with OpenJDK 11
* Add ompi datatype attribute to release ucp_datatype in pml/ucx
* Squash a bunch of harmless compiler warnings
* Fortran/use-mpi-f08: Correct f08 routine signatures
* Fortran: add CHARACTER and LOGICAL support to MPI_Sizeof()
* Mpiext/pcollreq: Correct f08 routine signatures
* Make dist: Add missing file to tarball
* Disabled openib/verbs
* Removed components: pml/bfo, crs/blcr, crs/criu and crs/dmtcp
 
 


[OMPI devel] IBM CI re-enabled.

2018-10-18 Thread Geoffrey Paulsen
 
I've re-enabled IBM CI for PRs.
 


[OMPI devel] I Shut down IBM CI last night

2018-10-18 Thread Geoffrey Paulsen
Devel,

I shut down IBM CI last night to upgrade our UCX and IB drivers.  Still tinkering, but it should be online again in < 1 hour.
 
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Email: gpaul...@us.ibm.com


[OMPI devel] Announcing Open MPI v4.0.0rc4

2018-10-08 Thread Geoffrey Paulsen
Announcing Open MPI v4.0.0rc4.
Please download from https://www.open-mpi.org/software/ompi/v4.0/ and provide feedback from your favorite platforms.
 
Changes from rc3 include:
PR #5780 - Fortran 08 bindings fixes
    fortran/use-mpi-f08: Corrections to PMPI signatures of collectives interface to state correct intent for inout arguments and use the ASYNCHRONOUS attribute in non-blocking collective calls.
PR #5834 - 2 more vader fixes
    1. Issue #5814 - work around Oracle C v5.15 compiler bug
    2. ensure the fast box tag is always read first
PR #5794 - mtl ofi: Change from opt-in to opt-out provider selection
PR #5802 - mpi.h: remove MPI_UB/MPI_LB when not enabling MPI-1 compat
PR #5823 - btl/tcp: output the IP address correctly
PR #5826 - TCP BTL socklen fixes Issue #3035
PR #5790 - shmem/lock: progress communications while waiting for shmem_lock
PR #5791 - OPAL/COMMON/UCX: used __func__ macro instead of __FUNCTION__
 


[OMPI devel] Announcing Open MPI v4.0.0rc3

2018-09-28 Thread Geoffrey Paulsen
The Third release candidate for Open MPI v4.0.0 (rc3) has been built and is available at:
https://www.open-mpi.org/software/ompi/v4.0/
Only one NEWS-worthy difference from v4.0.0rc2:
- Fix a problem with ORTE not reporting error messages if an application terminated normally but exited with non-zero error code.  Thanks to Emre Brookes for reporting.

- OSHMEM updated to the OpenSHMEM 1.4 API.
- Do not build Open SHMEM layer when there are no SPMLs available.  Currently, this means the Open SHMEM layer will only build if a MXM or UCX library is found.
- A UCX BTL was added for enhanced MPI RMA support using UCX.
- With this release, the OpenIB BTL now only supports iWarp and RoCE by default.
- Updated internal HWLOC to 2.0.1.
- Updated internal PMIx to 3.0.1.
- Change the priority for selecting external versus internal HWLOC and PMIx packages to build.  Starting with this release, configure by default selects available external HWLOC and PMIx packages over the internal ones.
- Updated internal ROMIO to 3.2.1.
- Removed support for the MXM MTL.
- Removed support for SCIF.
- Improved CUDA support when using UCX.
- Enable use of CUDA allocated buffers for OMPIO.
- Improved support for two phase MPI I/O operations when using OMPIO.
- Added support for Software-based Performance Counters, see
  https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI
- Various improvements to MPI RMA performance when using RDMA capable interconnects.
- Update memkind component to use the memkind 1.6 public API.
- Fix problems with use of newer map-by mpirun options.  Thanks to Tony Reina for reporting.
- Fix rank-by algorithms to properly rank by object and span.
- Allow for running as root if two environment variables are set.  Requested by Axel Huebl.
- Fix a problem with building the Java bindings when using Java 10.  Thanks to Bryce Glover for reporting.
- Fix a problem with ORTE not reporting error messages if an application terminated normally but exited with non-zero error code.  Thanks to Emre Brookes for reporting.


[OMPI devel] Open MPI web-ex now.

2018-09-25 Thread Geoffrey Paulsen
web-ex: https://cisco.webex.com/ciscosales/j.php?MTID=m94bcdafd80c2e40b480b2c97c702293a
 
 


[OMPI devel] Announcing Open MPI v4.0.0rc2

2018-09-22 Thread Geoffrey Paulsen
The second release candidate for the Open MPI v4.0.0 release has been built and will be available tonight at:
https://www.open-mpi.org/software/ompi/v4.0/
 
Major differences from v4.0.0rc1 include: 
- Removed support for SCIF.
- Enable use of CUDA allocated buffers for OMPIO.
- Fix a problem with ORTE not reporting error messages if an application terminated normally but exited with non-zero error code.  Thanks to Emre Brookes for reporting.
All major differences from v3.1.x include:

 
- OSHMEM updated to the OpenSHMEM 1.4 API.
- Do not build Open SHMEM layer when there are no SPMLs available.  Currently, this means the Open SHMEM layer will only build if a MXM or UCX library is found.
- A UCX BTL was added for enhanced MPI RMA support using UCX.
- With this release, the OpenIB BTL now only supports iWarp and RoCE by default.
- Updated internal HWLOC to 2.0.1.
- Updated internal PMIx to 3.0.1.
- Change the priority for selecting external versus internal HWLOC and PMIx packages to build.  Starting with this release, configure by default selects available external HWLOC and PMIx packages over the internal ones.
- Updated internal ROMIO to 3.2.1.
- Removed support for the MXM MTL.
- Removed support for SCIF.
- Improved CUDA support when using UCX.
- Enable use of CUDA allocated buffers for OMPIO.
- Improved support for two phase MPI I/O operations when using OMPIO.
- Added support for Software-based Performance Counters, see
  https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI
- Various improvements to MPI RMA performance when using RDMA capable interconnects.
- Update memkind component to use the memkind 1.6 public API.
- Fix problems with use of newer map-by mpirun options.  Thanks to Tony Reina for reporting.
- Fix rank-by algorithms to properly rank by object and span.
- Allow for running as root if two environment variables are set.  Requested by Axel Huebl.
- Fix a problem with building the Java bindings when using Java 10.  Thanks to Bryce Glover for reporting.
- Fix a problem with ORTE not reporting error messages if an application terminated normally but exited with non-zero error code.  Thanks to Emre Brookes for reporting.


[OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-16 Thread Geoffrey Paulsen
The first release candidate for the Open MPI v4.0.0 release is posted at 
https://www.open-mpi.org/software/ompi/v4.0/
Major changes include:


4.0.0 -- September, 2018


- OSHMEM updated to the OpenSHMEM 1.4 API.
- Do not build Open SHMEM layer when there are no SPMLs available.
  Currently, this means the Open SHMEM layer will only build if
  a MXM or UCX library is found.
- A UCX BTL was added for enhanced MPI RMA support using UCX
- With this release,  OpenIB BTL now only supports iWarp and RoCE by default.
- Updated internal HWLOC to 2.0.1
- Updated internal PMIx to 3.0.1
- Change the priority for selecting external verses internal HWLOC
  and PMIx packages to build.  Starting with this release, configure
  by default selects available external HWLOC and PMIx packages over
  the internal ones.
- Updated internal ROMIO to 3.2.1.
- Removed support for the MXM MTL.
- Improved CUDA support when using UCX.
- Improved support for two phase MPI I/O operations when using OMPIO.
- Added support for Software-based Performance Counters, see
  https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI
- Various improvements to MPI RMA performance when using RDMA
  capable interconnects.
- Update memkind component to use the memkind 1.6 public API.
- Fix problems with use of newer map-by mpirun options.  Thanks to
  Tony Reina for reporting.
- Fix rank-by algorithms to properly rank by object and span
- Allow for running as root if two environment variables are set.
  Requested by Axel Huebl.
- Fix a problem with building the Java bindings when using Java 10.
  Thanks to Bryce Glover for reporting.

Our goal is to release 4.0.0 by mid Oct, so any testing is appreciated.

 


[OMPI devel] Github deprecated "Github services" Does this affect us?

2018-06-07 Thread Geoffrey Paulsen
Devel,
 
   I just came across GitHub's deprecation announcement of GitHub Services:
https://developer.github.com/changes/2018-04-25-github-services-deprecation/

   Does anyone know if this will affect Open MPI at all, and do we need to change any processes because of this?
---
Geoffrey Paulsen


[OMPI devel] Today's Open-MPI discussion notes highlighting potential new runtime approach.

2018-06-05 Thread Geoffrey Paulsen
All,

In today's Open MPI Web-Ex (minutes here: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20180605) we discussed the future of the Open MPI ORTE runtime (mpirun / orteds, launching, etc.).  Nothing was decided, but please take a look, discuss on the mailing lists, and/or come to next week's Web-Ex for more discussion.  This discussion is in the context of Open MPI v5.0, for which we haven't yet decided on a schedule (but v4.0 branches from master mid-July and releases mid-Sept).

We'd love to hear your input.

Thanks,
Geoff Paulsen


[OMPI devel] Today's Open MPI Web-Ex. 29 minutes.

2017-12-12 Thread Geoffrey Paulsen
Web-Ex: https://cisco.webex.com/ciscosales/j.php?MTID=me125278da54f7bbeb722fc30d5b73a2f
 
Minutes: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20171212


[OMPI devel] Open MPI WebEx:

2017-04-04 Thread Geoffrey Paulsen
WebEx: https://cisco.webex.com/ciscosales/j.php?MTID=me125278da54f7bbeb722fc30d5b73a2f
 
Agenda / Minutes: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20170404
 
+ MTT testing of Master
---
Geoffrey Paulsen
Software Engineer, IBM Spectrum MPI
Phone: 720-349-2832
Email: gpaul...@us.ibm.com
www.ibm.com


[OMPI devel] RFC: Open MPI v3.0 branching soon (next week). Move to date-based release

2017-02-21 Thread Geoffrey Paulsen
RFC: Open MPI v3.0 branching soon (next week). Move to date-based release
 
At the face-to-face in San Jose (minutes: https://github.com/open-mpi/ompi/wiki/Meeting-Minutes-2017-01) we agreed that starting with v3.0, we would switch to three date-based releases each year.  These would be released on the 15th of February, June, and October.  At the face-to-face, we agreed that for the first cycle we would branch for v3.0 on June 15th and release on October 15th (and branch for the next release that same day).  Date-based releases mean we might have to ship with possibly critical bugs, but this gives us more motivation to get testing done EARLY.
 
In today's WebEx (minutes: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20170221) we discussed accelerating the transition to date-based releases so that we could ship v3.0 on June 15th.  To do this, we'd need to branch v3.x from master soon.  We set the preliminary date to branch v3.x from master next Tuesday, Feb 28th.  What does the community think about this?  Can everyone who has new features destined for v3.0 get them into master within a week?  Once v3.x branches, only bugfixes would be accepted to that branch.  The good news is that any features that won't make the v3.x branch would go out in the next release (which we have decreed would be branched June 15th and shipped October 15th).
 
We'd like to solicit input from the community in this thread on the devel list by Monday, Feb 27th.  Please answer the following questions:
 
1) Are you okay with branching for v3.0 Tuesday Feb 28th?  If not, please discuss reasons, and possible solutions.
 
2) Is anyone working on any new features that they'd like to get out with v3.0 but is not yet in master?  Remember if it misses v3.0, there will be another opportunity with v3.1 in 4 months.
 


[OMPI devel] New Open MPI Community Bylaws to discuss

2016-10-11 Thread Geoffrey Paulsen
We have been discussing new Bylaws for the Open MPI Community.  The primary motivator is to allow non-members to commit code.  Details in the proposal (link below).
 
Old Bylaws / Procedures:  https://github.com/open-mpi/ompi/wiki/Admistrative-rules
New Bylaws proposal: https://github.com/open-mpi/ompi/wiki/Proposed-New-Bylaws
 
Open MPI members will be voting on October 25th.  Please voice any comments or concerns.


[OMPI devel] Minutes from Telcon today

2016-01-26 Thread Geoffrey Paulsen
https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20160126
Open MPI Weekly Telcon
Dialup Info: (Do not post to public mailing list or public wiki)
 
Attendees
Geoff Paulsen, Jeff Squyres, Brad Benton, Edgar Gabriel, Geoffroy Vallee, Joshua Ladd, Nathan Hjelm, Ralph Castain, Ryan Grant, Sylvain Jeaugey, Todd Kordenbrock
 
Agenda
 
Review 1.10
Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
1.10.2 went out the door.
Already have a bug (Giles) Ralph fixed.
Another bug: Fortran - broken F08 bindings (Jeff) saw late last night.
Issue https://github.com/open-mpi/ompi/issues/1323
If it's broken, how did it pass testing?  Jeff needs a day or two to dig into.
Need to verify that library versions are still correct? - Jeff took care of.
MPI_Abort investigation (Ralph)? - Periodically have this issue where MPI_Abort + MTT has some issue.  Perl is suspect; Ralph will look into Ruby or another language.
1.10 C Strided mutex lock issue. (Nathan)?
High CPU utilization on Async progress thread (Ralph)?  Ralph fixed...  One off 1.10, not in master.  In 1.10.2.
 
Review 2.0.x
Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
Issue 1252 - Nathan's progression decay function progress?  Looking at files today.
udcm, openib_error_handler - opal_outputs would be sufficient.
Issue 1215 - Group Comm Errors thing (Ralph) - Deal with race condition in ORTE collectives.
Launch goes down the tree; modex goes across the tree.
So it is possible to receive a modex message before you receive the launch message.
Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
Group Comms weren't working for Comms of powers of 2. (Nathan)?  Fixed.
ROMIO default for OMPI on Lustre (only): PR 896?
894, 890, 900, 901 - Jeff and Howard are good with.  Jeff?
Taking all of those merged.
Issue 1292 - Asked Ralph if this is the right way to fix this. (Ralph)
Issue 1177 - large message writev, fixed but not merged to master - Test working everywhere but OS X / BSD (George).
OS X / BSD limits large message total size to 32K?
Not going to fix for 2.0.0.
Someone can write code to handle OS X / BSD.
Issue 1299 - hang (Nathan)?  Need to go ahead and fix this today.  Giles has a patch; Nathan just needs to verify.
2.0.0 does not compile on Solaris due to statfs().  Now that we moved to OMPIO, we're now hitting the problem.
Edgar is working on it.  Solaris has a different number of args and return code.
Issue 1301 - check max CQ size before creating CQ. (Josh)
If it passes Jenkins, happy.  UD OOB (Mellanox runs).  Approved, pending Jenkins.
HWThreads - Ralph?  Talk to Mike about use case?  A commit has been done, and moved to 1.10.
Pinged Giles that it should go to 2.0 also.
Travis status on 2.0?
Going well.
Nathan is good with 2.0 for 1-sided.
PR918 - Ralph reviewed on master.  Giles PRed it to 2.0.
PR919 - hwloc - Ralph will review.
PR911 - use correct endpoint.  Just got word from nVidia that this is good.
PR917 - Ryan will look at today.  LANL hardware that hits this is going away.  Doesn't affect Aries.  Aries doesn't have get_alignment().  Want this in.
 
Review Master?
BTL flags = 305 perf got horrible?  Edgar?  Worked around by removing this on his cluster.  Don't understand why.  He always used to set it, but now doesn't.
OMPIO not finding PVFS2 - configure work Edgar is
 
MTT status:
Cisco was showing timeouts.  Jeff found 2 things on the cluster.  Specific problem couldn't be replicated.
Not handling OOB on Master or 1.10.  Cisco cluster has 4 or 5 IP addresses on each node.  eth0 was down on one node.  Timeout on eth0 was taking quite a while.  Jeff removed those two nodes.  Unusual for the real world.  OOB verbosity exposes it.
Long-running problem; need a good solution.
 
Status Updates:
Cisco - Been working on cluster, release issues with Howard.  Have a couple of small scalability improvements for usNIC.
ORNL - Not much to report.  Any progress with Ubuntu package ownership?  Geoffroy will look on Saturday.
UTK - Not much to report.
NVIDIA - Sylvain not much.  A user issue not f

[OMPI devel] Please sign up on wiki if you're coming to Face 2 Face in Dallas Feb 23-25

2016-01-12 Thread Geoffrey Paulsen
Hello,
 
  Please sign up on the wiki (https://github.com/open-mpi/ompi/wiki/Meeting-2016-02) if you're planning to come to the Developer's conference hosted by IBM in Dallas [Feb 23-25].
 
  Thanks,
  Geoff
---
Geoffrey Paulsen
Software Engineer, IBM Platform MPI
IBM Platform-MPI
Phone: 720-349-2832
Email: gpaul...@us.ibm.com
www.ibm.com



[OMPI devel] Minutes from today's Telcon

2016-01-12 Thread Geoffrey Paulsen
Also available here:
 
https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20160112
 
Open MPI Weekly Telcon
Dialup Info: (Do not post to public mailing list or public wiki)
 
Attendees
Brad Benton, Edgar Gabriel, Geoffroy Vallee, George, Howard, Josh Hursey, Nathan Hjelm, Ralph, Ryan Grant, Sylvain Jeaugey, Todd Kordenbrock
 
Minutes
 
Review 1.10
Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.2
mpirun hangs ONLY on SLES 12.  Minimum 40 procs/node, at the very end of mpirun.  Only seeing it in certain cases.  Not sure what's going on.
Is mpirun not exiting because ORTED is not exiting?  Nathan saw this on 2.0.
Wait for Paul Hargrove.
No objections to Ralph shipping 1.10.2.
 
Review 2.0.x
Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
Group Comms weren't working for Comms of powers of 2.  Nathan found a massive memory issue.
https://github.com/open-mpi/ompi/issues/1252 - Nathan working on a decay function for progress functions to "fix" this.
Nathan's been delayed until later this week.  Could get done by the middle of next week.
George commented that the openib btl specifically could be made to only progress if there is a send/recv message posted.
uGNI progress - could only check for datagrams every (only 200ns hit).
Prefer to stick with Nathan's original decay function without modifying openib.
https://github.com/open-mpi/ompi/issues/1225 - TotalView debugger problem + PMI-x.
SLURM users use srun, which doesn't have this issue.
DDT does NOT have this issue either.  Don't know why it's different.  Attach FIFO.
mpirun waits on a pipe for the debugger to write a 1 on that pipe.
Don't see how that CAN work.
Nathan's been using attach, rather than mpirun --debug.  Attach happens after launch, so then it's not going through this step.  Nathan thinks it's not so critical since attach works.
Anything will work, as long as you're ATTACHING to a running job, rather than launching through the debugger.
Barring a breakthrough with PMI-x notify in the next week, we'll do an RC2 and just carefully document what works/doesn't as far as debuggers.
Will disable "mpirun --debug" and print an error on the 2.0 branch that says it's broken.
No longer a blocker for 2.0.0 due to schedule.  Still want to fix this for the next release.
No new features (except for
Howard will review group comm.
Don't know if we'll bother with the pls filesystem.
UCX using Modex stuff.
OMPI-IO + Lustre slow on 2.0.0 (and master) branches.  Discussed making ROMIO the default for OMPI on Lustre (only).
 
Review Master?
Bunch of failures on Master branch.  No chance to look at yet.
Cisco and Ivy cluster.
Nathan's seeing a "resource deadlock avoided" on OMPI Waitall.  Some TCP BTL issue.  Looks like something going on down there.  Should be fairly easy to test this.  Cisco TCP one-sided stuff.
Nathan will see if he can figure this out.  Haven't changed one-sided pt2pt recently.  Surprised.  Maybe proclocks on by default?  Need to work this out.  Just changed locks from being conditional to being unconditional.
Edgar found some Lustre issues.  OMPI master has bad MPI-IO performance on Lustre.  Looked reasonable on master, but now performance is poor.  Not completely sure when get performance
Lustre itself, could switch back to ROMIO for default.
GPFS and others will look good, but Lustre is bad.  Can't have OMPI-IO as default on Lustre.
Problem for 2.0.0 AND Master Branch.
https://github.com/open-mpi/ompi/issues/398 ready for Pull request
Nathan - Should go to 2.1 (since mpool changes pushed to 2.1).
https://github.com/open-mpi/ompi/pull/1118 - mpool rewrite should be ready to go, but want George to look at it and make comments. Pro

[OMPI devel] No meeting today 12/29/2015 either.

2015-12-29 Thread Geoffrey Paulsen
I think many people are out this week.
 
Please note that Ralph respun 1.10.2.rc3.
 
See everyone next Tuesday Jan 5th, 2016.  Have a Happy New Year!
 
 



[OMPI devel] Hotels for Feb Face 2 Face

2015-12-16 Thread Geoffrey Paulsen
I've updated the wiki to include a map of 3 hotels near DFW that offer a shuttle both to/from DFW and the IBM Innovation Center, for those who wish to go without a car.
 
https://github.com/open-mpi/ompi/wiki/Meeting-2016-02
---
Geoffrey Paulsen
Software Engineer, IBM Platform MPI
IBM Platform-MPI
Phone: 720-349-2832
Email: gpaul...@us.ibm.com
www.ibm.com



[OMPI devel] No Meeting 12/22/2015

2015-12-15 Thread Geoffrey Paulsen
In today's telcon we decided to skip next week's meeting.
 



[OMPI devel] Minutes from Weekly Telecon 12/15/2015

2015-12-15 Thread Geoffrey Paulsen
https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20151215
 
Also, reminder, NO meeting next week 12/22/2015.



[OMPI devel] Agenda 12/8

2015-12-07 Thread Geoffrey Paulsen
Open MPI Meeting 12/8/2015
--- Attendees --
Agenda:
- Review 1.10
  o  Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.2
  o  1.10.2 Release Candidate before the holidays?
- Review 2.x
  o  Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
  o  Blocker issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
     + 1064 - Ralph / Jeff, is this do-able by December?
     + Dynamic add procs is busted now when set value to 0 (not related to PMI-x)
  o  Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
     + One of us will go through ALL Issues for 2.0.0 to ask if they can be moved out to a future release.
  o  RFC on embedded PMIx version handling
  o  RFC process wiki page?
- MTT status
- Status Update: LANL, Houston, HLRS, IBM
--- Status Update Rotation ---
Cisco, ORNL, UTK, NVIDIA
Mellanox, Sandia, Intel
LANL, Houston, HLRS, IBM



[OMPI devel] Meeting Notes 12/1/2015

2015-12-01 Thread Geoffrey Paulsen
Open MPI Meeting 12/1/2015 
--- Attendees --
Geoff Paulsen
Jeff Squyres
Geoffroy Vallee
Howard
Ryan Grant
Sylvain Jeaugey - new nVidia contact (replaces Rolf), previously at Bull Computing (10 years), lives in Santa Clara.
Todd Kordenbrock
Agenda:
- Solicit volunteer to run the weekly telecon
- Review 1.10
  o  Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.2
     + one PR for 1.10.2 (PR 782)
     + Need someone to clarify on this, to resolve.
     + After we decide if it's right, a core developer will need to create a PR for Master.
     + Rest of PRs are for 1.10.3 (March or April 2016?)
  o  When do we want to start release work for 1.10.2?
     + How about a 1.10.2 Release Candidate before the holidays?
     + Ralph will send email about this to the dev list to solicit discussion.
- Review 2.x
  o  Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
  o  Blocker issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
     + 1064 - Ralph / Jeff, is this do-able by December?
     + Dynamic add procs is busted now when set value to 0 (not related to PMI-x)
  o  Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
     + One of us will go through ALL Issues for 2.0.0 to ask if they can be moved out to a future release.
  o  RFC on embedded PMIx version handling
     + PMI-x, once it's stabilized, treat it just like hwloc or libevent.
     + PMI-x would have separate releases, and Open MPI can cherry-pick specific releases if needed.
     + When PMI-x has a new release, we'll create a new directory and, once it's validated and ready to go, remove older ones.
     + PMI-x - Ralph will create a tarball for that.
     + PMI-x - Needs to be in 2.0.0; need to update it and move to the right naming convention while we do that.
  o  RFC process - Every so often there are 'big deal' issues / PR requests.  It's hard to spot these BIG ones.
     + Ralph proposing that if you're making a major change or a change to core code:
          Send an RFC to the devel list before you do it!  (and again with the PR when it's ready; put "RFC" in the PR title.)
     + Good idea to send out the RFC before you start to do it, then others can give a heads up or comment.
     + Prevents potential conflicts of parallel development.
        Howard - Nice to have affected components, and reason for wanting the change.
        Jeff - Had a nice format for RFCs before.  Short / Long versions.  Might want to nail down.
        Jeff - Propose we put "RFC" in the PR title.
        Jeff - Should the body and format be in the PR?
        Discussion about proposed work should be on the devel email list.
        Discussion about already-written code is on the PR, and
        Jeff proposes a wiki page describing this process.
    Where - what does it affect.
    When - when can we discuss?  Give at least 1 week for others to reply.
    What - summary
    Why - Some justification, better than "I was bored".
    Down below, deeper discussion.
  o  Supercomputing reports
    + OMPI BoF went well.  Over 100 people in the room.  Slides on the OMPI website, and on Jeff's blog.
    + People appreciated the BoF format of "status, roadmap, what's going well, what needs more attention, etc."
    + PMI-x BoF went well too.  Scaling improvements went REALLY well.
    + PMI-x showed a really good slope; they thought it was wire-up times of daemons.
      Mellanox needs to remove requirement to remove LID and GID, but still like a year.
  o  Status Update: Mellanox, Sandia, Intel
    + Mellanox (via Ralph)
      1. Artem will be working with Ralph et al. to finish off the OMPI side issues in PMIx integration.
      2. Igor Ivanov will continue to fix memory corruption bugs uncovered in Valgrind.
      3. Artem and Igor will start looking at making the necessary changes to UCX PML to use the direct modex.
      4. Mellanox plans to submit UCX PML for inclusion in 1.10.3.
      5. Mellanox plans to submit missing routines needed for OSHMEM 1.2 spec compliance for inclusion in 1.10.3.  Igor Ivanov will be leading this.
    + Sandia (Ryan Grant)
      - Put Portals triggered Ops on master.  Will run tests there for a while and then put up a PR for the 2.0 branch.
    + Intel (Ralph)
      - PMI-x: Working on Pull Requests.
      - HPC stuff occupying a lot of his time.  Announcing OpenHPC to create a community distribution optimized for HPC.
        - Building on top of OPAL.
  o  Howard has a request for Sylvain / nVidia
    + Sylvain stopped Rolf's MTT yesterday, hoping to have it back by end of the week.
    + MLX5 HCAs - on master there are lots of errors, not sure if because of software.
    + Nvidia cluster shows up bugs before other clusters.
    + Right now master under defaults is running really clean.  But turning on dynamic add-procs is showing lots of issues in Comm_dup and other Comm creation code.

--- Status Update Rotation ---
LANL, Houston, HLRS, IBM
Cisco, ORNL, UTK, NVIDIA
Mellanox, Sandia, Intel



[OMPI devel] Doodle to find time to discuss issues/398

2015-11-03 Thread Geoffrey Paulsen
Anyone interested, please add your name to the Doodle, and we'll find a time when everyone can meet.
 
http://doodle.com/poll/3gk6bx4dzgrpsqva
 
--- Agenda ---
 
In the Open MPI call today we discussed a few aspects of https://github.com/open-mpi/ompi/issues/398:
1) Moving ompi_info_t down to opal_info_t to allow lower-level components access to this functionality.
2) Implementing OMPI_Comm_Info get/set in a way that can be reused for Windows and Files also (a small user-level sketch follows after this list).

There are a number of issues around how the standard words the return values from get() that are left up to the implementation, for example:
  - values for non-explicitly set keys that the MPI layer is using.
  - values for non-explicitly set keys that the MPI layer is not using.
  - values for explicitly overwritten values.
  - communication (to the user via docs??) of what hints Open MPI recognizes.
  - communication (to the user via docs??) of what values are required to be the same/different on all ranks of the Comm.
  - additional consistency checking of values in debugging mode?
  - ability to print/log unrecognized hints.
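For reference, here is a minimal user-level sketch of the get/set round trip these questions are about.  It is not from the call and uses only standard MPI-3 routines; the hint key is made up purely for illustration (it is not a hint Open MPI is known to recognize), and whether it comes back from MPI_Comm_get_info is exactly the implementation-defined behavior listed above.

    /* Minimal sketch: standard MPI-3 info round trip on a communicator.
     * "example_hypothetical_hint" is NOT a real Open MPI hint; it only
     * illustrates that an implementation may or may not report
     * unrecognized keys back to the caller. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Info set_info, get_info;
        char value[MPI_MAX_INFO_VAL + 1];
        int flag;

        MPI_Init(&argc, &argv);

        MPI_Info_create(&set_info);
        MPI_Info_set(set_info, "example_hypothetical_hint", "true");
        MPI_Comm_set_info(MPI_COMM_WORLD, set_info);

        /* Implementation-defined: the key may be absent, echoed back as
         * set, or reported with the value the implementation actually used. */
        MPI_Comm_get_info(MPI_COMM_WORLD, &get_info);
        MPI_Info_get(get_info, "example_hypothetical_hint",
                     MPI_MAX_INFO_VAL, value, &flag);
        printf("example_hypothetical_hint: %s\n", flag ? value : "not reported");

        MPI_Info_free(&set_info);
        MPI_Info_free(&get_info);
        MPI_Finalize();
        return 0;
    }

The same set/get pattern exists for windows and files (MPI_Win_set_info/get_info, MPI_File_set_info/get_info), which is presumably part of why pushing the info object down to opal_info_t is attractive.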
---
Geoffrey Paulsen
Software Engineer, IBM Platform MPI
IBM Platform-MPI
Phone: 720-349-2832
Email: gpaul...@us.ibm.com
www.ibm.com



[OMPI devel] IBM Innovation Center Reserved for Open MPI Face-2-Face

2015-10-20 Thread Geoffrey Paulsen
We have the Dallas IBM Innovation Center (http://ibm.com/partnerworld/iic/dallas.htm) reserved 2/23 - 2/25, 2016.
 
IBM Innovation Center - Dallas
1177 South Beltline Rd
Coppell, TX 75019
469-549-8444
 
https://www.google.com/maps/place/IBM+Innovation+Center+-+Dallas/@32.942725,-96.9965226,17z
 
 
We've reserved two rooms.  A large classroom, "Hollerith", is ours from 8:30am - 5pm each day.
There is also a "Think Bar" for us to lounge about in our PJs and eat lunch.  I think it's not as "Bar"-like as some would prefer.
 
---
Geoffrey Paulsen
Software Engineer, IBM Platform MPI
IBM Platform-MPI
Phone: 720-349-2832
Email: gpaul...@us.ibm.com
www.ibm.com


Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-06-30 Thread Geoffrey Paulsen

I discussed this with Robert Ho, who was working with Ralph on this option.  He
believes it's possible that the PGI compiler / runtime does not understand
OMP_PROC_BIND=SPREAD, which was only introduced in OpenMP 4.0.

Unfortunately I can't find any docs as the http://www.pgroup.com/index.htm
is down right now.

We have PGI version 11.8, which only supports OpenMP version 3.0 and does
not list OMP_PROC_BIND at all.

In 11.8, PGI supported MP_BIND=yes, which requests that the PGI runtime
libraries bind processes or threads in a parallel region to physical
processors (the default is no).
It also supported MP_BLIST=a,b,c,d (when MP_BIND is set to yes) to map how
you wanted threads or processes bound to physical processors 0,1,2,3.

There is a note in the documentation that setting MP_BIND does NOT affect
the compiler behavior at all, only the runtime library.
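As a quick check (a sketch only, not from this thread, and assuming a compiler that claims OpenMP 4.0 support), the program below asks the runtime which proc-bind policy it actually accepted.  With an OpenMP 3.0-only compiler/runtime such as PGI 11.8 it will not even compile, since omp_get_proc_bind() does not exist there, which is itself a quick way to confirm the suspicion.

    /* Sketch only: query the OpenMP 4.0 proc-bind policy the runtime is using.
     * omp_get_proc_bind() and the omp_proc_bind_* values are OpenMP 4.0 API,
     * so an OpenMP 3.0 compiler/runtime (e.g. PGI 11.8) will reject this. */
    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        const char *name = "unknown";

        switch (omp_get_proc_bind()) {
            case omp_proc_bind_false:  name = "false";  break;
            case omp_proc_bind_true:   name = "true";   break;
            case omp_proc_bind_master: name = "master"; break;
            case omp_proc_bind_close:  name = "close";  break;
            case omp_proc_bind_spread: name = "spread"; break;
            default:                                    break;
        }
        printf("proc-bind policy seen by the OpenMP runtime: %s\n", name);
        return 0;
    }

Either outcome (a compile failure, or a reported policy other than "spread") would support the theory that the PGI runtime simply does not understand OMP_PROC_BIND=SPREAD.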


Regards,

Geoffrey (Geoff) Paulsen
Software Engineer - Platform MPI
IBM
Phone: 1-720-349-2832
E-mail: gpaul...@us.ibm.com
1177 S Belt Line Rd
Coppell, TX 75019-4642
United States





From:   Howard Pritchard 
To: Open MPI Developers 
List-Post: devel@lists.open-mpi.org
Date:   06/29/2015 09:27 PM
Subject: Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors
Sent by: "devel" 



I decided just to disable the carver/pgi mtt runs.


2015-06-29 15:10 GMT-06:00 Ralph Castain :
  Very strange then - again, can you run it with the verbose flag and send
  me the output? I can't replicate what you are seeing.


  On Mon, Jun 29, 2015 at 4:05 PM, Howard Pritchard 
  wrote:
   ibm dataplex and laki ~= cray.  nothing to do with cray.
   Cray runs fine since I use aprun there.


   2015-06-29 13:54 GMT-06:00 Ralph Castain :
 Hmmm...is this some Cray weirdness? I checked the code and it looks
 right, and it runs correctly for me on both Mac and Linux. All it is
 doing is calling "setenv", so I'm wondering if there is something
 environ-specific going on here?

 I added some debug in case that might help - can you run it on the
 Cray with "--mca rtc_base_verbose 5" on the cmd line?


 On Mon, Jun 29, 2015 at 1:19 PM, Jeff Squyres (jsquyres) <
 jsquy...@cisco.com> wrote:
  Ahh... it's OMP_PROC_BIND, not OMPI_PROC_BIND.

  Yes, Ralph just added this.

  I chatted with him about this on the phone moments ago; he's pretty
  sure he knows where to go look to find the problem.


  > On Jun 29, 2015, at 12:00 PM, Howard Pritchard  wrote:
  >
  > laki is also showing the errors:
  >
  >
  > Here's the shortened url:
  >
  > http://goo.gl/Ra264U
  >
  > looks like the badness started with the latest nightly.
  > I think there was some activity in the orte binding area recently.
  >
  > Howard
  >
  >
  >
  >
  > 2015-06-29 9:52 GMT-06:00 Jeff Squyres (jsquyres) <
  jsquy...@cisco.com>:
  > Can you provide an MTT short URL to show the results?
  >
  > Or, if the MTT results are not on the community reporter, can you
  show a bit more context in the output?
  >
  >
  > > On Jun 29, 2015, at 11:47 AM, Howard Pritchard <
  hpprit...@gmail.com> wrote:
  > >
  > > Hi Folks,
  > >
  > > I'm seeing an error I've not seen before in the MTT runs on the
  ibm dataplex
  > > at NERSC.  The mpirun launched jobs are failing with
  > >
  > > OMPI_PROC_BIND value is invalid
  > >
  > > errors.
  > >
  > > This is is for the trivial ring tests.
  > >
  > > Is anyone else seeing these types of errors?
  > >
  > > Howard
  > >
  >
  >
  > --
  > Jeff Squyres
  > jsquy...@cisco.com
  > For corporate legal information go to:
  http://www.cisco.com/web/about/doing_business/legal/cri/
  >
  >