Re: MPICH as default MPI; WAS: MPI debugging workflows
On 07/12/2018 12:49, Alastair McKinstry wrote:
> Looking into it further, I'm reluctant now to move to mpich for buster as the
> default. One was the experience of the openmpi3 transition, shaking out many
> issues.

Ack. Given how long it took to complete the last transition, and how many packages failed to build on a few architectures but not others (which complicates rebuild testing, as packages may build on amd64 but then fail on other arches during the transition), I would be very hesitant to ack this transition at this stage.

So let's release with openmpi as the default, and if you want to change it for bullseye we can do that.

Cheers,
Emilio
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 11/12/2018 18:26, Drew Parsons wrote:
> On 2018-12-08 12:31, Drew Parsons wrote:
>> On 2018-12-07 19:49, Alastair McKinstry wrote:
>>> Looking into it further, I'm reluctant now to move to mpich for buster as
>>> the default. One was the experience of the openmpi3 transition, shaking
>>> out many issues.
>>> ...
>>> So I think more testing of mpich3 builds with CH4 / pmix / OFI support is
>>> needed, but moving over openmpi -> mpich at this stage is iffy.
>>
>> Thanks Alastair, your analysis sounds sound. I'm happy to be patient with
>> the switch and wait till after buster, especially if pmix support
>> complicates the matter. That will make it all the more useful to set up a
>> test buildd to test the transition. I'll invite upstream authors who have
>> been promoting mpich over openmpi to chip in with their experience.
>
> I received this feedback from a PETSc developer:
>
> "The Open MPI API is better for developers due to better type safety (all
> handles in MPICH are typedef'd to int). Most major commercial vendors are
> organized around variants of MPICH (where there is collective ABI
> standardization). Open MPI is more modular, so most vendor stuff goes into
> their plugins (for those vendors working with OMPI).
>
> "I think a good solution for Linux distros (and many others) would be to
> make a library that is ABI compatible with OMPI, but dispatches through to
> MPICH. There exists a (messy) open source demonstration:
> https://github.com/cea-hpc/wi4mpi/ "
>
> ABI compatibility sounds like a nice resolution of the dilemma.
>
> Drew

Thanks, that's interesting. Yes, we should work for ABI compatibility for Bullseye. My goals for MPI in Bullseye are:

(1) Symbol versioning. OpenMPI has just released 4.0, but it's too late to realistically push that into Buster given experience. I have a symbol versioning patch that upstream have tentatively agreed to, but it needs work for non-gcc compilers. I intend to get that into the OpenMPI 5 release, and OpenMPI 5 into Bullseye, aiming to make that the "last" big OpenMPI transition ...

(2) Ditto for MPICH?
(3) Build using pkg-config instead of the mpicc / mpifort wrappers. As of now, OpenMPI and MPICH are multiarch-aware, so you can cross-build MPI applications, but only if you use pkg-config rather than the MPI wrappers. Push this work through the MPI stack.

Best regards
Alastair

--
Alastair McKinstry, https://diaspora.sceal.ie/u/amckinstry
Commander Vimes didn’t like the phrase “The innocent have nothing to fear,” believing the innocent had everything to fear, mostly from the guilty but in the longer term even more from those who say things like “The innocent have nothing to fear.” - T. Pratchett, Snuff
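[Editor's note: to make goal (3) concrete, a debian/rules fragment along these lines is one way to consume the pkg-config metadata instead of the mpicc wrapper when cross-building. This is a sketch only, not from the thread: the pkg-config module name (here "mpich") and the variable plumbing are assumptions to verify against what the Debian packages actually install under /usr/lib/<triplet>/pkgconfig.]

```make
# debian/rules sketch: build against MPI via pkg-config so that
# cross-building works. The module name "mpich" is an assumption;
# check /usr/lib/<triplet>/pkgconfig for the real name.
include /usr/share/dpkg/architecture.mk

# The <triplet>-pkg-config wrapper resolves the host architecture's
# .pc files when cross-building.
MPI_CFLAGS := $(shell $(DEB_HOST_GNU_TYPE)-pkg-config --cflags mpich)
MPI_LIBS   := $(shell $(DEB_HOST_GNU_TYPE)-pkg-config --libs mpich)

override_dh_auto_configure:
	dh_auto_configure -- CC=$(DEB_HOST_GNU_TYPE)-gcc \
		CFLAGS="$(MPI_CFLAGS)" LIBS="$(MPI_LIBS)"
```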
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 2018-12-08 12:31, Drew Parsons wrote:
> On 2018-12-07 19:49, Alastair McKinstry wrote:
>> Looking into it further, I'm reluctant now to move to mpich for buster as
>> the default. One was the experience of the openmpi3 transition, shaking out
>> many issues. I suspect we could see the same with other package builds, as
>> you point out, tuned to openmpi rather than mpich, but also the feature
>> support for mpich.
>> ...
>> So I think more testing of mpich3 builds with CH4 / pmix / OFI support is
>> needed, but moving over openmpi -> mpich at this stage is iffy.
>
> Thanks Alastair, your analysis sounds sound. I'm happy to be patient with
> the switch and wait till after buster, especially if pmix support
> complicates the matter. That will make it all the more useful to set up a
> test buildd to test the transition. I'll invite upstream authors who have
> been promoting mpich over openmpi to chip in with their experience.

I received this feedback from a PETSc developer:

"The Open MPI API is better for developers due to better type safety (all handles in MPICH are typedef'd to int). Most major commercial vendors are organized around variants of MPICH (where there is collective ABI standardization). Open MPI is more modular, so most vendor stuff goes into their plugins (for those vendors working with OMPI).

"I think a good solution for Linux distros (and many others) would be to make a library that is ABI compatible with OMPI, but dispatches through to MPICH. There exists a (messy) open source demonstration: https://github.com/cea-hpc/wi4mpi/ "

ABI compatibility sounds like a nice resolution of the dilemma.

Drew
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 2018-12-07 19:49, Alastair McKinstry wrote:
> On 07/12/2018 11:26, Drew Parsons wrote:
>> Hi Alastair,
>>
>> openmpi3 seems to have stabilised now: packages are now passing tests and
>> libpsm2 is no longer injecting 15 sec delays. Nice that the mpich 3.3
>> release is now finalised. Do we feel confident proceeding with the switch
>> of mpi-defaults from openmpi to mpich? Are there any known issues with the
>> transition?
>>
>> One that catches my eye is the build failures in scalapack. It's been tuned
>> to pass build-time tests with openmpi but fails many tests with mpich
>> (scalapack builds packages for both MPI implementations). I'm not sure how
>> concerned we should be with those build failures. Perhaps upstream should
>> be consulted on it. Are similar mpich failures expected in other packages?
>>
>> Is there a simple way of setting up a buildd to do a test run of the
>> transition before making it official?
>>
>> Drew
>
> Hi Drew,
>
> Looking into it further, I'm reluctant now to move to mpich for buster as
> the default. One was the experience of the openmpi3 transition, shaking out
> many issues. I suspect we could see the same with other package builds, as
> you point out, tuned to openmpi rather than mpich, but also the feature
> support for mpich. e.g. mpich integration with psm / pmix / slurm is weak
> (in Debian).
>
> While it might not look important to be able to scale to 10k+ nodes on
> Debian (as none of the top500 machines run Debian), we're seeing an increase
> in the container use case: building MPI apps within Singularity containers
> running on our main machine. We don't run Debian as the OS on the base
> supercomputer at work because we need kernel support from $vendor, but the
> apps are built in Singularity containers running Debian ... very large scale
> jobs become increasingly likely, and openmpi / pmix is needed for that.
> Testing mpich, I've yet to get CH4 working reliably - needed for pmix - and
> the OFI / UCX support is labeled 'experimental'.
>
> My driving use case for the move to mpich had been fault tolerance - needed
> for co-arrays (https://tracker.debian.org/pkg/open-coarrays), needed for
> Fortran 2018 - but I've since re-done open-coarrays to build both openmpi
> and mpich variants, so that issue went away.
>
> So I think more testing of mpich3 builds with CH4 / pmix / OFI support is
> needed, but moving over openmpi -> mpich at this stage is iffy.

Thanks Alastair, your analysis sounds sound. I'm happy to be patient with the switch and wait till after buster, especially if pmix support complicates the matter. That will make it all the more useful to set up a test buildd to test the transition. I'll invite upstream authors who have been promoting mpich over openmpi to chip in with their experience.

Drew
Re: MPICH as default MPI; WAS: MPI debugging workflows
Hi, On 2018-12-07 12:49, Alastair McKinstry wrote: While it might not look important to be able to scale to 10k+ nodes on Debian (as none of the top500 machines run Debian) Some EDF clusters running a custom Debian derivative have been in the TOP500 lists since 2012. _g.
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 07/12/2018 11:26, Drew Parsons wrote:
> Hi Alastair,
>
> openmpi3 seems to have stabilised now: packages are now passing tests and
> libpsm2 is no longer injecting 15 sec delays. Nice that the mpich 3.3
> release is now finalised. Do we feel confident proceeding with the switch of
> mpi-defaults from openmpi to mpich? Are there any known issues with the
> transition?
>
> One that catches my eye is the build failures in scalapack. It's been tuned
> to pass build-time tests with openmpi but fails many tests with mpich
> (scalapack builds packages for both MPI implementations). I'm not sure how
> concerned we should be with those build failures. Perhaps upstream should be
> consulted on it. Are similar mpich failures expected in other packages?
>
> Is there a simple way of setting up a buildd to do a test run of the
> transition before making it official?
>
> Drew

Hi Drew,

Looking into it further, I'm reluctant now to move to mpich for buster as the default. One was the experience of the openmpi3 transition, shaking out many issues. I suspect we could see the same with other package builds, as you point out, tuned to openmpi rather than mpich, but also the feature support for mpich. e.g. mpich integration with psm / pmix / slurm is weak (in Debian).

While it might not look important to be able to scale to 10k+ nodes on Debian (as none of the top500 machines run Debian), we're seeing an increase in the container use case: building MPI apps within Singularity containers running on our main machine. We don't run Debian as the OS on the base supercomputer at work because we need kernel support from $vendor, but the apps are built in Singularity containers running Debian ... very large scale jobs become increasingly likely, and openmpi / pmix is needed for that. Testing mpich, I've yet to get CH4 working reliably - needed for pmix - and the OFI / UCX support is labeled 'experimental'.

My driving use case for the move to mpich had been fault tolerance - needed for co-arrays (https://tracker.debian.org/pkg/open-coarrays), needed for Fortran 2018 - but I've since re-done open-coarrays to build both openmpi and mpich variants, so that issue went away.

So I think more testing of mpich3 builds with CH4 / pmix / OFI support is needed, but moving over openmpi -> mpich at this stage is iffy.

regards
Alastair

--
Alastair McKinstry, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.
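[Editor's note: for anyone wanting to reproduce the CH4 testing described above, an mpich build would be configured along these lines. This is a hedged sketch, not a command from the thread: the `--with-device` flag follows mpich 3.3's convention, and anything beyond it (pmix, slurm integration) varied between releases, so verify every flag against `./configure --help` for the version at hand.]

```
# Build-configuration sketch (illustrative only); check flags with
# ./configure --help for your mpich version.
./configure --prefix=/opt/mpich-ch4 \
            --with-device=ch4:ofi    # 'ch4:ucx' selects the UCX netmod instead
make -j4
make install
```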
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 2018-10-04 09:08, Drew Parsons wrote:
> On 2018-10-03 21:02, Alastair McKinstry wrote:
>> Hi,
>> See thread below. I've just uploaded 3.1.2-5 which I believe fixes the
>> hangs due to OpenMPI (non-atomic handling of sending a 64-bit tag,
>> occurring mostly on archs with 32-bit atomics).
>
> Awkward observation: openmpi 3.1.2-5 now causes dolfin tests to timeout (on
> amd64): https://ci.debian.net/packages/d/dolfin/unstable/amd64/
> The tracker page pings lammps and liggghts as well.

Hi Alastair,

openmpi3 seems to have stabilised now: packages are now passing tests and libpsm2 is no longer injecting 15 sec delays. Nice that the mpich 3.3 release is now finalised. Do we feel confident proceeding with the switch of mpi-defaults from openmpi to mpich? Are there any known issues with the transition?

One that catches my eye is the build failures in scalapack. It's been tuned to pass build-time tests with openmpi but fails many tests with mpich (scalapack builds packages for both MPI implementations). I'm not sure how concerned we should be with those build failures. Perhaps upstream should be consulted on it. Are similar mpich failures expected in other packages?

Is there a simple way of setting up a buildd to do a test run of the transition before making it official?

Drew
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 2018-10-03 21:02, Alastair McKinstry wrote:
> Hi,
> See thread below. I've just uploaded 3.1.2-5 which I believe fixes the hangs
> due to OpenMPI (non-atomic handling of sending a 64-bit tag, occurring
> mostly on archs with 32-bit atomics).

Awkward observation: openmpi 3.1.2-5 now causes dolfin tests to timeout (on amd64): https://ci.debian.net/packages/d/dolfin/unstable/amd64/

The tracker page pings lammps and liggghts as well.

Drew
Re: MPICH as default MPI; WAS: MPI debugging workflows
Hi,

On Wed, 3 Oct 2018 at 16:24, Mattia Rizzolo wrote:
> On Wed, Oct 03, 2018 at 02:02:25PM +0100, Alastair McKinstry wrote:
>> Any ideas on how to write the ben tracker script? I think it would work by
>> looking for packages with binaries linked to openmpi rather than mpich, but
>> there are a number of packages that would be false positives (HDF5,
>> open-coarrays, etc.) that build against both.
>
> The bug report I linked in mpi-defaults' README.source contains an example.
>
> The false positives could just be handled manually (i.e. listed in the
> transition bug); I don't think there are many that link to both.
>
> Also, please note that mpich never built on riscv64, and is not up to date
> on hppa and ppc64. I think at least the riscv64 case should be handled
> first, to avoid being in the same situation again when a single weird
> architecture uses a different MPI implementation. I'm CCing
> debian-ri...@lists.debian.org for this, in case you need help.

From the riscv64 camp, I built the current version on hardware and uploaded it to "unreleased" in debian-ports: mpich_3.3~b3-2_riscv64.changes

I suspect that the reason why it doesn't build is a timeout in the tests, due to the buildds being qemu-system. The timeout can be increased with environment variables:

test/mpi/runtests.in:if (defined($ENV{"MPITEST_TIMEOUT"})) {
test/mpi/runtests.in:$defaultTimeLimit = $ENV{"MPITEST_TIMEOUT"};
test/mpi/runtests.in:if (defined($ENV{"MPITEST_TIMEOUT_MULTIPLIER"})) {
test/mpi/runtests.in:$defaultTimeLimitMultiplier = $ENV{"MPITEST_TIMEOUT_MULTIPLIER"};

We can try to send a patch if it helps.

Cheers.

--
Manuel A. Fernandez Montecelo
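[Editor's note: putting the runtests.in logic quoted above to use, the override could be as small as this debian/rules fragment. The numbers are illustrative guesses, not values tested on the riscv64 buildds.]

```make
# debian/rules sketch: raise mpich's test-suite timeouts on slow,
# qemu-emulated buildds. runtests.in reads both variables from the
# environment during the build-time test run.
export MPITEST_TIMEOUT = 900            # per-test limit, in seconds
export MPITEST_TIMEOUT_MULTIPLIER = 4   # scales each test's own limit
```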
Re: MPICH as default MPI; WAS: MPI debugging workflows
On 2018-10-03 22:12, Mattia Rizzolo wrote:
> The false positives could just be handled manually (i.e. listed in the
> transition bug); I don't think there are many that link to both.
>
> Also, please note that mpich never built on riscv64, and is not up to date
> on hppa and ppc64. I think at least the riscv64 case should be handled
> first, to avoid being in the same situation again when a single weird
> architecture uses a different MPI implementation. I'm CCing
> debian-ri...@lists.debian.org for this, in case you need help.

Another thing to be mindful of is the debci tests, which are growing in number. Tests that pass with openmpi won't necessarily pass with mpich, notwithstanding the assertion we're working with that mpich is more stable.

An example is the build-time tests for scalapack (which builds both openmpi and mpich versions; build logs at https://buildd.debian.org/status/package.php?p=scalapack). Tests are set to run against both openmpi and mpich. On amd64 the openmpi tests pass (apart from a couple of timeouts), while against mpich 16 tests out of 96 failed (so currently mpich test failures are being ignored). I don't know if that means the scalapack tests are making assumptions that aren't valid under the general MPI specification, but I can guess it won't be the only package with this challenge.

As far as the ben script goes, it could look at dependencies on mpi-default-dev as well as libopenmpi3.

Drew
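[Editor's note: the dependency-scanning idea can be prototyped outside ben too. Below is a self-contained sketch (the stanzas are invented sample data, not real archive contents) of scanning a Debian Packages file for reverse dependencies of libopenmpi, flagging dual-build packages like hdf5 as probable false positives.]

```python
# Sketch: find transition candidates by scanning Packages-style stanzas
# for a Depends substring. Dual-build packages (depending on both MPI
# flavours) are likely false positives for the tracker.

def rdeps(packages_text, needle):
    """Return names of stanzas whose Depends field mentions `needle`."""
    hits = []
    for stanza in packages_text.strip().split("\n\n"):
        # Parse "Field: value" lines of one stanza into a dict.
        fields = dict(
            line.split(": ", 1)
            for line in stanza.splitlines()
            if ": " in line and not line.startswith(" ")
        )
        if needle in fields.get("Depends", ""):
            hits.append(fields.get("Package"))
    return hits

# Invented sample data in Debian Packages format.
SAMPLE = """\
Package: libscalapack-openmpi2.0
Depends: libopenmpi3, libc6

Package: libhdf5-openmpi-103
Depends: libopenmpi3, libmpich12, libc6

Package: python3-numpy
Depends: libc6
"""

affected = rdeps(SAMPLE, "libopenmpi")
dual = [p for p in affected if p in rdeps(SAMPLE, "libmpich")]
print(affected)  # packages to list in the transition bug
print(dual)      # dual-build packages: probable false positives
```

A real run would feed it the decompressed Packages index from a mirror rather than the inline sample.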
Re: MPICH as default MPI; WAS: MPI debugging workflows
On Wed, Oct 03, 2018 at 02:02:25PM +0100, Alastair McKinstry wrote:
> Any ideas on how to write the ben tracker script? I think it would work by
> looking for packages with binaries linked to openmpi rather than mpich, but
> there are a number of packages that would be false positives (HDF5,
> open-coarrays, etc.) that build against both.

The bug report I linked in mpi-defaults' README.source contains an example.

The false positives could just be handled manually (i.e. listed in the transition bug); I don't think there are many that link to both.

Also, please note that mpich never built on riscv64, and is not up to date on hppa and ppc64. I think at least the riscv64 case should be handled first, to avoid being in the same situation again when a single weird architecture uses a different MPI implementation. I'm CCing debian-ri...@lists.debian.org for this, in case you need help.

--
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18 4D18 4B04 3FCD B944 4540
more about me: https://mapreri.org
Launchpad user: https://launchpad.net/~mapreri
Debian QA page: https://qa.debian.org/developer.php?login=mattia
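[Editor's note: for reference, a ben tracker file for the switch might look roughly like the sketch below. The structure follows ben's usual `.depends` regex style, but the exact expressions are assumptions to check against the example referenced in mpi-defaults' README.source, and false positives such as HDF5 and open-coarrays would still need manual listing in the transition bug.]

```
title = "mpi-defaults: openmpi -> mpich";
is_affected = .depends ~ "libopenmpi" | .depends ~ "mpi-default";
is_good = .depends ~ "libmpich";
is_bad = .depends ~ "libopenmpi3";
```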
Re: MPICH as default MPI; WAS: MPI debugging workflows
Hi,

See thread below. I've just uploaded 3.1.2-5 which I believe fixes the hangs due to OpenMPI (non-atomic handling of sending a 64-bit tag, occurring mostly on archs with 32-bit atomics).

With this, I think it is appropriate to start the ball rolling on making mpich the default MPI for buster. Any objections?

Any ideas on how to write the ben tracker script? I think it would work by looking for packages with binaries linked to openmpi rather than mpich, but there are a number of packages that would be false positives (HDF5, open-coarrays, etc.) that build against both.

regards
Alastair

On 31/08/2018 11:17, Alastair McKinstry wrote:
> On 31/08/2018 11:04, Drew Parsons wrote:
>> On 2018-08-30 14:18, Alastair McKinstry wrote:
>>> On 30/08/2018 09:39, Drew Parsons wrote:
>>>> If you want a break from the openmpi angst then go ahead and drop mpich
>>>> 3.3b3 into unstable. It won't make the overall MPI situation any
>>>> worse... :)
>>>> Drew
>>> Ok, I've pushed 3.3b3 to unstable.
>> Great!
>>> For me there are two concerns:
>>> (1) The current setup (openmpi default) shakes out issues in openmpi3
>>> that should be fixed. It would be good to get that done.
>> That's fair. If we're going to "drop" openmpi, it's a good policy to leave
>> it in as stable a state as possible.
> At this stage it appears there is a remaining "hang" / threading issue
> that's affecting 32-bit platforms (see #907267). Once that's fixed, I'm
> favouring no further updates before Buster - i.e. ship openmpi 3.1.2 with
> pmix 3.0.1. (openmpi now has a dependency on libpmix, the Process
> Management Interface for exascale, which handles the launching of processes
> - up to millions, hierarchically. The openmpi / pmix interface has been
> flaky, I suspect, and not well tested on non-traditional HPC architectures;
> e.g. I suspect it's the source of the 32-bit issue.) mpich _can_ be built
> with pmix but I'm recommending not doing so for Buster.
>>> (2) Moving to mpich as default is a transition and should be pushed
>>> before the deadline - say setting 30 Sept?
>> This is probably a good point to confer with the Release Team, so I'm
>> cc:ing them.
>>
>> Release Team: we have nearly completed the openmpi3 transition. But there
>> is a broader question of switching mpi-defaults to mpich instead of
>> openmpi. mpich is reported to be more stable than openmpi and is
>> recommended by several upstream authors of the HPC software libraries. We
>> have some consensus that switching to mpich is probably a good idea, it's
>> just a question of timing at this point. Does an MPI / mpich transition
>> overlap with other transitions planned for Buster - say hwloc, hdf5?
>>
>> hdf5 already builds against both openmpi and mpich, so it should not be a
>> particular problem. It has had more build failures on the minor arches
>> (with the new hdf5 version in experimental), but there's no reason to
>> blame mpich for that. I don't know about hwloc, but the builds in
>> experimental look clean.
>>
>> Drew

--
Alastair McKinstry, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.
MPICH as default MPI; WAS: MPI debugging workflows
On 31/08/2018 11:04, Drew Parsons wrote:
> On 2018-08-30 14:18, Alastair McKinstry wrote:
>> On 30/08/2018 09:39, Drew Parsons wrote:
>>> If you want a break from the openmpi angst then go ahead and drop mpich
>>> 3.3b3 into unstable. It won't make the overall MPI situation any
>>> worse... :)
>>> Drew
>> Ok, I've pushed 3.3b3 to unstable.
> Great!
>> For me there are two concerns:
>> (1) The current setup (openmpi default) shakes out issues in openmpi3 that
>> should be fixed. It would be good to get that done.
> That's fair. If we're going to "drop" openmpi, it's a good policy to leave
> it in as stable a state as possible.

At this stage it appears there is a remaining "hang" / threading issue that's affecting 32-bit platforms (see #907267). Once that's fixed, I'm favouring no further updates before Buster - i.e. ship openmpi 3.1.2 with pmix 3.0.1. (openmpi now has a dependency on libpmix, the Process Management Interface for exascale, which handles the launching of processes - up to millions, hierarchically. The openmpi / pmix interface has been flaky, I suspect, and not well tested on non-traditional HPC architectures; e.g. I suspect it's the source of the 32-bit issue.) mpich _can_ be built with pmix but I'm recommending not doing so for Buster.

>> (2) Moving to mpich as default is a transition and should be pushed before
>> the deadline - say setting 30 Sept?
> This is probably a good point to confer with the Release Team, so I'm
> cc:ing them.
>
> Release Team: we have nearly completed the openmpi3 transition. But there is
> a broader question of switching mpi-defaults to mpich instead of openmpi.
> mpich is reported to be more stable than openmpi and is recommended by
> several upstream authors of the HPC software libraries. We have some
> consensus that switching to mpich is probably a good idea, it's just a
> question of timing at this point. Does an MPI / mpich transition overlap
> with other transitions planned for Buster - say hwloc, hdf5?
>
> hdf5 already builds against both openmpi and mpich, so it should not be a
> particular problem. It has had more build failures on the minor arches (with
> the new hdf5 version in experimental), but there's no reason to blame mpich
> for that. I don't know about hwloc, but the builds in experimental look
> clean.
>
> Drew

--
Alastair McKinstry, https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.