Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-12 Thread Emilio Pozuelo Monfort
On 07/12/2018 12:49, Alastair McKinstry wrote:
> Looking into it further, I'm now reluctant to move to mpich as the default
> for buster. One reason is the experience of the openmpi3 transition, which
> shook out many issues.
Ack. Given how long it took to complete the last transition, and how many
packages failed to build on some architectures but not others (which
complicates rebuild testing, as packages may build on amd64 but then fail on
other arches during the transition), I would be very hesitant to ack this
transition at this stage.

So let's release with openmpi as the default, and if you want to change it for
bullseye we can do that.

Cheers,
Emilio



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-12 Thread Alastair McKinstry



On 11/12/2018 18:26, Drew Parsons wrote:

On 2018-12-08 12:31, Drew Parsons wrote:

On 2018-12-07 19:49, Alastair McKinstry wrote:



Looking into it further, I'm now reluctant to move to mpich as the default
for buster. One reason is the experience of the openmpi3 transition,
which shook out many issues.



...


So I think more testing of mpich3 builds with CH4 / pmix / OFI support
is needed, and moving over openmpi -> mpich at this stage is iffy.



Thanks Alastair, your analysis sounds solid.  I'm happy to be patient
with the switch and wait till after buster, especially if pmix support
complicates the matter.  That will make it all the more useful to set
up a test buildd to test the transition.

I'll invite upstream authors who have been promoting mpich over
openmpi to chip in with their experience.



I received this feedback from a PETSc developer:

  "The Open MPI API is better for developers due to better type safety 
(all

  handles in MPICH are typedef'd to int). Most major commercial vendors
  are organized around variants of MPICH (where there is collective ABI
  standardization). Open MPI is more modular so most vendor stuff goes
  into their plugins (for those vendors working with OMPI).

  "I think a good solution for Linux distros (and many others) would 
be to

  make a library that is ABI compatible with OMPI, but dispatches through
  to MPICH.  There exists a (messy) open source demonstration.

  "  https://github.com/cea-hpc/wi4mpi/ "



ABI compatibility sounds like a nice resolution of the dilemma.


Drew


Thanks, that's interesting. Yes, we should work towards ABI compatibility for
Bullseye.
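
To make the handle point concrete, here is a minimal C sketch (the type
names are renamed here purely for illustration; in the real mpi.h headers
both are simply called MPI_Comm):

  /* MPICH: handles are plain ints, which is what the "collective ABI
   * standardization" around MPICH variants builds on. */
  typedef int mpich_MPI_Comm;

  /* Open MPI: handles are pointers to opaque implementation structs, so
   * handle sizes and predefined-constant values differ from MPICH. */
  typedef struct ompi_communicator_t *ompi_MPI_Comm;

  /* A wi4mpi-style compatibility layer therefore has to translate every
   * handle and every predefined constant at the ABI boundary, in both
   * directions. */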


My goals for MPI in Bullseye are:

(1) Symbol versioning. OpenMPI has just released 4.0, but it's too late
to realistically push that into Buster given our experience. I have a
symbol versioning patch that upstream have tentatively agreed to, but it
needs work for non-gcc compilers. I intend to get that into the OpenMPI 5
release, and OpenMPI 5 into Bullseye, aiming to make that the "last" big
OpenMPI transition ...


(2) Ditto for MPICH?

(3) Build using pkg-config instead of the mpicc / mpifort wrappers. As of
now, OpenMPI and MPICH are multiarch-aware, so you can cross-build MPI
applications, but only if you use pkg-config rather than the mpi wrappers
(see the sketch below). Push this work through the MPI stack.
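
For illustration, a build along these lines (a sketch only: "hello.c" is a
placeholder, and the pkg-config module names - ompi-c for Open MPI, mpich
for MPICH - should be checked against what the -dev packages actually ship
in /usr/lib/<triplet>/pkgconfig):

  # Traditional wrapper build (not cross-build friendly):
  mpicc -o hello hello.c

  # pkg-config based native build:
  gcc -o hello hello.c $(pkg-config --cflags --libs ompi-c)

  # Cross-build using Debian's triplet-prefixed pkg-config wrapper
  # (sketch, targeting arm64):
  aarch64-linux-gnu-gcc -o hello hello.c \
      $(aarch64-linux-gnu-pkg-config --cflags --libs ompi-c)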


Best regards

Alastair



--
Alastair McKinstry,
https://diaspora.sceal.ie/u/amckinstry
Commander Vimes didn’t like the phrase “The innocent have nothing to fear,”
believing the innocent had everything to fear, mostly from the guilty but in
the longer term even more from those who say things like “The innocent have
nothing to fear.”
 - T. Pratchett, Snuff



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-11 Thread Drew Parsons

On 2018-12-08 12:31, Drew Parsons wrote:

On 2018-12-07 19:49, Alastair McKinstry wrote:



Looking into it further, I'm now reluctant to move to mpich as the default
for buster. One reason is the experience of the openmpi3 transition,
which shook out many issues.

I suspect we could see the same with other package builds that, as you
point out, are tuned to openmpi rather than mpich, and also with the
feature support for mpich.


...


So I think more testing of mpich3 builds with CH4 / pmix / OFI support
is needed, and moving over openmpi -> mpich at this stage is iffy.



Thanks Alastair, your analysis sounds solid.  I'm happy to be patient
with the switch and wait till after buster, especially if pmix support
complicates the matter.  That will make it all the more useful to set
up a test buildd to test the transition.

I'll invite upstream authors who have been promoting mpich over
openmpi to chip in with their experience.



I received this feedback from a PETSc developer:

  "The Open MPI API is better for developers due to better type safety 
(all

  handles in MPICH are typedef'd to int). Most major commercial vendors
  are organized around variants of MPICH (where there is collective ABI
  standardization). Open MPI is more modular so most vendor stuff goes
  into their plugins (for those vendors working with OMPI).

  "I think a good solution for Linux distros (and many others) would be 
to
  make a library that is ABI compatible with OMPI, but dispatches 
through

  to MPICH.  There exists a (messy) open source demonstration.

  "  https://github.com/cea-hpc/wi4mpi/ "



ABI compatibility sounds like a nice resolution of the dilemma.


Drew





Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-07 Thread Drew Parsons

On 2018-12-07 19:49, Alastair McKinstry wrote:

On 07/12/2018 11:26, Drew Parsons wrote:



Hi Alastair, openmpi3 seems to have stabilised now; packages are
passing tests and libpsm2 is no longer injecting 15-second delays.


Nice that the mpich 3.3 release is now finalised.  Do we feel 
confident proceeding with the switch of mpi-defaults from openmpi to 
mpich?


Are there any known issues with the transition?  One that catches my
eye is the build failures in scalapack.  It has been tuned to pass
build-time tests with openmpi but fails many tests with mpich
(scalapack builds packages for both MPI implementations). I'm not sure
how concerned we should be with those build failures; perhaps upstream
should be consulted.  Are similar mpich failures expected in other
packages?  Is there a simple way of setting up a buildd to do a test
run of the transition before making it official?


Drew


Hi Drew,

Looking into it further, I'm now reluctant to move to mpich as the default
for buster. One reason is the experience of the openmpi3 transition,
which shook out many issues.

I suspect we could see the same with other package builds that, as you
point out, are tuned to openmpi rather than mpich, and also with the
feature support for mpich.

e.g. mpich integration with psm / pmix / slurm is weak (in Debian).
While it might not look important to be able to scale to 10k+ nodes on
Debian (as none of the top500 machines run Debian), we're seeing an
increase in the container use case: building MPI apps within
Singularity containers running on our main machine. We don't run
Debian as the OS on the base supercomputer at work because we need
kernel support from $vendor, but the apps are built in Singularity
containers running Debian ... very large-scale jobs become increasingly
likely, and openmpi / pmix is needed for that. Testing mpich, I've yet
to get CH4 working reliably - it's needed for pmix, and the OFI / UCX
support is labeled 'experimental'.

My driving use case for the move to mpich had been fault tolerance,
needed for co-arrays (https://tracker.debian.org/pkg/open-coarrays),
which in turn are needed for Fortran 2018; but I've since re-done
open-coarrays to build both openmpi and mpich variants, so that issue
went away.

So I think more testing of mpich3 builds with CH4 / pmix / OFI support
is needed, and moving over openmpi -> mpich at this stage is iffy.



Thanks Alastair, your analysis sounds solid.  I'm happy to be patient
with the switch and wait till after buster, especially if pmix support 
complicates the matter.  That will make it all the more useful to set up 
a test buildd to test the transition.


I'll invite upstream authors who have been promoting mpich over openmpi 
to chip in with their experience.


Drew



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-07 Thread Gilles Filippini

Hi,

On 2018-12-07 12:49, Alastair McKinstry wrote:

While it might not look important to be able to scale to 10k+ nodes on
Debian (as none of the top500 machines run Debian)


Some EDF clusters running a custom Debian derivative have been in the 
TOP500 lists since 2012.


_g.



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-07 Thread Alastair McKinstry



On 07/12/2018 11:26, Drew Parsons wrote:



Hi Alastair, openmpi3 seems to have stabilised now; packages are
passing tests and libpsm2 is no longer injecting 15-second delays.


Nice that the mpich 3.3 release is now finalised.  Do we feel 
confident proceeding with the switch of mpi-defaults from openmpi to 
mpich?


Are there any known issues with the transition?  One that catches my
eye is the build failures in scalapack.  It has been tuned to pass
build-time tests with openmpi but fails many tests with mpich
(scalapack builds packages for both MPI implementations). I'm not sure
how concerned we should be with those build failures; perhaps upstream
should be consulted.  Are similar mpich failures expected in other
packages?  Is there a simple way of setting up a buildd to do a test
run of the transition before making it official?


Drew


Hi Drew,

Looking into it further, I'm now reluctant to move to mpich as the default
for buster. One reason is the experience of the openmpi3 transition,
which shook out many issues.


I suspect we could see the same with other package builds that, as you
point out, are tuned to openmpi rather than mpich, and also with the
feature support for mpich.


e.g. mpich integration with psm / pmix / slurm is weak (in Debian).
While it might not look important to be able to scale to 10k+ nodes on
Debian (as none of the top500 machines run Debian), we're seeing an
increase in the container use case: building MPI apps within Singularity
containers running on our main machine. We don't run Debian as the OS on
the base supercomputer at work because we need kernel support from
$vendor, but the apps are built in Singularity containers running Debian
... very large-scale jobs become increasingly likely, and openmpi / pmix
is needed for that. Testing mpich, I've yet to get CH4 working reliably -
it's needed for pmix, and the OFI / UCX support is labeled 'experimental'.


My driving use case for the move to mpich had been fault tolerance,
needed for co-arrays (https://tracker.debian.org/pkg/open-coarrays),
which in turn are needed for Fortran 2018; but I've since re-done
open-coarrays to build both openmpi and mpich variants, so that issue
went away.


So I think more testing of mpich3 builds with CH4 / pmix / OFI support is
needed, and moving over openmpi -> mpich at this stage is iffy.


regards

Alastair

--
Alastair McKinstry,
https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-12-07 Thread Drew Parsons

On 2018-10-04 09:08, Drew Parsons wrote:

On 2018-10-03 21:02, Alastair McKinstry wrote:

Hi,

See thread below.

I've just uploaded 3.1.2-5, which I believe fixes the hangs in OpenMPI
(non-atomic handling of sending a 64-bit tag, occurring mostly
on arches with 32-bit atomics).


Awkward observation: openmpi 3.1.2-5 now causes dolfin tests to
time out (on amd64):
https://ci.debian.net/packages/d/dolfin/unstable/amd64/

Tracker page pings lammps and liggghts as well.




Hi Alastair, openmpi3 seems to have stabilised now; packages are
passing tests and libpsm2 is no longer injecting 15-second delays.


Nice that the mpich 3.3 release is now finalised.  Do we feel confident 
proceeding with the switch of mpi-defaults from openmpi to mpich?


Are there any known issues with the transition?  One that catches my eye
is the build failures in scalapack.  It has been tuned to pass build-time
tests with openmpi but fails many tests with mpich (scalapack builds
packages for both MPI implementations). I'm not sure how concerned we
should be with those build failures; perhaps upstream should be
consulted.  Are similar mpich failures expected in other packages?
Is there a simple way of setting up a buildd to do a test run of the
transition before making it official?


Drew



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-10-03 Thread Drew Parsons

On 2018-10-03 21:02, Alastair McKinstry wrote:

Hi,

See thread below.

I've just uploaded 3.1.2-5, which I believe fixes the hangs in OpenMPI
(non-atomic handling of sending a 64-bit tag, occurring mostly
on arches with 32-bit atomics).


Awkward observation: openmpi 3.1.2-5 now causes dolfin tests to time out
(on amd64):

https://ci.debian.net/packages/d/dolfin/unstable/amd64/

Tracker page pings lammps and liggghts as well.

Drew



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-10-03 Thread Manuel A. Fernandez Montecelo
Hi,

On Wed, 3 Oct 2018 at 16:24, Mattia Rizzolo wrote:
>
> On Wed, Oct 03, 2018 at 02:02:25PM +0100, Alastair McKinstry wrote:
> > Any ideas on how to write the ben tracker script? I think it would work by
> > looking for packages with binaries linked to openmpi rather than mpich, but
> > there are a number of packages that would be false positives (HDF5,
> > open-coarrays, etc.) that build against both.
>
> The bug report I linked in mpi-defaults' README.source contains an
> example.
>
> The false positives could just be handled manually (i.e. listed in the
> transition bug); I don't think there are many that link to both.
>
> Also, please note that mpich never built on riscv64, and is not up to date
> on hppa and ppc64.  I think at least the riscv64 issue should be handled
> first, so as to avoid being in the same situation again when a single weird
> architecture uses a different MPI implementation.
> I'm CCing debian-ri...@lists.debian.org for this, in case you need help.

From the riscv64 camp, I built the current version on hardware and
uploaded it to "unreleased" in debian-ports:
mpich_3.3~b3-2_riscv64.changes

I suspect that the reason it doesn't build is timeouts in the tests,
due to the buildds being qemu-system (i.e. emulated).  The timeouts can
be increased with environment variables:

  test/mpi/runtests.in:if (defined($ENV{"MPITEST_TIMEOUT"})) {
  test/mpi/runtests.in:$defaultTimeLimit = $ENV{"MPITEST_TIMEOUT"};
  test/mpi/runtests.in:if (defined($ENV{"MPITEST_TIMEOUT_MULTIPLIER"})) {
  test/mpi/runtests.in:$defaultTimeLimitMultiplier = $ENV{"MPITEST_TIMEOUT_MULTIPLIER"};

We can try to send a patch if it helps.

Cheers.
-- 
Manuel A. Fernandez Montecelo 



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-10-03 Thread Drew Parsons

On 2018-10-03 22:12, Mattia Rizzolo wrote:


The false positives could just be handled manually (i.e. listed in the
transition bug); I don't think there are many that link to both.

Also, please note that mpich never built on riscv64, and is not up to date
on hppa and ppc64.  I think at least the riscv64 issue should be handled
first, so as to avoid being in the same situation again when a single weird
architecture uses a different MPI implementation.
I'm CCing debian-ri...@lists.debian.org for this, in case you need help.



Another thing to be mindful of is the debci tests, which are growing in
number.  Tests that pass with openmpi won't necessarily pass with mpich,
notwithstanding the working assumption that mpich is more stable.


An example is the build-time tests for scalapack (which builds both
openmpi and mpich versions). Build logs are at
https://buildd.debian.org/status/package.php?p=scalapack . Tests are set
to run against both openmpi and mpich.  On amd64 the openmpi tests pass
(apart from a couple of timeouts), while against mpich 16 out of 96
tests fail (so currently mpich test failures are being ignored). I don't
know whether that means the scalapack tests are making assumptions that
aren't valid under the general MPI specification, but I can guess it
won't be the only package with this challenge.


As far as the ben script goes, it could look at dependencies on
mpi-default-dev as well as libopenmpi3; a rough sketch follows below.
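
Something like the following ben file might be a starting point (a sketch
only: the field syntax and the exact library package names, e.g. libmpich12,
would need checking against the current mpich and mpi-defaults packaging,
and the dual-build packages mentioned above would still have to be listed
manually as false positives):

  title = "mpi-defaults (openmpi -> mpich)";
  is_affected = .depends ~ /libopenmpi3|libmpich12|mpi-default-dev/;
  is_good = .depends ~ /libmpich12/;
  is_bad = .depends ~ /libopenmpi3/;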


Drew



Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-10-03 Thread Mattia Rizzolo
On Wed, Oct 03, 2018 at 02:02:25PM +0100, Alastair McKinstry wrote:
> Any ideas on how to write the ben tracker script? I think it would work by
> looking for packages with binaries linked to openmpi rather than mpich, but
> there are a number of packages that would be false positives (HDF5,
> open-coarrays, etc.) that build against both.

The bug report I linked in mpi-defaults' README.source contains an
example.

The false positives could just be handled manually (i.e. listed in the
transition bug); I don't think there are many that link to both.

Also, please note that mpich never built on riscv64, and is not up to date
on hppa and ppc64.  I think at least the riscv64 issue should be handled
first, so as to avoid being in the same situation again when a single weird
architecture uses a different MPI implementation.
I'm CCing debian-ri...@lists.debian.org for this, in case you need help.

-- 
regards,
Mattia Rizzolo

GPG Key: 66AE 2B4A FCCF 3F52 DA18  4D18 4B04 3FCD B944 4540  .''`.
more about me:  https://mapreri.org : :'  :
Launchpad user: https://launchpad.net/~mapreri  `. `'`
Debian QA page: https://qa.debian.org/developer.php?login=mattia  `-




Re: MPICH as default MPI; WAS: MPI debugging workflows

2018-10-03 Thread Alastair McKinstry

Hi,

See thread below.

I've just uploaded 3.1.2-5, which I believe fixes the hangs in OpenMPI
(non-atomic handling of sending a 64-bit tag, occurring mostly
on arches with 32-bit atomics).


With this, I think it is appropriate to start the ball rolling on making 
mpich the default MPI for buster.


Any objections?

Any ideas on how to write the ben tracker script? I think it would work 
by looking for packages with binaries linked to openmpi rather than 
mpich, but there are a number of packages that would be false positives 
(HDF5, open-coarrays, etc.) that build against both.



regards

Alastair


On 31/08/2018 11:17, Alastair McKinstry wrote:


On 31/08/2018 11:04, Drew Parsons wrote:

On 2018-08-30 14:18, Alastair McKinstry wrote:

On 30/08/2018 09:39, Drew Parsons wrote:


If you want a break from the openmpi angst then go ahead and drop 
mpich 3.3b3 into unstable.  It won't make the overall MPI situation 
any worse... :)


Drew


Ok, I've pushed 3.3b3 to unstable.


Great!


For me there are two concerns:

(1) The current setup (openmpi default) shakes out issues in openmpi3
that should be fixed. It would be good to get that done.


That's fair.  If we're going to "drop" openmpi, it's a good policy to 
leave it in as stable a state as possible.



At this stage it appears there is a remaining "hang" / threading issue
that's affecting 32-bit platforms.


(See #907267.)  Once that's fixed, I'm favouring no further updates
before Buster - i.e. ship openmpi 3.1.2 with pmix 3.0.1.


(openmpi now has a dependency on libpmix, the Process Management
Interface for Exascale, which handles the launching of processes - up to
millions, hierarchically.)


The openmpi / pmix interface has been flaky, I suspect, and not well
tested on non-traditional HPC architectures (e.g. I suspect it's the
source of the 32-bit issue).


mpich _can_ be built with pmix but I'm recommending not doing so for 
Buster.




(2) Moving to mpich as the default is a transition and should be pushed
before the deadline - say, set 30 Sept?


This is probably a good point to confer with the Release Team, so I'm 
cc:ing them.


Release Team: we have nearly completed the openmpi3 transition, but
there is a broader question of switching mpi-defaults to mpich
instead of openmpi.  mpich is reported to be more stable than openmpi
and is recommended by several upstream authors of HPC software
libraries.  We have some consensus that switching to mpich is
probably a good idea; it's just a question of timing at this point.




Does an MPI / mpich transition overlap with other transitions planned
for Buster - say hwloc or hdf5?


hdf5 already builds against both openmpi and mpich, so it should not 
be a particular problem. It has had more build failures on the minor 
arches (with the new hdf5 version in experimental), but there's no 
reason to blame mpich for that.


I don't know about hwloc, but the builds in experimental look clean.

Drew


--
Alastair McKinstry,
https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.