Re: [petsc-dev] Question about MPICH device we use

2020-07-27 Thread Jed Brown
Jeff Hammond writes: > On Thu, Jul 23, 2020 at 9:35 PM Satish Balay wrote: > >> On Thu, 23 Jul 2020, Jeff Hammond wrote: >> >> > Open-MPI refuses to let users over subscribe without an extra flag to >> > mpirun. >> >> Yes - and when using this flag - it lets the run through - but there is >> sti

Re: [petsc-dev] Question about MPICH device we use

2020-07-26 Thread Jeff Hammond
On Thu, Jul 23, 2020 at 9:35 PM Satish Balay wrote: > On Thu, 23 Jul 2020, Jeff Hammond wrote: > > > Open-MPI refuses to let users over subscribe without an extra flag to > > mpirun. > > Yes - and when using this flag - it lets the run through - but there is > still performance degradation in ove

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Satish Balay via petsc-dev
I have this change at: https://gitlab.com/petsc/petsc/-/merge_requests/2990 Satish On Thu, 23 Jul 2020, Satish Balay via petsc-dev wrote: > The primary reason is for users - developing on laptops/desktop and doing > development runs in oversubscribed mode. > > The choice was few percent loss

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Satish Balay via petsc-dev
Should also note: the test suite is also run by users - not just CI. Only yesterday I suggested that Oana try nemesis for a different issue [on WSL] - and the response was 'the test suite is slow', so she reverted back to sock [and tried a different workaround for that issue] Satish On Thu, 23 Jul 2020,
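
For context, a minimal sketch of how the test suite mentioned above is typically run by a user; the make targets are assumptions and may differ between PETSc versions:

    # quick sanity check of a fresh PETSc build
    make check
    # full test harness -- the run reported here as slow under ch3:nemesis on WSL
    make test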

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Junchao Zhang
On Thu, Jul 23, 2020 at 11:35 PM Satish Balay via petsc-dev < petsc-dev@mcs.anl.gov> wrote: > On Thu, 23 Jul 2020, Jeff Hammond wrote: > > > Open-MPI refuses to let users over subscribe without an extra flag to > > mpirun. > > Yes - and when using this flag - it lets the run through - but there is

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Satish Balay via petsc-dev
On Thu, 23 Jul 2020, Jeff Hammond wrote: > Open-MPI refuses to let users over subscribe without an extra flag to > mpirun. Yes - and when using this flag - it lets the run through - but there is still performance degradation in oversubscribe mode. > I think Intel MPI has an option for blocking

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Satish Balay via petsc-dev
The primary reason is for users - developing on laptops/desktops and doing development runs in oversubscribed mode. The choice was a few percent loss in performance for sock vs exponential cost for oversubscribed usage of nemesis [so we defaulted to sock]. I think we should preserve this behavior

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Jeff Hammond
Open-MPI refuses to let users oversubscribe without an extra flag to mpirun. I think Intel MPI has an option for blocking poll that supports oversubscription “nicely”. MPICH might have a “no local” option that disables shared memory, in which case nemesis over libfabric with the sockets or TCP pro
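
A hedged sketch of the launcher knobs referred to here. Only the Open MPI flag is certain; the Intel MPI variable is an assumption (it existed in older releases), and the MPICH "no local" control is left vague because its exact name varies between releases:

    # Open MPI: mpirun/mpiexec refuses to oversubscribe unless told to
    mpiexec --oversubscribe -n 8 ./app
    # Intel MPI (older versions): blocking wait instead of busy polling -- assumed variable name
    I_MPI_WAIT_MODE=1 mpiexec -n 8 ./app
    # MPICH: a "no local" option disabling shared-memory communication exists as a CVAR;
    # check the documentation of your MPICH build for the exact name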

Re: [petsc-dev] Question about MPICH device we use

2020-07-23 Thread Jed Brown
I think we should default to ch3:nemesis when --download-mpich, and only do ch3:sock when requested (which we would do in CI). Satish Balay via petsc-dev writes: > Primarily because ch3:sock performance does not degrade in oversubscribe mode > - which is developer friendly - i.e on your laptop
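
A sketch of what that split could look like on the configure side, using the --download-mpich-device option named later in this thread (illustrative invocations, not taken from the thread itself):

    # proposed default: nemesis for better on-node performance
    ./configure --download-mpich --download-mpich-device=ch3:nemesis
    # explicit opt-in to sock, e.g. for CI or heavily oversubscribed developer runs
    ./configure --download-mpich --download-mpich-device=ch3:sock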

Re: [petsc-dev] Question about MPICH device we use

2020-07-22 Thread Matthew Knepley
On Wed, Jul 22, 2020 at 2:34 PM Satish Balay wrote: > BTW: Last we compared performance [many years ago] the difference was not > a factor of 2 - but a few percentage points. > > Which petsc example can we re-run to compare? > Not sure. Scott I think was running his new MHD stuff. His post is on

Re: [petsc-dev] Question about MPICH device we use

2020-07-22 Thread Satish Balay via petsc-dev
BTW: Last we compared performance [many years ago] the difference was not a factor of 2 - but a few percentage points. Which petsc example can we re-run to compare? Satish On Wed, 22 Jul 2020, Satish Balay via petsc-dev wrote: > Primarily because ch3:sock performance does not degrade in oversu
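
One possible comparison run, sketched under the assumption that a standard SNES tutorial is a fair proxy; the example, path, and options are illustrative and not from the thread (the directory is src/snes/examples/tutorials in older PETSc releases):

    # build PETSc once per MPICH device, then time the same run with -log_view
    cd src/snes/tutorials && make ex19
    mpiexec -n 4 ./ex19 -da_refine 5 -log_view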

Re: [petsc-dev] Question about MPICH device we use

2020-07-22 Thread Satish Balay via petsc-dev
Primarily because ch3:sock performance does not degrade in oversubscribe mode - which is developer friendly - i.e. on your laptop. And folks doing optimized runs should use a properly tuned MPI for their setup anyway. In this case --download-mpich-device=ch3:nemesis is likely appropriate if usin

Re: [petsc-dev] Question about MPICH device we use

2020-07-22 Thread Barry Smith
Because nemesis does not function in a reasonable way when the system is oversubscribed, that is, when there are more MPI ranks than cores. Many users getting started have whatever number of cores (4?) but want to run tests with a few more ranks; this is not feasible with nemesis. In addition with the

[petsc-dev] Question about MPICH device we use

2020-07-22 Thread Matthew Knepley
We default to ch3:sock. Scott MacLachlan just had a long thread on the Firedrake list where it ended up that reconfiguring using ch3:nemesis gave a 2x performance boost on his 16-core proc, and a noticeable effect on the 4-core speedup. Why do we default to sock? Thanks, Matt -- What most