I am not sure about the 64-process flat run; unfortunately I did not save the
logs, since it is easy to rerun. But for 16 processes, here is the plot I got of
KSPSolve time for different numbers of threads.
On 02/13/2018 10:28 AM, Matthew Knepley wrote:
On Tue, Feb 13, 2018 at 11:30 AM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
> On Feb 13, 2018, at 10:12 AM, Mark Adams <mfad...@lbl.gov> wrote:
> FYI, we were able to get hypre with threads working on KNL on
Cori by going down to -O1 optimization. We are getting about 2x
speedup with 4 threads and 16 MPI processes per socket. Not bad.
In other words, using 16 MPI processes with 4 threads per process
is twice as fast as running with 64 MPI processes? Could you send
the -log_view output for these two cases?
Is that what you mean? I took it to mean:
We ran 16 MPI processes and got time T.
We ran 16 MPI processes with 4 threads each and got time T/2.
I would likely eat my shirt if 16x4 was 2x faster than 64.
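For reference, a minimal sketch of the kind of driver that would settle this (the
file name, operator, and problem size below are made up, not Baky's actual setup):
the solve is wrapped in its own logging stage, so the KSPSolve and stage lines in
-log_view can be read straight off and compared between the 64-rank flat run and
the 16-rank x 4-thread run.

/* ksp_time.c (hypothetical): solve a 1D Laplacian with whatever -ksp_type/-pc_type
   is given on the command line; KSPSolve is isolated in its own logging stage so
   the -log_view outputs of the two runs line up. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscLogStage  stage;
  PetscInt       i, Istart, Iend, n = 100000;   /* problem size is a placeholder */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL); CHKERRQ(ierr);

  /* Assemble a 1D Laplacian just so the example is self-contained */
  ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n); CHKERRQ(ierr);
  ierr = MatSetFromOptions(A); CHKERRQ(ierr);
  ierr = MatSetUp(A); CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend); CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     { ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES); CHKERRQ(ierr); }
    if (i < n - 1) { ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES); CHKERRQ(ierr); }
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES); CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &b); CHKERRQ(ierr);
  ierr = VecSet(b, 1.0); CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);

  /* Isolate the solve so -log_view reports it as a separate stage */
  ierr = PetscLogStageRegister("Solve", &stage); CHKERRQ(ierr);
  ierr = PetscLogStagePush(stage); CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
  ierr = PetscLogStagePop(); CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp); CHKERRQ(ierr);
  ierr = VecDestroy(&x); CHKERRQ(ierr);
  ierr = VecDestroy(&b); CHKERRQ(ierr);
  ierr = MatDestroy(&A); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Running the same binary both ways, once on 64 ranks with one thread each and once
on 16 ranks with OMP_NUM_THREADS=4 (the exact srun incantation depends on the
machine), each with -log_view, and comparing the two KSPSolve lines would make
clear which comparison is being made.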
> The error, flatlined or slightly diverging hypre solves,
occurred even in flat MPI runs with openmp=1.
But the answers are wrong as soon as you turn on OpenMP?
> We are going to test the Haswell nodes next.
> On Thu, Jan 25, 2018 at 4:16 PM, Mark Adams <mfad...@lbl.gov> wrote:
> Baky (cc'ed) is getting a strange error on Cori/KNL at NERSC.
Using maint, it runs fine with -with-openmp=0, and it runs fine with
-with-openmp=1 and gamg, but with hypre and -with-openmp=1, even
running with flat MPI, the solver seems to flatline (see attached, and
notice that the residual starts to creep after a few time steps).
> Maybe you can suggest a hypre test that I can run?
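One option, as a hedged sketch rather than a known-good reproducer: a standalone
driver that solves a small SPD system with BoomerAMG and then recomputes the true
residual ||b - A x|| directly from A and x, so a flatlined or creeping solve shows
up independently of the Krylov monitor. The file name and problem are made up; it
assumes a PETSc build with hypre (e.g. --download-hypre).

/* hypre_check.c (hypothetical): solve a 1D Laplacian with BoomerAMG and report
   the explicitly recomputed relative residual and the converged reason. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat                A;
  Vec                x, b, r;
  KSP                ksp;
  PC                 pc;
  PetscInt           i, Istart, Iend, its, n = 100000;  /* placeholder size */
  PetscReal          rnorm, bnorm;
  KSPConvergedReason reason;
  PetscErrorCode     ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* 1D Laplacian, just to have a simple SPD operator */
  ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n); CHKERRQ(ierr);
  ierr = MatSetFromOptions(A); CHKERRQ(ierr);
  ierr = MatSetUp(A); CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend); CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (i > 0)     { ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES); CHKERRQ(ierr); }
    if (i < n - 1) { ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES); CHKERRQ(ierr); }
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES); CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &b); CHKERRQ(ierr);
  ierr = VecDuplicate(b, &r); CHKERRQ(ierr);
  ierr = VecSet(b, 1.0); CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
  ierr = PCSetType(pc, PCHYPRE); CHKERRQ(ierr);
  ierr = PCHYPRESetType(pc, "boomeramg"); CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);

  /* True residual recomputed from A and x, not taken from the Krylov recurrence */
  ierr = MatMult(A, x, r); CHKERRQ(ierr);
  ierr = VecAYPX(r, -1.0, b); CHKERRQ(ierr);
  ierr = VecNorm(r, NORM_2, &rnorm); CHKERRQ(ierr);
  ierr = VecNorm(b, NORM_2, &bnorm); CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp, &its); CHKERRQ(ierr);
  ierr = KSPGetConvergedReason(ksp, &reason); CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "%s after %D iterations, ||b-Ax||/||b|| = %g\n",
                     KSPConvergedReasons[reason], its, (double)(rnorm/bnorm)); CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp); CHKERRQ(ierr);
  ierr = VecDestroy(&r); CHKERRQ(ierr);
  ierr = VecDestroy(&x); CHKERRQ(ierr);
  ierr = VecDestroy(&b); CHKERRQ(ierr);
  ierr = MatDestroy(&A); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Running it with the openmp=0 and openmp=1 builds, adding -ksp_monitor_true_residual
to watch the per-iteration behavior, should show whether the creep Baky sees is
reproducible outside the application.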
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener