> On 4 May 2020, at 12:24, Matthew Knepley <[email protected]> wrote:
>
>
> On Mon, May 4, 2020 at 6:12 AM Jose E. Roman <[email protected]> wrote:
>
>
> > On 4 May 2020, at 12:06, Matthew Knepley <[email protected]> wrote:
> >
> > On Mon, May 4, 2020 at 3:51 AM Walker Andreas <[email protected]>
> > wrote:
> > Hey everyone,
> >
> > I wanted to give you a short update on this:
> >
> > - As suggested by Matt, I played around with the distribution of my cores
> > over nodes and changed to using one core per node only.
> > - After experimenting with a monolithic matrix instead of a MATNEST
> > object, I observed that the monolithic matrix showed better speedup, so I
> > switched to building my problem matrix as a monolithic matrix.
> > - I keep the subspace for the solver small, which slightly improves the
> > runtimes.
> >
> > After these changes, I get near-perfect scaling with speedups above 56 for
> > 64 cores (1 core/node) for example. Unfortunately I can’t really tell which
> > of the above changes contributed how much to this improvement.
> >
> > Anyway, thanks everyone for your help!
> >
> > Great! I am glad it's scaling well.
> >
> > 1) You should be able to increase the number of cores per node as much as
> > you want, as long as you evaluate the scaling in terms of nodes, meaning
> > use that as the baseline.
> >
> > 2) You can see how many cores per node are used efficiently by running
> > 'make streams'.
> >
> > 3) We would be really interested in seeing the logs from runs where MATNEST
> > scales worse than AIJ. Maybe we could fix that.
>
> Matt, this last issue is due to SLEPc, I am preparing a fix.
>
> Thanks! I am interested to see it.
Have a look at MR 50: https://gitlab.com/slepc/slepc/-/merge_requests/50
In the test example I have made it seems that the MatNest is not using VecNest
internally. How can I modify the example to use VecNest?
Jose
>
> Matt
>
> >
> > Thanks,
> >
> > Matt
> >
> > Best regards and stay healthy
> >
> > Andreas
> >
> >> On 01.05.2020 at 14:45, Matthew Knepley <[email protected]> wrote:
> >>
> >> On Fri, May 1, 2020 at 8:32 AM Walker Andreas <[email protected]>
> >> wrote:
> >> Hi Jed, Hi Jose,
> >>
> >> Thank you very much for your suggestions.
> >>
> >> - I tried reducing the subspace to 64 which indeed reduced the runtime by
> >> around 20 percent (sometimes more) for 128 cores. I will check what the
> >> effect on the sequential runtime is.
> >> - Regarding MatNest, I can just look for the eigenvalues of a submatrix
> >> to see how the speedup is affected; I will check that. Replacing the full
> >> matnest with a contiguous matrix is definitely more work but, if it
> >> improves the performance, worth the work (we assume that the program will
> >> be reused a lot).
> >> - PETSc is configured with MUMPS, OpenBLAS, ScaLAPACK (among others). But
> >> I noticed no significant difference compared to when PETSc is configured
> >> without them.
> >> - The number of iterations required by the solver does not depend on the
> >> number of cores.
> >>
> >> Best regards and many thanks,
> >>
> >> Let me just address something from a high level. These operations are not
> >> compute limited (for the most part), but limited by
> >> bandwidth. Bandwidth is allocated by node, not by core, on these machines.
> >> That is why it is important to understand how many
> >> nodes you are using, not cores. A useful scaling test would be to fill up
> >> a single node (however many cores fit on one node), and
> >> then increase the # of nodes. We would expect close to linear scaling in
> >> that case.
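
A minimal sketch of the node-based scaling test described above, assuming
one wall-clock time per run with every node filled up. The function name
and the timings are hypothetical placeholders, not measurements from this
thread; only the idea of using one full node as the baseline comes from the
text.

```python
def node_scaling(times_by_nodes):
    """Return {nodes: (speedup, efficiency)} relative to the smallest node count."""
    base_nodes = min(times_by_nodes)
    base_time = times_by_nodes[base_nodes]
    result = {}
    for nodes, t in sorted(times_by_nodes.items()):
        speedup = base_time / t          # measured against the full-node baseline
        ideal = nodes / base_nodes       # linear scaling in the number of nodes
        result[nodes] = (speedup, speedup / ideal)
    return result


# Hypothetical wall-clock times in seconds (NOT measured, illustration only),
# each taken with all cores of every node in use.
timings = {1: 800.0, 2: 410.0, 4: 215.0, 8: 118.0}

for nodes, (s, e) in node_scaling(timings).items():
    print(f"{nodes} node(s): speedup {s:.2f}, efficiency {e:.2f}")
```

An efficiency close to 1 in the last column would correspond to the
near-linear scaling in nodes that is expected here.
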
> >>
> >> Thanks,
> >>
> >> Matt
> >>
> >> Andreas Walker
> >>
> >> > On 01.05.2020 at 14:12, Jed Brown <[email protected]> wrote:
> >> >
> >> > "Jose E. Roman" <[email protected]> writes:
> >> >
> >> >> Comments related to PETSc:
> >> >>
> >> >> - If you look at the "Reduct" column you will see that MatMult() is
> >> >> doing a lot of global reductions, which is bad for scaling. This is due
> >> >> to MATNEST (other Mat types do not do that). I don't know the details
> >> >> of MATNEST, maybe Matt can comment on this.
> >> >
> >> > It is not intrinsic to MatNest, though use of MatNest incurs extra
> >> > VecScatter costs. If you use MatNest without VecNest, then
> >> > VecGetSubVector incurs significant cost (including reductions). I
> >> > suspect it's likely that some SLEPc functionality is not available with
> >> > VecNest. A better option would be to optimize VecGetSubVector by
> >> > caching the IS and subvector, at least in the contiguous case.
> >> >
> >> > How difficult would it be for you to run with a monolithic matrix
> >> > instead of MatNest? It would certainly be better at amortizing
> >> > communication costs.
> >> >
> >> >>
> >> >> Comments related to SLEPc.
> >> >>
> >> >> - The last rows (DSSolve, DSVectors, DSOther) correspond to
> >> >> "sequential" computations. In your case they take a non-negligible time
> >> >> (around 30 seconds). You can try to reduce this time by reducing the
> >> >> size of the projected problem, e.g. running with -eps_nev 100 -eps_mpd
> >> >> 64 (see
> >> >> https://slepc.upv.es/documentation/current/docs/manualpages/EPS/EPSSetDimensions.html
> >> >> )
> >> >>
> >> >> - In my previous comment about multithreaded BLAS, I was referring to
> >> >> configuring PETSc with MKL, OpenBLAS or similar. But anyway, I don't
> >> >> think this is relevant here.
> >> >>
> >> >> - Regarding the number of iterations, yes the number of iterations
> >> >> should be the same for different runs if you keep the same number of
> >> >> processes, but when you change the number of processes there might be
> >> >> significant differences for some problems, that is the rationale of my
> >> >> suggestion. Anyway, in your case the fluctuation does not seem very
> >> >> important.
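
To get a feeling for how much the "sequential" DS rows can limit scaling,
here is a rough Amdahl-style estimate. It uses the DSSolve, DSVectors and
DSOther times and the Main Stage time from the 1-process -log_view output
quoted in this thread. The assumptions are mine: the DS time stays constant
as processes are added, and everything else scales ideally. The function
name is hypothetical.

```python
def amdahl_speedup(total_time, serial_time, procs):
    """Speedup if only (total_time - serial_time) is divided among procs."""
    parallel_time = total_time - serial_time
    return total_time / (serial_time + parallel_time / procs)


# Times in seconds, copied from the 1-process log in this thread.
total = 3.0925e+04                               # Main Stage
ds_time = 1.7363e+01 + 1.2353e-01 + 9.8627e+00   # DSSolve + DSVectors + DSOther

# Assumption: DS operations do not speed up at all, the rest scales perfectly.
for p in (64, 128, 256):
    bound = amdahl_speedup(total, ds_time, p)
    print(f"{p} processes: speedup at most about {bound:.0f}")
```

This is only an upper bound under those assumptions. Reducing the size of
the projected problem lowers ds_time and so raises the bound.
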
> >> >>
> >> >> Jose
> >> >>
> >> >>
> >> >>> On 1 May 2020, at 10:07, Walker Andreas <[email protected]>
> >> >>> wrote:
> >> >>>
> >> >>> Hi Matthew,
> >> >>>
> >> >>> I just ran the same program on a single core. You can see the output
> >> >>> of -log_view below. As I see it, most functions have speedups of
> >> >>> around 50 for 128 cores, also functions like matmult etc.
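
As a quick check of the overall figure, the total speedup can be computed
from the "Time (sec)" lines of the two -log_view outputs in this thread
(1 process and 128 processes). A small sketch, with variable names of my
choosing:

```python
t_serial = 3.092e+04     # "Time (sec)" Max, 1-process log
t_parallel = 6.209e+02   # "Time (sec)" Max, 128-process log
cores = 128

speedup = t_serial / t_parallel
efficiency = speedup / cores   # fraction of ideal linear speedup

print(f"speedup {speedup:.1f}, efficiency {efficiency:.2f}")
```

The same ratio can be formed per event (for example MatMult) by dividing
the time of a row in the 1-process log by the time of the same row in the
128-process log.
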
> >> >>>
> >> >>> Best regards,
> >> >>>
> >> >>> Andreas
> >> >>>
> >> >>> ************************************************************************************************************************
> >> >>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r
> >> >>> -fCourier9' to print this document ***
> >> >>> ************************************************************************************************************************
> >> >>>
> >> >>> ---------------------------------------------- PETSc Performance
> >> >>> Summary: ----------------------------------------------
> >> >>>
> >> >>> ./Solver on a named eu-a6-011-09 with 1 processor, by awalker Fri May
> >> >>> 1 04:03:07 2020
> >> >>> Using Petsc Release Version 3.10.5, Mar, 28, 2019
> >> >>>
> >> >>> Max Max/Min Avg Total
> >> >>> Time (sec): 3.092e+04 1.000 3.092e+04
> >> >>> Objects: 6.099e+05 1.000 6.099e+05
> >> >>> Flop: 9.313e+13 1.000 9.313e+13 9.313e+13
> >> >>> Flop/sec: 3.012e+09 1.000 3.012e+09 3.012e+09
> >> >>> MPI Messages: 0.000e+00 0.000 0.000e+00 0.000e+00
> >> >>> MPI Message Lengths: 0.000e+00 0.000 0.000e+00 0.000e+00
> >> >>> MPI Reductions: 0.000e+00 0.000
> >> >>>
> >> >>> Flop counting convention: 1 flop = 1 real number operation of type
> >> >>> (multiply/divide/add/subtract)
> >> >>> e.g., VecAXPY() for real vectors of length
> >> >>> N --> 2N flop
> >> >>> and VecAXPY() for complex vectors of length
> >> >>> N --> 8N flop
> >> >>>
> >> >>> Summary of Stages: ----- Time ------ ----- Flop ------ ---
> >> >>> Messages --- -- Message Lengths -- -- Reductions --
> >> >>> Avg %Total Avg %Total Count
> >> >>> %Total Avg %Total Count %Total
> >> >>> 0: Main Stage: 3.0925e+04 100.0% 9.3134e+13 100.0% 0.000e+00
> >> >>> 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
> >> >>>
> >> >>> ------------------------------------------------------------------------------------------------------------------------
> >> >>> See the 'Profiling' chapter of the users' manual for details on
> >> >>> interpreting output.
> >> >>> Phase summary info:
> >> >>> Count: number of times phase was executed
> >> >>> Time and Flop: Max - maximum over all processors
> >> >>> Ratio - ratio of maximum to minimum over all
> >> >>> processors
> >> >>> Mess: number of messages sent
> >> >>> AvgLen: average message length (bytes)
> >> >>> Reduct: number of global reductions
> >> >>> Global: entire computation
> >> >>> Stage: stages of a computation. Set stages with PetscLogStagePush()
> >> >>> and PetscLogStagePop().
> >> >>> %T - percent time in this phase %F - percent flop in this
> >> >>> phase
> >> >>> %M - percent messages in this phase %L - percent message
> >> >>> lengths in this phase
> >> >>> %R - percent reductions in this phase
> >> >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time
> >> >>> over all processors)
> >> >>> ------------------------------------------------------------------------------------------------------------------------
> >> >>> Event Count Time (sec) Flop
> >> >>> --- Global --- --- Stage ---- Total
> >> >>> Max Ratio Max Ratio Max Ratio Mess AvgLen
> >> >>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
> >> >>> ------------------------------------------------------------------------------------------------------------------------
> >> >>>
> >> >>> --- Event Stage 0: Main Stage
> >> >>>
> >> >>> MatMult 152338 1.0 8.2799e+03 1.0 8.20e+12 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 27 9 0 0 0 27 9 0 0 0 990
> >> >>> MatMultAdd 609352 1.0 8.1229e+03 1.0 8.20e+12 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 26 9 0 0 0 26 9 0 0 0 1010
> >> >>> MatConvert 30 1.0 1.5797e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatScale 10 1.0 4.7172e-02 1.0 6.73e+07 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1426
> >> >>> MatAssemblyBegin 516 1.0 2.0695e-04 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatAssemblyEnd 516 1.0 2.8933e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatZeroEntries 2 1.0 3.6038e-02 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatView 10 1.0 2.4422e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatAXPY 40 1.0 3.1595e-01 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatMatMult 60 1.0 1.3723e+01 1.0 1.24e+09 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 90
> >> >>> MatMatMultSym 100 1.0 1.3651e+01 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatMatMultNum 100 1.0 7.5159e+00 1.0 2.06e+09 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 274
> >> >>> MatMatMatMult 40 1.0 1.8674e+01 1.0 1.66e+09 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 89
> >> >>> MatMatMatMultSym 40 1.0 1.1848e+01 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> MatMatMatMultNum 40 1.0 6.8266e+00 1.0 1.66e+09 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 243
> >> >>> MatPtAP 40 1.0 1.9042e+01 1.0 1.66e+09 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 87
> >> >>> MatTrnMatMult 40 1.0 7.7990e+00 1.0 8.24e+08 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 106
> >> >>> DMPlexStratify 1 1.0 5.1223e-02 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> DMPlexPrealloc 2 1.0 1.5242e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> VecSet 914053 1.0 1.4929e+02 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> VecAssemblyBegin 1 1.0 1.3411e-07 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> VecAssemblyEnd 1 1.0 8.0094e-08 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> VecScatterBegin 1 1.0 2.6399e-04 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> VecSetRandom 10 1.0 8.6088e-02 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> EPSSetUp 10 1.0 2.9988e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> EPSSolve 10 1.0 2.8695e+04 1.0 9.31e+13 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 93100 0 0 0 93100 0 0 0 3246
> >> >>> STSetUp 10 1.0 9.7291e-05 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> STApply 152338 1.0 8.2803e+03 1.0 8.20e+12 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 27 9 0 0 0 27 9 0 0 0 990
> >> >>> BVCopy 1814 1.0 1.1076e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> BVMultVec 304639 1.0 9.8281e+03 1.0 3.34e+13 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 32 36 0 0 0 32 36 0 0 0 3397
> >> >>> BVMultInPlace 1824 1.0 7.0999e+02 1.0 1.79e+13 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 2 19 0 0 0 2 19 0 0 0 25213
> >> >>> BVDotVec 304639 1.0 9.8037e+03 1.0 3.36e+13 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 32 36 0 0 0 32 36 0 0 0 3427
> >> >>> BVOrthogonalizeV 152348 1.0 1.9633e+04 1.0 6.70e+13 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 63 72 0 0 0 63 72 0 0 0 3411
> >> >>> BVScale 152348 1.0 3.7888e+01 1.0 5.32e+10 1.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1403
> >> >>> BVSetRandom 10 1.0 8.6364e-02 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> DSSolve 1824 1.0 1.7363e+01 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> DSVectors 2797 1.0 1.2353e-01 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> DSOther 1824 1.0 9.8627e+00 1.0 0.00e+00 0.0 0.0e+00
> >> >>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>> ------------------------------------------------------------------------------------------------------------------------
> >> >>>
> >> >>> Memory usage is given in bytes:
> >> >>>
> >> >>> Object Type Creations Destructions Memory Descendants'
> >> >>> Mem.
> >> >>> Reports information only for process 0.
> >> >>>
> >> >>> --- Event Stage 0: Main Stage
> >> >>>
> >> >>> Container 1 1 584 0.
> >> >>> Distributed Mesh 1 1 5184 0.
> >> >>> GraphPartitioner 1 1 624 0.
> >> >>> Matrix 320 320 3469402576 0.
> >> >>> Index Set 53 53 2777932 0.
> >> >>> IS L to G Mapping 1 1 249320 0.
> >> >>> Section 13 11 7920 0.
> >> >>> Star Forest Graph 6 6 4896 0.
> >> >>> Discrete System 1 1 936 0.
> >> >>> Vector 609405 609405 857220847896 0.
> >> >>> Vec Scatter 1 1 704 0.
> >> >>> Viewer 22 11 9328 0.
> >> >>> EPS Solver 10 10 86360 0.
> >> >>> Spectral Transform 10 10 8400 0.
> >> >>> Basis Vectors 10 10 530336 0.
> >> >>> PetscRandom 10 10 6540 0.
> >> >>> Region 10 10 6800 0.
> >> >>> Direct Solver 10 10 9838880 0.
> >> >>> Krylov Solver 10 10 13920 0.
> >> >>> Preconditioner 10 10 10080 0.
> >> >>> ========================================================================================================================
> >> >>> Average time to get PetscTime(): 2.50991e-08
> >> >>> #PETSc Option Table entries:
> >> >>> -config=benchmark3.json
> >> >>> -eps_converged_reason
> >> >>> -log_view
> >> >>> #End of PETSc Option Table entries
> >> >>> Compiled without FORTRAN kernels
> >> >>> Compiled with full precision matrices (default)
> >> >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> >> >>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> >> >>> Configure options:
> >> >>> --prefix=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
> >> >>> --with-ssl=0 --download-c2html=0 --download-sowing=0
> >> >>> --download-hwloc=0 CFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC
> >> >>> -mavx2" FFLAGS= CXXFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC
> >> >>> -mavx2"
> >> >>> --with-cc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
> >> >>>
> >> >>> --with-cxx=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpic++
> >> >>>
> >> >>> --with-fc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
> >> >>> --with-precision=double --with-scalar-type=real
> >> >>> --with-shared-libraries=1 --with-debugging=0 --with-64-bit-indices=0
> >> >>> COPTFLAGS= FOPTFLAGS= CXXOPTFLAGS=
> >> >>> --with-blaslapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib/libopenblas.so
> >> >>> --with-x=0 --with-cxx-dialect=C++11 --with-boost=1 --with-clanguage=C
> >> >>> --with-scalapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib/libscalapack.so
> >> >>> --with-scalapack=1 --with-metis=1
> >> >>> --with-metis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk
> >> >>> --with-hdf5=1
> >> >>> --with-hdf5-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5
> >> >>> --with-hypre=1
> >> >>> --with-hypre-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne
> >> >>> --with-parmetis=1
> >> >>> --with-parmetis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4
> >> >>> --with-mumps=1
> >> >>> --with-mumps-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b
> >> >>> --with-trilinos=1
> >> >>> --with-trilinos-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo
> >> >>> --with-fftw=0 --with-cxx-dialect=C++11
> >> >>> --with-superlu_dist-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include
> >> >>>
> >> >>> --with-superlu_dist-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib/libsuperlu_dist.a
> >> >>> --with-superlu_dist=1
> >> >>> --with-suitesparse-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include
> >> >>>
> >> >>> --with-suitesparse-lib="/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libumfpack.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libklu.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcholmod.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libbtf.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libccolamd.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcolamd.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcamd.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libamd.so
> >> >>>
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libsuitesparseconfig.so
> >> >>> /lib64/librt.so" --with-suitesparse=1
> >> >>> --with-zlib-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include
> >> >>>
> >> >>> --with-zlib-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib/libz.so
> >> >>> --with-zlib=1
> >> >>> -----------------------------------------
> >> >>> Libraries compiled on 2020-01-22 15:21:53 on eu-c7-051-02
> >> >>> Machine characteristics:
> >> >>> Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core
> >> >>> Using PETSc directory:
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
> >> >>> Using PETSc arch:
> >> >>> -----------------------------------------
> >> >>>
> >> >>> Using C compiler:
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
> >> >>> -ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2
> >> >>> Using Fortran compiler:
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
> >> >>>
> >> >>> -----------------------------------------
> >> >>>
> >> >>> Using include paths:
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/include
> >> >>>
> >> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include
> >> >>> -----------------------------------------
> >> >>>
> >> >>> Using C linker:
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
> >> >>> Using Fortran linker:
> >> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
> >> >>> Using libraries:
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib
> >> >>> -lpetsc
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib
> >> >>> /lib64/librt.so
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib
> >> >>> -Wl,-rpath,/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib
> >> >>> -L/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib:/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64
> >> >>>
> >> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib
> >> >>>
> >> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib
> >> >>> -lmuelu-adapters -lmuelu-interface -lmuelu -lstratimikos
> >> >>> -lstratimikosbelos -lstratimikosaztecoo -lstratimikosamesos
> >> >>> -lstratimikosml -lstratimikosifpack -lModeLaplace -lanasaziepetra
> >> >>> -lanasazi -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes
> >> >>> -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs
> >> >>> -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis
> >> >>> -lexoIIv2for32 -lexodus_for -lexodus -lmapvarlib -lsuplib_cpp
> >> >>> -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco -lio_info_lib
> >> >>> -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac -lIopx
> >> >>> -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus
> >> >>> -lbelosxpetra -lbelosepetra -lbelos -lml -lifpack -lpamgen_extras
> >> >>> -lpamgen -lamesos -lgaleri-xpetra -lgaleri-epetra -laztecoo
> >> >>> -lisorropia -lxpetra-sup -lxpetra -lthyraepetraext -lthyraepetra
> >> >>> -lthyracore -lthyraepetraext -lthyraepetra -lthyracore -lepetraext
> >> >>> -ltrilinosss -ltriutils -lzoltan -lepetra -lsacado -lrtop
> >> >>> -lkokkoskernels -lteuchoskokkoscomm -lteuchoskokkoscompat
> >> >>> -lteuchosremainder -lteuchosnumerics -lteuchoscomm
> >> >>> -lteuchosparameterlist -lteuchosparser -lteuchoscore
> >> >>> -lteuchoskokkoscomm -lteuchoskokkoscompat -lteuchosremainder
> >> >>> -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist -lteuchosparser
> >> >>> -lteuchoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore
> >> >>> -lkokkosalgorithms -lkokkoscontainers -lkokkoscore -lgtest -lpthread
> >> >>> -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack
> >> >>> -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd
> >> >>> -lsuitesparseconfig -lsuperlu_dist -lHYPRE -lopenblas -lhdf5hl_fortran
> >> >>> -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis -lm -lz -lstdc++
> >> >>> -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
> >> >>> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++
> >> >>> -ldl
> >> >>> -----------------------------------------
> >> >>>
> >> >>>
> >> >>>> On 30.04.2020 at 17:14, Matthew Knepley <[email protected]> wrote:
> >> >>>>
> >> >>>> On Thu, Apr 30, 2020 at 10:55 AM Walker Andreas
> >> >>>> <[email protected]> wrote:
> >> >>>> Hello everyone,
> >> >>>>
> >> >>>> I have used SLEPc successfully on a FEM-related project. Even though
> >> >>>> it is very powerful overall, the speedup I measure is a bit below my
> >> >>>> expectations. Compared to using a single core, the speedup is for
> >> >>>> example around 1.8 for two cores but only maybe 50-60 for 128 cores
> >> >>>> and maybe 70 or 80 for 256 cores. Some details about my problem:
> >> >>>>
> >> >>>> - The problem is based on meshes with up to 400k degrees of freedom.
> >> >>>> DMPlex is used for organizing it.
> >> >>>> - ParMetis is used to partition the mesh. This yields a stiffness
> >> >>>> matrix where the vast majority of entries is in the diagonal blocks
> >> >>>> (i.e. looking at the rows owned by a core, there is a very dense
> >> >>>> square-shaped region around the diagonal and some loosely scattered
> >> >>>> nonzeroes in the other columns).
> >> >>>> - The actual matrix from which I need eigenvalues is a 2x2 block
> >> >>>> matrix, stored as a MATNEST matrix. Each of these four matrices is
> >> >>>> computed based on the stiffness matrix and has a similar size and
> >> >>>> nonzero pattern. For a mesh of 200k dofs, one such matrix has a size
> >> >>>> of about 174kx174k and on average about 40 nonzeroes per row.
> >> >>>> - I use the default Krylov-Schur solver and look for the 100 smallest
> >> >>>> eigenvalues
> >> >>>> - The output of -log_view for the 200k-dof mesh described above run
> >> >>>> on 128 cores is at the end of this mail.
> >> >>>>
> >> >>>> I noticed that the problem matrices are not perfectly balanced, i.e.
> >> >>>> the number of rows per core might vary between 2500 and 3000, for
> >> >>>> example. But I am not sure if this is the main reason for the poor
> >> >>>> speedup.
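
One rough way to judge how much imbalance alone could cost is to compare
the average work per process with the maximum. The sketch below uses the
"Flop:" line of the 128-process log in this mail (Max 9.230e+11, Avg
7.212e+11, Max/Min 1.816). It assumes, as a simplification, that run time
is proportional to flop count and that the slowest process sets the pace.
The function name is hypothetical.

```python
def balance_efficiency(avg_work, max_work):
    """Upper bound on parallel efficiency when the busiest process sets the pace."""
    return avg_work / max_work


flop_max = 9.230e+11   # "Flop:" Max, 128-process log
flop_avg = 7.212e+11   # "Flop:" Avg, 128-process log

bound = balance_efficiency(flop_avg, flop_max)
print(f"efficiency bound from flop imbalance: {bound:.2f}")
```

Under that assumption, imbalance would explain part of the gap to ideal
speedup but not all of it, since the measured efficiency is lower than this
bound.
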
> >> >>>>
> >> >>>> I tried to reduce the subspace size but without effect. I also
> >> >>>> attempted to use the shift-and-invert spectral transformation but the
> >> >>>> MATNEST-type prevents this.
> >> >>>>
> >> >>>> Are there any suggestions to improve the speedup further or is this
> >> >>>> the maximum speedup that I can expect?
> >> >>>>
> >> >>>> Can you also give us the performance for this problem on one node
> >> >>>> using the same number of cores per node? Then we can calculate speedup
> >> >>>> and look at which functions are not speeding up.
> >> >>>>
> >> >>>> Thanks,
> >> >>>>
> >> >>>> Matt
> >> >>>>
> >> >>>> Thanks a lot in advance,
> >> >>>>
> >> >>>> Andreas Walker
> >> >>>>
> >> >>>> m&m group
> >> >>>> D-MAVT
> >> >>>> ETH Zurich
> >> >>>>
> >> >>>> ************************************************************************************************************************
> >> >>>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
> >> >>>> ************************************************************************************************************************
> >> >>>>
> >> >>>> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
> >> >>>>
> >> >>>> ./Solver on a named eu-g1-050-2 with 128 processors, by awalker Thu Apr 30 15:50:22 2020
> >> >>>> Using Petsc Release Version 3.10.5, Mar, 28, 2019
> >> >>>>
> >> >>>> Max Max/Min Avg Total
> >> >>>> Time (sec): 6.209e+02 1.000 6.209e+02
> >> >>>> Objects: 6.068e+05 1.001 6.063e+05
> >> >>>> Flop: 9.230e+11 1.816 7.212e+11 9.231e+13
> >> >>>> Flop/sec: 1.487e+09 1.816 1.161e+09 1.487e+11
> >> >>>> MPI Messages: 1.451e+07 2.999 8.265e+06 1.058e+09
> >> >>>> MPI Message Lengths: 6.062e+09 2.011 5.029e+02 5.321e+11
> >> >>>> MPI Reductions: 1.512e+06 1.000
> >> >>>>
> >> >>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
> >> >>>>                           e.g., VecAXPY() for real vectors of length N --> 2N flop
> >> >>>>                           and VecAXPY() for complex vectors of length N --> 8N flop
> >> >>>>
> >> >>>> Summary of Stages: ----- Time ------ ----- Flop ------ ---
> >> >>>> Messages --- -- Message Lengths -- -- Reductions --
> >> >>>> Avg %Total Avg %Total Count
> >> >>>> %Total Avg %Total Count %Total
> >> >>>> 0: Main Stage: 6.2090e+02 100.0% 9.2309e+13 100.0% 1.058e+09
> >> >>>> 100.0% 5.029e+02 100.0% 1.512e+06 100.0%
> >> >>>>
> >> >>>> ------------------------------------------------------------------------------------------------------------------------
> >> >>>> See the 'Profiling' chapter of the users' manual for details on interpreting output.
> >> >>>> Phase summary info:
> >> >>>>    Count: number of times phase was executed
> >> >>>>    Time and Flop: Max - maximum over all processors
> >> >>>>                   Ratio - ratio of maximum to minimum over all processors
> >> >>>>    Mess: number of messages sent
> >> >>>>    AvgLen: average message length (bytes)
> >> >>>>    Reduct: number of global reductions
> >> >>>>    Global: entire computation
> >> >>>>    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
> >> >>>>       %T - percent time in this phase         %F - percent flop in this phase
> >> >>>>       %M - percent messages in this phase     %L - percent message lengths in this phase
> >> >>>>       %R - percent reductions in this phase
> >> >>>>    Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
> >> >>>> ------------------------------------------------------------------------------------------------------------------------
> >> >>>> Event Count Time (sec) Flop
> >> >>>> --- Global --- --- Stage ---- Total
> >> >>>> Max Ratio Max Ratio Max Ratio Mess
> >> >>>> AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
> >> >>>> ------------------------------------------------------------------------------------------------------------------------
> >> >>>>
> >> >>>> --- Event Stage 0: Main Stage
> >> >>>>
> >> >>>> BuildTwoSided 20 1.0 2.3249e-01 2.2 0.00e+00 0.0 2.2e+04
> >> >>>> 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> BuildTwoSidedF 317 1.0 8.5016e-01 4.8 0.00e+00 0.0 2.1e+04
> >> >>>> 1.4e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatMult 150986 1.0 2.1963e+02 1.3 8.07e+10 1.8 1.1e+09
> >> >>>> 5.0e+02 1.2e+06 31 9100100 80 31 9100100 80 37007
> >> >>>> MatMultAdd 603944 1.0 1.6209e+02 1.4 8.07e+10 1.8 1.1e+09
> >> >>>> 5.0e+02 0.0e+00 23 9100100 0 23 9100100 0 50145
> >> >>>> MatConvert 30 1.0 1.6488e-02 2.2 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatScale 10 1.0 1.0347e-03 3.9 6.68e+05 1.8 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 65036
> >> >>>> MatAssemblyBegin 916 1.0 8.6715e-01 1.4 0.00e+00 0.0 2.1e+04
> >> >>>> 1.4e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatAssemblyEnd 916 1.0 2.0682e-01 1.1 0.00e+00 0.0 4.7e+05
> >> >>>> 1.3e+02 1.5e+03 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatZeroEntries 42 1.0 7.2787e-03 2.0 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatView 10 1.0 1.4816e+00 1.0 0.00e+00 0.0 6.4e+03
> >> >>>> 1.3e+05 3.0e+01 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatAXPY 40 1.0 1.0752e-02 1.9 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatTranspose 80 1.0 3.0198e-03 1.4 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatMatMult 60 1.0 3.0391e-01 1.0 7.82e+06 1.6 3.8e+05
> >> >>>> 2.8e+02 7.8e+02 0 0 0 0 0 0 0 0 0 0 2711
> >> >>>> MatMatMultSym 60 1.0 2.4238e-01 1.0 0.00e+00 0.0 3.3e+05
> >> >>>> 2.4e+02 7.2e+02 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatMatMultNum 60 1.0 5.8508e-02 1.0 7.82e+06 1.6 4.7e+04
> >> >>>> 5.7e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 14084
> >> >>>> MatPtAP 40 1.0 4.5617e-01 1.0 1.59e+07 1.6 3.3e+05
> >> >>>> 1.0e+03 6.4e+02 0 0 0 0 0 0 0 0 0 0 3649
> >> >>>> MatPtAPSymbolic 40 1.0 2.6002e-01 1.0 0.00e+00 0.0 1.7e+05
> >> >>>> 6.5e+02 2.8e+02 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatPtAPNumeric 40 1.0 1.9293e-01 1.0 1.59e+07 1.6 1.5e+05
> >> >>>> 1.5e+03 3.2e+02 0 0 0 0 0 0 0 0 0 0 8629
> >> >>>> MatTrnMatMult 40 1.0 2.3801e-01 1.0 6.09e+06 1.8 1.8e+05
> >> >>>> 1.0e+03 6.4e+02 0 0 0 0 0 0 0 0 0 0 2442
> >> >>>> MatTrnMatMultSym 40 1.0 1.6962e-01 1.0 0.00e+00 0.0 1.7e+05
> >> >>>> 4.4e+02 6.4e+02 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatTrnMatMultNum 40 1.0 6.9000e-02 1.0 6.09e+06 1.8 9.7e+03
> >> >>>> 1.1e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 8425
> >> >>>> MatGetLocalMat 240 1.0 4.9149e-02 1.6 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatGetBrAoCol 160 1.0 2.0470e-02 1.6 0.00e+00 0.0 3.3e+05
> >> >>>> 4.1e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> MatTranspose_SeqAIJ_FAST 80 1.0 2.9940e-03 1.4 0.00e+00 0.0
> >> >>>> 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> Mesh Partition 1 1.0 1.4825e+00 1.0 0.00e+00 0.0 9.8e+04
> >> >>>> 6.9e+01 6.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> Mesh Migration 1 1.0 3.6680e-02 1.0 0.00e+00 0.0 1.5e+03
> >> >>>> 1.4e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DMPlexDistribute 1 1.0 1.5269e+00 1.0 0.00e+00 0.0 1.0e+05
> >> >>>> 3.5e+02 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DMPlexDistCones 1 1.0 1.8845e-02 1.2 0.00e+00 0.0 1.0e+03
> >> >>>> 1.7e+04 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DMPlexDistLabels 1 1.0 9.7280e-04 1.2 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DMPlexDistData 1 1.0 3.1499e-01 1.4 0.00e+00 0.0 9.8e+04
> >> >>>> 4.3e+01 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DMPlexStratify 2 1.0 9.3421e-02 1.8 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DMPlexPrealloc 2 1.0 3.5980e-02 1.0 0.00e+00 0.0 4.0e+04
> >> >>>> 1.8e+03 3.0e+01 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFSetGraph 20 1.0 1.6069e-05 2.0 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFSetUp 20 1.0 2.8043e-01 1.9 0.00e+00 0.0 6.7e+04
> >> >>>> 5.0e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFBcastBegin 25 1.0 3.9653e-02 2.5 0.00e+00 0.0 6.1e+04
> >> >>>> 4.9e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFBcastEnd 25 1.0 9.0128e-02 1.6 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFReduceBegin 10 1.0 4.3473e-04 5.5 0.00e+00 0.0 7.4e+03
> >> >>>> 4.0e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFReduceEnd 10 1.0 5.7962e-03 1.3 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFFetchOpBegin 2 1.0 1.6069e-0434.7 0.00e+00 0.0 1.8e+03
> >> >>>> 4.4e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> SFFetchOpEnd 2 1.0 8.9251e-04 2.6 0.00e+00 0.0 1.8e+03
> >> >>>> 4.4e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> VecSet 302179 1.0 1.3128e+00 2.3 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> VecAssemblyBegin 1 1.0 1.3844e-03 7.3 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> VecAssemblyEnd 1 1.0 3.4710e-05 4.1 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> VecScatterBegin 603945 1.0 2.2874e+01 4.4 0.00e+00 0.0 1.1e+09
> >> >>>> 5.0e+02 1.0e+00 2 0100100 0 2 0100100 0 0
> >> >>>> VecScatterEnd 603944 1.0 8.2651e+01 4.5 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 0
> >> >>>> VecSetRandom 11 1.0 2.7061e-03 3.1 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> EPSSetUp 10 1.0 5.0371e-02 1.1 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 4.0e+01 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> EPSSolve 10 1.0 6.1329e+02 1.0 9.23e+11 1.8 1.1e+09
> >> >>>> 5.0e+02 1.5e+06 99100100100100 99100100100100 150509
> >> >>>> STSetUp 10 1.0 2.5475e-04 2.9 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> STApply 150986 1.0 2.1997e+02 1.3 8.07e+10 1.8 1.1e+09
> >> >>>> 5.0e+02 1.2e+06 31 9100100 80 31 9100100 80 36950
> >> >>>> BVCopy 1791 1.0 5.1953e-03 1.5 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> BVMultVec 301925 1.0 1.5007e+02 3.1 3.31e+11 1.8 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 14 36 0 0 0 14 36 0 0 0 220292
> >> >>>> BVMultInPlace 1801 1.0 8.0080e+00 1.8 1.78e+11 1.8 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 1 19 0 0 0 1 19 0 0 0 2222543
> >> >>>> BVDotVec 301925 1.0 3.2807e+02 1.4 3.33e+11 1.8 0.0e+00
> >> >>>> 0.0e+00 3.0e+05 47 36 0 0 20 47 36 0 0 20 101409
> >> >>>> BVOrthogonalizeV 150996 1.0 4.0292e+02 1.1 6.64e+11 1.8 0.0e+00
> >> >>>> 0.0e+00 3.0e+05 62 72 0 0 20 62 72 0 0 20 164619
> >> >>>> BVScale 150996 1.0 4.1660e-01 3.2 5.27e+08 1.8 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 126494
> >> >>>> BVSetRandom 10 1.0 2.5061e-03 2.9 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DSSolve 1801 1.0 2.0764e+01 1.1 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
> >> >>>> DSVectors 2779 1.0 1.2691e-01 1.1 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> >> >>>> DSOther 1801 1.0 1.2944e+01 1.0 0.00e+00 0.0 0.0e+00
> >> >>>> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
> >> >>>> ------------------------------------------------------------------------------------------------------------------------
> >> >>>>
> >> >>>> Memory usage is given in bytes:
> >> >>>>
> >> >>>> Object Type Creations Destructions Memory
> >> >>>> Descendants' Mem.
> >> >>>> Reports information only for process 0.
> >> >>>>
> >> >>>> --- Event Stage 0: Main Stage
> >> >>>>
> >> >>>> Container 1 1 584 0.
> >> >>>> Distributed Mesh 6 6 29160 0.
> >> >>>> GraphPartitioner 2 2 1244 0.
> >> >>>> Matrix 1104 1104 136615232 0.
> >> >>>> Index Set 930 930 9125912 0.
> >> >>>> IS L to G Mapping 3 3 2235608 0.
> >> >>>> Section 28 26 18720 0.
> >> >>>> Star Forest Graph 30 30 25632 0.
> >> >>>> Discrete System 6 6 5616 0.
> >> >>>> PetscRandom 11 11 7194 0.
> >> >>>> Vector 604372 604372 8204816368 0.
> >> >>>> Vec Scatter 203 203 272192 0.
> >> >>>> Viewer 21 10 8480 0.
> >> >>>> EPS Solver 10 10 86360 0.
> >> >>>> Spectral Transform 10 10 8400 0.
> >> >>>> Basis Vectors 10 10 530848 0.
> >> >>>> Region 10 10 6800 0.
> >> >>>> Direct Solver 10 10 9838880 0.
> >> >>>> Krylov Solver 10 10 13920 0.
> >> >>>> Preconditioner 10 10 10080 0.
> >> >>>> ========================================================================================================================
> >> >>>> Average time to get PetscTime(): 3.49944e-08
> >> >>>> Average time for MPI_Barrier(): 5.842e-06
> >> >>>> Average time for zero size MPI_Send(): 8.72551e-06
> >> >>>> #PETSc Option Table entries:
> >> >>>> -config=benchmark3.json
> >> >>>> -log_view
> >> >>>> #End of PETSc Option Table entries
> >> >>>> Compiled without FORTRAN kernels
> >> >>>> Compiled with full precision matrices (default)
> >> >>>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> >> >>>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
> >> >>>> Configure options:
> >> >>>> --prefix=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
> >> >>>> --with-ssl=0 --download-c2html=0 --download-sowing=0
> >> >>>> --download-hwloc=0 CFLAGS="-ftree-vectorize -O2 -march=core-avx2
> >> >>>> -fPIC -mavx2" FFLAGS= CXXFLAGS="-ftree-vectorize -O2 -march=core-avx2
> >> >>>> -fPIC -mavx2"
> >> >>>> --with-cc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
> >> >>>>
> >> >>>> --with-cxx=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpic++
> >> >>>>
> >> >>>> --with-fc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
> >> >>>> --with-precision=double --with-scalar-type=real
> >> >>>> --with-shared-libraries=1 --with-debugging=0 --with-64-bit-indices=0
> >> >>>> COPTFLAGS= FOPTFLAGS= CXXOPTFLAGS=
> >> >>>> --with-blaslapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib/libopenblas.so
> >> >>>> --with-x=0 --with-cxx-dialect=C++11 --with-boost=1
> >> >>>> --with-clanguage=C
> >> >>>> --with-scalapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib/libscalapack.so
> >> >>>> --with-scalapack=1 --with-metis=1
> >> >>>> --with-metis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk
> >> >>>> --with-hdf5=1
> >> >>>> --with-hdf5-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5
> >> >>>> --with-hypre=1
> >> >>>> --with-hypre-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne
> >> >>>> --with-parmetis=1
> >> >>>> --with-parmetis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4
> >> >>>> --with-mumps=1
> >> >>>> --with-mumps-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b
> >> >>>> --with-trilinos=1
> >> >>>> --with-trilinos-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo
> >> >>>> --with-fftw=0 --with-cxx-dialect=C++11
> >> >>>> --with-superlu_dist-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include
> >> >>>>
> >> >>>> --with-superlu_dist-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib/libsuperlu_dist.a
> >> >>>> --with-superlu_dist=1
> >> >>>> --with-suitesparse-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include
> >> >>>>
> >> >>>> --with-suitesparse-lib="/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libumfpack.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libklu.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcholmod.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libbtf.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libccolamd.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcolamd.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcamd.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libamd.so
> >> >>>>
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libsuitesparseconfig.so
> >> >>>> /lib64/librt.so" --with-suitesparse=1
> >> >>>> --with-zlib-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include
> >> >>>>
> >> >>>> --with-zlib-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib/libz.so
> >> >>>> --with-zlib=1
> >> >>>> -----------------------------------------
> >> >>>> Libraries compiled on 2020-01-22 15:21:53 on eu-c7-051-02
> >> >>>> Machine characteristics:
> >> >>>> Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core
> >> >>>> Using PETSc directory:
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
> >> >>>> Using PETSc arch:
> >> >>>> -----------------------------------------
> >> >>>>
> >> >>>> Using C compiler:
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
> >> >>>> -ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2
> >> >>>> Using Fortran compiler:
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
> >> >>>>
> >> >>>> -----------------------------------------
> >> >>>>
> >> >>>> Using include paths:
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/include
> >> >>>>
> >> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include
> >> >>>> -----------------------------------------
> >> >>>>
> >> >>>> Using C linker:
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
> >> >>>> Using Fortran linker:
> >> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
> >> >>>> Using libraries:
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib
> >> >>>> -lpetsc
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib
> >> >>>> /lib64/librt.so
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib
> >> >>>> -Wl,-rpath,/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib
> >> >>>> -L/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib:/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64
> >> >>>>
> >> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib
> >> >>>>
> >> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib
> >> >>>> -lmuelu-adapters -lmuelu-interface -lmuelu -lstratimikos
> >> >>>> -lstratimikosbelos -lstratimikosaztecoo -lstratimikosamesos
> >> >>>> -lstratimikosml -lstratimikosifpack -lModeLaplace -lanasaziepetra
> >> >>>> -lanasazi -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes
> >> >>>> -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs
> >> >>>> -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss
> >> >>>> -lnemesis -lexoIIv2for32 -lexodus_for -lexodus -lmapvarlib
> >> >>>> -lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco
> >> >>>> -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg
> >> >>>> -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32
> >> >>>> -lexodus_for -lexodus -lbelosxpetra -lbelosepetra -lbelos -lml
> >> >>>> -lifpack -lpamgen_extras -lpamgen -lamesos -lgaleri-xpetra
> >> >>>> -lgaleri-epetra -laztecoo -lisorropia -lxpetra-sup -lxpetra
> >> >>>> -lthyraepetraext -lthyraepetra -lthyracore -lthyraepetraext
> >> >>>> -lthyraepetra -lthyracore -lepetraext -ltrilinosss -ltriutils
> >> >>>> -lzoltan -lepetra -lsacado -lrtop -lkokkoskernels -lteuchoskokkoscomm
> >> >>>> -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics
> >> >>>> -lteuchoscomm -lteuchosparameterlist -lteuchosparser -lteuchoscore
> >> >>>> -lteuchoskokkoscomm -lteuchoskokkoscompat -lteuchosremainder
> >> >>>> -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist
> >> >>>> -lteuchosparser -lteuchoscore -lkokkosalgorithms -lkokkoscontainers
> >> >>>> -lkokkoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore
> >> >>>> -lgtest -lpthread -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common
> >> >>>> -lpord -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd
> >> >>>> -lcamd -lamd -lsuitesparseconfig -lsuperlu_dist -lHYPRE -lopenblas
> >> >>>> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis
> >> >>>> -lm -lz -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr
> >> >>>> -lmpi_mpifh -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath
> >> >>>> -lpthread -lstdc++ -ldl
> >> >>>> -----------------------------------------
> >> >>>>
> >> >>>>
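As a sanity check on how to read the log above, the "Total Mflop/s" column can be recomputed from the summary figures. The sketch below uses only numbers printed in the log. Note that the reported values match a scale factor of 1e-6 (flop to Mflop), even though the legend prints it as "10e-6"; that reading is an inference from the numbers, not something stated in the thread.

```python
# Figures copied from the -log_view output above.
total_flop = 9.231e13        # "Flop" row, Total column
max_time = 6.209e2           # "Time (sec)" row, Max column
stage_flop = 9.2309e13       # Main Stage flop (EPSSolve accounts for 100% of it)
eps_solve_time = 6.1329e2    # EPSSolve, Max time
reported_eps_mflops = 150509 # EPSSolve, Total Mflop/s column

# Mflop/s = 1e-6 * (sum of flop over all processes) / (max time over all processes)
overall_mflops = 1e-6 * total_flop / max_time
eps_mflops = 1e-6 * stage_flop / eps_solve_time

print(f"overall : {overall_mflops:,.0f} Mflop/s (log: Flop/sec total 1.487e+11)")
print(f"EPSSolve: {eps_mflops:,.0f} Mflop/s (log: {reported_eps_mflops:,})")
```

Both recomputed values agree with the log to within the rounding of the printed inputs.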
> >> >>>>
> >> >>>> --
> >> >>>> What most experimenters take for granted before they begin their
> >> >>>> experiments is infinitely more interesting than any results to which
> >> >>>> their experiments lead.
> >> >>>> -- Norbert Wiener
> >> >>>>
> >> >>>> https://www.cse.buffalo.edu/~knepley/
> >> >>>