Hi Jed, Hi Jose,

Thank you very much for your suggestions.

- I tried reducing the subspace to 64, which indeed reduced the runtime by
  around 20 percent (sometimes more) for 128 cores. I will check what the
  effect on the sequential runtime is.
- Regarding MatNest, I can just look for the eigenvalues of a submatrix to
  see how the speedup is affected; I will check that. Replacing the full
  MatNest with a contiguous matrix is definitely more work but, if it
  improves the performance, worth the effort (we assume that the program
  will be reused a lot).
- PETSc is configured with MUMPS, OpenBLAS and ScaLAPACK (among others),
  but I noticed no significant difference compared to when PETSc is
  configured without them.
- The number of iterations required by the solver does not depend on the
  number of cores.

Best regards and many thanks,
Andreas Walker

> On 01.05.2020 at 14:12, Jed Brown <[email protected]> wrote:
>
> "Jose E. Roman" <[email protected]> writes:
>
>> Comments related to PETSc:
>>
>> - If you look at the "Reduct" column you will see that MatMult() is doing a
>> lot of global reductions, which is bad for scaling. This is due to MATNEST
>> (other Mat types do not do that). I don't know the details of MATNEST, maybe
>> Matt can comment on this.
>
> It is not intrinsic to MatNest, though use of MatNest incurs extra
> VecScatter costs. If you use MatNest without VecNest, then
> VecGetSubVector incurs significant cost (including reductions). I
> suspect it's likely that some SLEPc functionality is not available with
> VecNest. A better option would be to optimize VecGetSubVector by
> caching the IS and subvector, at least in the contiguous case.
>
> How difficult would it be for you to run with a monolithic matrix
> instead of MatNest? It would certainly be better at amortizing
> communication costs.
>
>>
>> Comments related to SLEPc:
>>
>> - The last rows (DSSolve, DSVectors, DSOther) correspond to "sequential"
>> computations. In your case they take a non-negligible time (around 30
>> seconds). You can try to reduce this time by reducing the size of the
>> projected problem, e.g. running with -eps_nev 100 -eps_mpd 64 (see
>> https://slepc.upv.es/documentation/current/docs/manualpages/EPS/EPSSetDimensions.html
>> )
>>
>> - In my previous comment about multithreaded BLAS, I was referring to
>> configuring PETSc with MKL, OpenBLAS or similar. But anyway, I don't think
>> this is relevant here.
>>
>> - Regarding the number of iterations, yes the number of iterations should be
>> the same for different runs if you keep the same number of processes, but
>> when you change the number of processes there might be significant
>> differences for some problems; that is the rationale of my suggestion.
>> Anyway, in your case the fluctuation does not seem very important.
>>
>> Jose
>>
>>
>>> On 1 May 2020, at 10:07, Walker Andreas <[email protected]> wrote:
>>>
>>> Hi Matthew,
>>>
>>> I just ran the same program on a single core. You can see the output of
>>> -log_view below. As I see it, most functions have speedups of around 50 for
>>> 128 cores, also functions like MatMult etc.
>>>
>>> Best regards,
>>>
>>> Andreas
>>>
>>> ************************************************************************************************************************
>>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript -r -fCourier9' to print this document ***
>>> ************************************************************************************************************************
>>>
>>> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>>>
>>> ./Solver on a named eu-a6-011-09 with 1 processor, by awalker Fri May 1 04:03:07 2020
>>> Using Petsc Release Version 3.10.5, Mar, 28, 2019
>>>
>>>                          Max       Max/Min     Avg       Total
>>> Time (sec):           3.092e+04     1.000   3.092e+04
>>> Objects:              6.099e+05     1.000   6.099e+05
>>> Flop:                 9.313e+13     1.000   9.313e+13  9.313e+13
>>> Flop/sec:             3.012e+09     1.000   3.012e+09  3.012e+09
>>> MPI Messages:         0.000e+00     0.000   0.000e+00  0.000e+00
>>> MPI Message Lengths:  0.000e+00     0.000   0.000e+00  0.000e+00
>>> MPI Reductions:       0.000e+00     0.000
>>>
>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>>>                             e.g., VecAXPY() for real vectors of length N --> 2N flop
>>>                             and VecAXPY() for complex vectors of length N --> 8N flop
>>>
>>> Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
>>>                         Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
>>>  0:      Main Stage: 3.0925e+04 100.0%  9.3134e+13 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
>>>
>>> ------------------------------------------------------------------------------------------------------------------------
>>> See the 'Profiling' chapter of the users' manual for details on interpreting output.
>>> Phase summary info:
>>>    Count: number of times phase was executed
>>>    Time and Flop: Max - maximum over all processors
>>>                   Ratio - ratio of maximum to minimum over all processors
>>>    Mess: number of messages sent
>>>    AvgLen: average message length (bytes)
>>>    Reduct: number of global reductions
>>>    Global: entire computation
>>>    Stage: stages of a computation.
Set stages with PetscLogStagePush() and >>> PetscLogStagePop(). >>> %T - percent time in this phase %F - percent flop in this phase >>> %M - percent messages in this phase %L - percent message lengths >>> in this phase >>> %R - percent reductions in this phase >>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over >>> all processors) >>> ------------------------------------------------------------------------------------------------------------------------ >>> Event Count Time (sec) Flop >>> --- Global --- --- Stage ---- Total >>> Max Ratio Max Ratio Max Ratio Mess AvgLen >>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> --- Event Stage 0: Main Stage >>> >>> MatMult 152338 1.0 8.2799e+03 1.0 8.20e+12 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 27 9 0 0 0 27 9 0 0 0 990 >>> MatMultAdd 609352 1.0 8.1229e+03 1.0 8.20e+12 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 26 9 0 0 0 26 9 0 0 0 1010 >>> MatConvert 30 1.0 1.5797e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatScale 10 1.0 4.7172e-02 1.0 6.73e+07 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1426 >>> MatAssemblyBegin 516 1.0 2.0695e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAssemblyEnd 516 1.0 2.8933e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatZeroEntries 2 1.0 3.6038e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatView 10 1.0 2.4422e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatAXPY 40 1.0 3.1595e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMatMult 60 1.0 1.3723e+01 1.0 1.24e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 90 >>> MatMatMultSym 100 1.0 1.3651e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMatMultNum 100 1.0 7.5159e+00 1.0 2.06e+09 1.0 
0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 274 >>> MatMatMatMult 40 1.0 1.8674e+01 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 89 >>> MatMatMatMultSym 40 1.0 1.1848e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> MatMatMatMultNum 40 1.0 6.8266e+00 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 243 >>> MatPtAP 40 1.0 1.9042e+01 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 87 >>> MatTrnMatMult 40 1.0 7.7990e+00 1.0 8.24e+08 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 106 >>> DMPlexStratify 1 1.0 5.1223e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DMPlexPrealloc 2 1.0 1.5242e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSet 914053 1.0 1.4929e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAssemblyBegin 1 1.0 1.3411e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecAssemblyEnd 1 1.0 8.0094e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecScatterBegin 1 1.0 2.6399e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> VecSetRandom 10 1.0 8.6088e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> EPSSetUp 10 1.0 2.9988e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> EPSSolve 10 1.0 2.8695e+04 1.0 9.31e+13 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 93100 0 0 0 93100 0 0 0 3246 >>> STSetUp 10 1.0 9.7291e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> STApply 152338 1.0 8.2803e+03 1.0 8.20e+12 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 27 9 0 0 0 27 9 0 0 0 990 >>> BVCopy 1814 1.0 1.1076e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> BVMultVec 304639 1.0 9.8281e+03 1.0 3.34e+13 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 32 36 0 0 0 32 36 0 0 0 3397 >>> BVMultInPlace 1824 1.0 7.0999e+02 1.0 1.79e+13 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 2 19 0 0 0 
2 19 0 0 0 25213 >>> BVDotVec 304639 1.0 9.8037e+03 1.0 3.36e+13 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 32 36 0 0 0 32 36 0 0 0 3427 >>> BVOrthogonalizeV 152348 1.0 1.9633e+04 1.0 6.70e+13 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 63 72 0 0 0 63 72 0 0 0 3411 >>> BVScale 152348 1.0 3.7888e+01 1.0 5.32e+10 1.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 1403 >>> BVSetRandom 10 1.0 8.6364e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DSSolve 1824 1.0 1.7363e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DSVectors 2797 1.0 1.2353e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> DSOther 1824 1.0 9.8627e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>> ------------------------------------------------------------------------------------------------------------------------ >>> >>> Memory usage is given in bytes: >>> >>> Object Type Creations Destructions Memory Descendants' Mem. >>> Reports information only for process 0. >>> >>> --- Event Stage 0: Main Stage >>> >>> Container 1 1 584 0. >>> Distributed Mesh 1 1 5184 0. >>> GraphPartitioner 1 1 624 0. >>> Matrix 320 320 3469402576 0. >>> Index Set 53 53 2777932 0. >>> IS L to G Mapping 1 1 249320 0. >>> Section 13 11 7920 0. >>> Star Forest Graph 6 6 4896 0. >>> Discrete System 1 1 936 0. >>> Vector 609405 609405 857220847896 0. >>> Vec Scatter 1 1 704 0. >>> Viewer 22 11 9328 0. >>> EPS Solver 10 10 86360 0. >>> Spectral Transform 10 10 8400 0. >>> Basis Vectors 10 10 530336 0. >>> PetscRandom 10 10 6540 0. >>> Region 10 10 6800 0. >>> Direct Solver 10 10 9838880 0. >>> Krylov Solver 10 10 13920 0. >>> Preconditioner 10 10 10080 0. 
>>> ======================================================================================================================== >>> Average time to get PetscTime(): 2.50991e-08 >>> #PETSc Option Table entries: >>> -config=benchmark3.json >>> -eps_converged_reason >>> -log_view >>> #End of PETSc Option Table entries >>> Compiled without FORTRAN kernels >>> Compiled with full precision matrices (default) >>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 >>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4 >>> Configure options: >>> --prefix=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit >>> --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0 >>> CFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" FFLAGS= >>> CXXFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" >>> --with-cc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc >>> >>> --with-cxx=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpic++ >>> >>> --with-fc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 >>> --with-precision=double --with-scalar-type=real --with-shared-libraries=1 >>> --with-debugging=0 --with-64-bit-indices=0 COPTFLAGS= FOPTFLAGS= >>> CXXOPTFLAGS= >>> --with-blaslapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib/libopenblas.so >>> --with-x=0 --with-cxx-dialect=C++11 --with-boost=1 --with-clanguage=C >>> --with-scalapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib/libscalapack.so >>> --with-scalapack=1 --with-metis=1 >>> --with-metis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk >>> --with-hdf5=1 >>> 
--with-hdf5-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5 >>> --with-hypre=1 >>> --with-hypre-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne >>> --with-parmetis=1 >>> --with-parmetis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4 >>> --with-mumps=1 >>> --with-mumps-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b >>> --with-trilinos=1 >>> --with-trilinos-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo >>> --with-fftw=0 --with-cxx-dialect=C++11 >>> --with-superlu_dist-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include >>> >>> --with-superlu_dist-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib/libsuperlu_dist.a >>> --with-superlu_dist=1 >>> --with-suitesparse-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include >>> >>> --with-suitesparse-lib="/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libumfpack.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libklu.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcholmod.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libbtf.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libccolamd.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcolamd.so >>> >>> 
/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcamd.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libamd.so >>> >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libsuitesparseconfig.so >>> /lib64/librt.so" --with-suitesparse=1 >>> --with-zlib-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include >>> >>> --with-zlib-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib/libz.so >>> --with-zlib=1 >>> ----------------------------------------- >>> Libraries compiled on 2020-01-22 15:21:53 on eu-c7-051-02 >>> Machine characteristics: >>> Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core >>> Using PETSc directory: >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit >>> Using PETSc arch: >>> ----------------------------------------- >>> >>> Using C compiler: >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc >>> -ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2 >>> Using Fortran compiler: >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 >>> >>> ----------------------------------------- >>> >>> Using include paths: >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include >>> >>> 
-I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/include >>> >>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include >>> ----------------------------------------- >>> >>> Using C linker: >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc >>> Using Fortran linker: >>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 >>> Using libraries: >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib >>> -lpetsc >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib >>> >>> 
-Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib >>> /lib64/librt.so >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib >>> >>> 
-Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib >>> -Wl,-rpath,/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib >>> -L/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib:/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 >>> >>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib >>> >>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib >>> -lmuelu-adapters -lmuelu-interface -lmuelu -lstratimikos >>> -lstratimikosbelos -lstratimikosaztecoo -lstratimikosamesos -lstratimikosml >>> -lstratimikosifpack -lModeLaplace -lanasaziepetra -lanasazi -lmapvarlib >>> -lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco >>> -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac >>> -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus >>> -lmapvarlib 
-lsuplib_cpp -lsuplib_c -lsuplib -lsupes -laprepro_lib -lchaco >>> -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn -lIovs -lIopg -lIoexo_fac >>> -lIopx -lIofx -lIoex -lIoss -lnemesis -lexoIIv2for32 -lexodus_for -lexodus >>> -lbelosxpetra -lbelosepetra -lbelos -lml -lifpack -lpamgen_extras -lpamgen >>> -lamesos -lgaleri-xpetra -lgaleri-epetra -laztecoo -lisorropia -lxpetra-sup >>> -lxpetra -lthyraepetraext -lthyraepetra -lthyracore -lthyraepetraext >>> -lthyraepetra -lthyracore -lepetraext -ltrilinosss -ltriutils -lzoltan >>> -lepetra -lsacado -lrtop -lkokkoskernels -lteuchoskokkoscomm >>> -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics -lteuchoscomm >>> -lteuchosparameterlist -lteuchosparser -lteuchoscore -lteuchoskokkoscomm >>> -lteuchoskokkoscompat -lteuchosremainder -lteuchosnumerics -lteuchoscomm >>> -lteuchosparameterlist -lteuchosparser -lteuchoscore -lkokkosalgorithms >>> -lkokkoscontainers -lkokkoscore -lkokkosalgorithms -lkokkoscontainers >>> -lkokkoscore -lgtest -lpthread -lcmumps -ldmumps -lsmumps -lzmumps >>> -lmumps_common -lpord -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd >>> -lcolamd -lcamd -lamd -lsuitesparseconfig -lsuperlu_dist -lHYPRE -lopenblas >>> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis -lm -lz >>> -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi >>> -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ -ldl >>> ----------------------------------------- >>> >>> >>>> Am 30.04.2020 um 17:14 schrieb Matthew Knepley <[email protected]>: >>>> >>>> On Thu, Apr 30, 2020 at 10:55 AM Walker Andreas <[email protected]> >>>> wrote: >>>> Hello everyone, >>>> >>>> I have used SLEPc successfully on a FEM-related project. Even though it is >>>> very powerful overall, the speedup I measure is a bit below my >>>> expectations. 
>>>> Compared to using a single core, the speedup is for example
>>>> around 1.8 for two cores but only maybe 50-60 for 128 cores and maybe 70
>>>> or 80 for 256 cores. Some details about my problem:
>>>>
>>>> - The problem is based on meshes with up to 400k degrees of freedom.
>>>> DMPlex is used for organizing it.
>>>> - ParMetis is used to partition the mesh. This yields a stiffness matrix
>>>> where the vast majority of entries are in the diagonal blocks (i.e. looking
>>>> at the rows owned by a core, there is a very dense square-shaped region
>>>> around the diagonal and some loosely scattered nonzeros in the other
>>>> columns).
>>>> - The actual matrix from which I need eigenvalues is a 2x2 block matrix,
>>>> saved as a MATNEST matrix. Each of these four matrices is computed based
>>>> on the stiffness matrix and has a similar size and nonzero pattern. For a
>>>> mesh of 200k dofs, one such matrix has a size of about 174k x 174k and on
>>>> average about 40 nonzeros per row.
>>>> - I use the default Krylov-Schur solver and look for the 100 smallest
>>>> eigenvalues.
>>>> - The output of -log_view for the 200k-dof mesh described above, run on
>>>> 128 cores, is at the end of this mail.
>>>>
>>>> I noticed that the problem matrices are not perfectly balanced, i.e. the
>>>> number of rows per core might vary between 2500 and 3000, for example. But
>>>> I am not sure if this is the main reason for the poor speedup.
>>>>
>>>> I tried to reduce the subspace size but without effect. I also attempted
>>>> to use the shift-and-invert spectral transformation but the MATNEST type
>>>> prevents this.
>>>>
>>>> Are there any suggestions to improve the speedup further or is this the
>>>> maximum speedup that I can expect?
>>>>
>>>> Can you also give us the performance for this problem on one node using
>>>> the same number of cores per node? Then we can calculate speedup
>>>> and look at which functions are not speeding up.
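A minimal sketch of the per-event speedup comparison asked for above, assuming the Max times are copied by hand from the single-core and 128-core -log_view outputs in this thread. The names `TIMES`, `NPROCS`, `speedup` and `report` are illustrative only and are not part of PETSc or SLEPc; note also that the single-core run is not the "one node, same cores per node" baseline that was requested, so the ratios are only indicative.

```python
# Max times in seconds, copied from the two -log_view outputs in this thread.
# Layout: event -> (time on 1 core, time on 128 cores).
TIMES = {
    "Total": (3.092e+04, 6.209e+02),
    "EPSSolve": (2.8695e+04, 6.1329e+02),
    "MatMult": (8.2799e+03, 2.1963e+02),
    "BVMultVec": (9.8281e+03, 1.5007e+02),
    "BVDotVec": (9.8037e+03, 3.2807e+02),
    "BVOrthogonalizeV": (1.9633e+04, 4.0292e+02),
    "DSSolve": (1.7363e+01, 2.0764e+01),
}
NPROCS = 128


def speedup(t_serial, t_parallel):
    """Classical speedup: serial time divided by parallel time."""
    return t_serial / t_parallel


def report(times, nprocs):
    """Return {event: (speedup, parallel efficiency)} for every event."""
    rows = {}
    for event, (t1, tp) in times.items():
        s = speedup(t1, tp)
        rows[event] = (s, s / nprocs)
    return rows


if __name__ == "__main__":
    for event, (s, eff) in report(TIMES, NPROCS).items():
        print(f"{event:18s} speedup {s:7.2f}   efficiency {eff:6.1%}")
```

With these inputs, DSSolve comes out with a speedup below 1 (it is a sequential computation and does not benefit from more cores), and BVDotVec, which performs global reductions in the 128-core log, scales noticeably worse than BVMultVec.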
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>> Thanks a lot in advance,
>>>>
>>>> Andreas Walker
>>>>
>>>> m&m group
>>>> D-MAVT
>>>> ETH Zurich
>>>>
>>>> ************************************************************************************************************************
>>>> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
>>>> ************************************************************************************************************************
>>>>
>>>> ---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
>>>>
>>>> ./Solver on a named eu-g1-050-2 with 128 processors, by awalker Thu Apr 30 15:50:22 2020
>>>> Using Petsc Release Version 3.10.5, Mar, 28, 2019
>>>>
>>>>                          Max       Max/Min     Avg       Total
>>>> Time (sec):           6.209e+02     1.000   6.209e+02
>>>> Objects:              6.068e+05     1.001   6.063e+05
>>>> Flop:                 9.230e+11     1.816   7.212e+11  9.231e+13
>>>> Flop/sec:             1.487e+09     1.816   1.161e+09  1.487e+11
>>>> MPI Messages:         1.451e+07     2.999   8.265e+06  1.058e+09
>>>> MPI Message Lengths:  6.062e+09     2.011   5.029e+02  5.321e+11
>>>> MPI Reductions:       1.512e+06     1.000
>>>>
>>>> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
>>>>                             e.g., VecAXPY() for real vectors of length N --> 2N flop
>>>>                             and VecAXPY() for complex vectors of length N --> 8N flop
>>>>
>>>> Summary of Stages:   ----- Time ------  ----- Flop ------  --- Messages ---  -- Message Lengths --  -- Reductions --
>>>>                         Avg     %Total     Avg     %Total    Count   %Total     Avg         %Total    Count   %Total
>>>>  0:      Main Stage: 6.2090e+02 100.0%  9.2309e+13 100.0%  1.058e+09 100.0%  5.029e+02      100.0%  1.512e+06 100.0%
>>>>
>>>> ------------------------------------------------------------------------------------------------------------------------
>>>> See the 'Profiling' chapter of the users' manual for details on interpreting output.
>>>> Phase summary info: >>>> Count: number of times phase was executed >>>> Time and Flop: Max - maximum over all processors >>>> Ratio - ratio of maximum to minimum over all processors >>>> Mess: number of messages sent >>>> AvgLen: average message length (bytes) >>>> Reduct: number of global reductions >>>> Global: entire computation >>>> Stage: stages of a computation. Set stages with PetscLogStagePush() and >>>> PetscLogStagePop(). >>>> %T - percent time in this phase %F - percent flop in this >>>> phase >>>> %M - percent messages in this phase %L - percent message lengths >>>> in this phase >>>> %R - percent reductions in this phase >>>> Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over >>>> all processors) >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> Event Count Time (sec) Flop >>>> --- Global --- --- Stage ---- Total >>>> Max Ratio Max Ratio Max Ratio Mess AvgLen >>>> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >>>> ------------------------------------------------------------------------------------------------------------------------ >>>> >>>> --- Event Stage 0: Main Stage >>>> >>>> BuildTwoSided 20 1.0 2.3249e-01 2.2 0.00e+00 0.0 2.2e+04 4.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> BuildTwoSidedF 317 1.0 8.5016e-01 4.8 0.00e+00 0.0 2.1e+04 1.4e+04 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatMult 150986 1.0 2.1963e+02 1.3 8.07e+10 1.8 1.1e+09 5.0e+02 >>>> 1.2e+06 31 9100100 80 31 9100100 80 37007 >>>> MatMultAdd 603944 1.0 1.6209e+02 1.4 8.07e+10 1.8 1.1e+09 5.0e+02 >>>> 0.0e+00 23 9100100 0 23 9100100 0 50145 >>>> MatConvert 30 1.0 1.6488e-02 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatScale 10 1.0 1.0347e-03 3.9 6.68e+05 1.8 0.0e+00 0.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 65036 >>>> MatAssemblyBegin 916 1.0 8.6715e-01 1.4 0.00e+00 0.0 2.1e+04 1.4e+04 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatAssemblyEnd 916 1.0 
2.0682e-01 1.1 0.00e+00 0.0 4.7e+05 1.3e+02 >>>> 1.5e+03 0 0 0 0 0 0 0 0 0 0 0 >>>> MatZeroEntries 42 1.0 7.2787e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatView 10 1.0 1.4816e+00 1.0 0.00e+00 0.0 6.4e+03 1.3e+05 >>>> 3.0e+01 0 0 0 0 0 0 0 0 0 0 0 >>>> MatAXPY 40 1.0 1.0752e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatTranspose 80 1.0 3.0198e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatMatMult 60 1.0 3.0391e-01 1.0 7.82e+06 1.6 3.8e+05 2.8e+02 >>>> 7.8e+02 0 0 0 0 0 0 0 0 0 0 2711 >>>> MatMatMultSym 60 1.0 2.4238e-01 1.0 0.00e+00 0.0 3.3e+05 2.4e+02 >>>> 7.2e+02 0 0 0 0 0 0 0 0 0 0 0 >>>> MatMatMultNum 60 1.0 5.8508e-02 1.0 7.82e+06 1.6 4.7e+04 5.7e+02 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 14084 >>>> MatPtAP 40 1.0 4.5617e-01 1.0 1.59e+07 1.6 3.3e+05 1.0e+03 >>>> 6.4e+02 0 0 0 0 0 0 0 0 0 0 3649 >>>> MatPtAPSymbolic 40 1.0 2.6002e-01 1.0 0.00e+00 0.0 1.7e+05 6.5e+02 >>>> 2.8e+02 0 0 0 0 0 0 0 0 0 0 0 >>>> MatPtAPNumeric 40 1.0 1.9293e-01 1.0 1.59e+07 1.6 1.5e+05 1.5e+03 >>>> 3.2e+02 0 0 0 0 0 0 0 0 0 0 8629 >>>> MatTrnMatMult 40 1.0 2.3801e-01 1.0 6.09e+06 1.8 1.8e+05 1.0e+03 >>>> 6.4e+02 0 0 0 0 0 0 0 0 0 0 2442 >>>> MatTrnMatMultSym 40 1.0 1.6962e-01 1.0 0.00e+00 0.0 1.7e+05 4.4e+02 >>>> 6.4e+02 0 0 0 0 0 0 0 0 0 0 0 >>>> MatTrnMatMultNum 40 1.0 6.9000e-02 1.0 6.09e+06 1.8 9.7e+03 1.1e+04 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 8425 >>>> MatGetLocalMat 240 1.0 4.9149e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatGetBrAoCol 160 1.0 2.0470e-02 1.6 0.00e+00 0.0 3.3e+05 4.1e+02 >>>> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> MatTranspose_SeqAIJ_FAST 80 1.0 2.9940e-03 1.4 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> Mesh Partition 1 1.0 1.4825e+00 1.0 0.00e+00 0.0 9.8e+04 6.9e+01 >>>> 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> Mesh Migration 1 1.0 3.6680e-02 1.0 0.00e+00 0.0 1.5e+03 1.4e+04 >>>> 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 >>>> 
>>>> DMPlexDistribute               1   1.0 1.5269e+00   1.0 0.00e+00   0.0 1.0e+05 3.5e+02 1.2e+01   0   0   0   0   0    0   0   0   0   0       0
>>>> DMPlexDistCones                1   1.0 1.8845e-02   1.2 0.00e+00   0.0 1.0e+03 1.7e+04 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> DMPlexDistLabels               1   1.0 9.7280e-04   1.2 0.00e+00   0.0 0.0e+00 0.0e+00 3.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> DMPlexDistData                 1   1.0 3.1499e-01   1.4 0.00e+00   0.0 9.8e+04 4.3e+01 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> DMPlexStratify                 2   1.0 9.3421e-02   1.8 0.00e+00   0.0 0.0e+00 0.0e+00 2.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> DMPlexPrealloc                 2   1.0 3.5980e-02   1.0 0.00e+00   0.0 4.0e+04 1.8e+03 3.0e+01   0   0   0   0   0    0   0   0   0   0       0
>>>> SFSetGraph                    20   1.0 1.6069e-05   2.0 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFSetUp                       20   1.0 2.8043e-01   1.9 0.00e+00   0.0 6.7e+04 5.0e+02 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFBcastBegin                  25   1.0 3.9653e-02   2.5 0.00e+00   0.0 6.1e+04 4.9e+02 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFBcastEnd                    25   1.0 9.0128e-02   1.6 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFReduceBegin                 10   1.0 4.3473e-04   5.5 0.00e+00   0.0 7.4e+03 4.0e+03 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFReduceEnd                   10   1.0 5.7962e-03   1.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFFetchOpBegin                 2   1.0 1.6069e-04  34.7 0.00e+00   0.0 1.8e+03 4.4e+03 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> SFFetchOpEnd                   2   1.0 8.9251e-04   2.6 0.00e+00   0.0 1.8e+03 4.4e+03 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> VecSet                    302179   1.0 1.3128e+00   2.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> VecAssemblyBegin               1   1.0 1.3844e-03   7.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> VecAssemblyEnd                 1   1.0 3.4710e-05   4.1 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> VecScatterBegin           603945   1.0 2.2874e+01   4.4 0.00e+00   0.0 1.1e+09 5.0e+02 1.0e+00   2   0 100 100   0    2   0 100 100   0       0
>>>> VecScatterEnd             603944   1.0 8.2651e+01   4.5 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   7   0   0   0   0    7   0   0   0   0       0
>>>> VecSetRandom                  11   1.0 2.7061e-03   3.1 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> EPSSetUp                      10   1.0 5.0371e-02   1.1 0.00e+00   0.0 0.0e+00 0.0e+00 4.0e+01   0   0   0   0   0    0   0   0   0   0       0
>>>> EPSSolve                      10   1.0 6.1329e+02   1.0 9.23e+11   1.8 1.1e+09 5.0e+02 1.5e+06  99 100 100 100 100   99 100 100 100 100  150509
>>>> STSetUp                       10   1.0 2.5475e-04   2.9 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> STApply                   150986   1.0 2.1997e+02   1.3 8.07e+10   1.8 1.1e+09 5.0e+02 1.2e+06  31   9 100 100  80   31   9 100 100  80   36950
>>>> BVCopy                      1791   1.0 5.1953e-03   1.5 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> BVMultVec                 301925   1.0 1.5007e+02   3.1 3.31e+11   1.8 0.0e+00 0.0e+00 0.0e+00  14  36   0   0   0   14  36   0   0   0  220292
>>>> BVMultInPlace               1801   1.0 8.0080e+00   1.8 1.78e+11   1.8 0.0e+00 0.0e+00 0.0e+00   1  19   0   0   0    1  19   0   0   0 2222543
>>>> BVDotVec                  301925   1.0 3.2807e+02   1.4 3.33e+11   1.8 0.0e+00 0.0e+00 3.0e+05  47  36   0   0  20   47  36   0   0  20  101409
>>>> BVOrthogonalizeV          150996   1.0 4.0292e+02   1.1 6.64e+11   1.8 0.0e+00 0.0e+00 3.0e+05  62  72   0   0  20   62  72   0   0  20  164619
>>>> BVScale                   150996   1.0 4.1660e-01   3.2 5.27e+08   1.8 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  126494
>>>> BVSetRandom                   10   1.0 2.5061e-03   2.9 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> DSSolve                     1801   1.0 2.0764e+01   1.1 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   3   0   0   0   0    3   0   0   0   0       0
>>>> DSVectors                   2779   1.0 1.2691e-01   1.1 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0       0
>>>> DSOther                     1801   1.0 1.2944e+01   1.0 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00   2   0   0   0   0    2   0   0   0   0       0
>>>> ------------------------------------------------------------------------------------------------------------------------
>>>>
>>>> Memory usage is given in bytes:
>>>>
>>>> Object Type          Creations   Destructions        Memory  Descendants' Mem.
>>>> Reports information only for process 0.
>>>>
>>>> --- Event Stage 0: Main Stage
>>>>
>>>> Container                    1              1           584  0.
>>>> Distributed Mesh             6              6         29160  0.
>>>> GraphPartitioner             2              2          1244  0.
>>>> Matrix                    1104           1104     136615232  0.
>>>> Index Set                  930            930       9125912  0.
>>>> IS L to G Mapping            3              3       2235608  0.
>>>> Section                     28             26         18720  0.
>>>> Star Forest Graph           30             30         25632  0.
>>>> Discrete System              6              6          5616  0.
>>>> PetscRandom                 11             11          7194  0.
>>>> Vector                  604372         604372    8204816368  0.
>>>> Vec Scatter                203            203        272192  0.
>>>> Viewer                      21             10          8480  0.
>>>> EPS Solver                  10             10         86360  0.
>>>> Spectral Transform          10             10          8400  0.
>>>> Basis Vectors               10             10        530848  0.
>>>> Region                      10             10          6800  0.
>>>> Direct Solver               10             10       9838880  0.
>>>> Krylov Solver               10             10         13920  0.
>>>> Preconditioner              10             10         10080  0.
>>>> ========================================================================================================================
>>>> Average time to get PetscTime(): 3.49944e-08
>>>> Average time for MPI_Barrier(): 5.842e-06
>>>> Average time for zero size MPI_Send(): 8.72551e-06
>>>> #PETSc Option Table entries:
>>>> -config=benchmark3.json
>>>> -log_view
>>>> #End of PETSc Option Table entries
>>>> Compiled without FORTRAN kernels
>>>> Compiled with full precision matrices (default)
>>>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
>>>> Configure options:
>>>> --prefix=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit
>>>> --with-ssl=0 --download-c2html=0 --download-sowing=0 --download-hwloc=0
>>>> CFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2" FFLAGS=
>>>> CXXFLAGS="-ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2"
>>>> --with-cc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc
>>>> --with-cxx=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpic++
>>>> --with-fc=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90
>>>> --with-precision=double --with-scalar-type=real
--with-shared-libraries=1 >>>> --with-debugging=0 --with-64-bit-indices=0 COPTFLAGS= FOPTFLAGS= >>>> CXXOPTFLAGS= >>>> --with-blaslapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib/libopenblas.so >>>> --with-x=0 --with-cxx-dialect=C++11 --with-boost=1 --with-clanguage=C >>>> --with-scalapack-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib/libscalapack.so >>>> --with-scalapack=1 --with-metis=1 >>>> --with-metis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk >>>> --with-hdf5=1 >>>> --with-hdf5-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5 >>>> --with-hypre=1 >>>> --with-hypre-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne >>>> --with-parmetis=1 >>>> --with-parmetis-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4 >>>> --with-mumps=1 >>>> --with-mumps-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b >>>> --with-trilinos=1 >>>> --with-trilinos-dir=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo >>>> --with-fftw=0 --with-cxx-dialect=C++11 >>>> --with-superlu_dist-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include >>>> >>>> --with-superlu_dist-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib/libsuperlu_dist.a >>>> --with-superlu_dist=1 >>>> --with-suitesparse-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include >>>> >>>> 
--with-suitesparse-lib="/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libumfpack.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libklu.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcholmod.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libbtf.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libccolamd.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcolamd.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libcamd.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libamd.so >>>> >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib/libsuitesparseconfig.so >>>> /lib64/librt.so" --with-suitesparse=1 >>>> --with-zlib-include=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include >>>> >>>> --with-zlib-lib=/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib/libz.so >>>> --with-zlib=1 >>>> ----------------------------------------- >>>> Libraries compiled on 2020-01-22 15:21:53 on eu-c7-051-02 >>>> Machine characteristics: >>>> Linux-3.10.0-862.14.4.el7.x86_64-x86_64-with-centos-7.5.1804-Core >>>> Using PETSc directory: >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit >>>> Using PETSc arch: >>>> ----------------------------------------- >>>> >>>> Using C compiler: >>>> 
/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc >>>> -ftree-vectorize -O2 -march=core-avx2 -fPIC -mavx2 >>>> Using Fortran compiler: >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 >>>> >>>> ----------------------------------------- >>>> >>>> Using include paths: >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/include >>>> >>>> -I/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/include >>>> ----------------------------------------- >>>> >>>> Using C linker: >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpicc >>>> Using Fortran linker: >>>> /cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/bin/mpif90 >>>> Using libraries: >>>> 
-Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/petsc-3.10.5-3czpbqhprn65yalty4o46knmhytixlit/lib >>>> -lpetsc >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/trilinos-12.14.1-hcdtxkqirqt6wkui3vkie5qse64payqo/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/mumps-5.1.1-36fzslrywwsg7gxnoxbjbzwuz6o74n6b/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/netlib-scalapack-2.0.2-bq6sqixlc4zwxpfrtbu7jt7twhps5ldv/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/suite-sparse-5.1.0-sk4v2rs7dfpese3zgsyigwtv2w66v2gz/lib >>>> /lib64/librt.so >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/superlu-dist-6.1.1-ejpmx43wk4vplnmry5n5njvgqvcvfe6x/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hypre-2.14.0-ly5dmcaty5wx4opqwspvoim6zss6sxne/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openblas-0.2.20-cot3cawsqf4pkxjwzjexaykbwn2ch3ii/lib >>>> >>>> 
-Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hdf5-1.10.1-sbxt5qlg2pojshva2b6kdflsy64i4rs5/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/parmetis-4.0.3-ik3r6faxeb6uzyywppuc2niuvivwiux4/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/metis-5.1.0-bqbfmcvyqigdaeetkg6fuhdh4eplu3fk/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/zlib-1.2.11-bu2rglshnlxrwc24334r76jr34jm2fxy/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/hwloc-1.11.9-a436y6rdahnn57u6oe6snwemjhcfmrso/lib >>>> -Wl,-rpath,/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib >>>> -L/cluster/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-6.3.0/openmpi-3.0.1-k6n5k3l3baqlkdw3w7il7dwb6wilr6r6/lib >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib:/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 >>>> >>>> 
-L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib/gcc/x86_64-pc-linux-gnu/6.3.0 >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib64 >>>> >>>> -Wl,-rpath,/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib >>>> >>>> -L/cluster/spack/apps/linux-centos7-x86_64/gcc-4.8.5/gcc-6.3.0-sqhtfh32p5gerbkvi5hih7cfvcpmewvj/lib >>>> -lmuelu-adapters -lmuelu-interface -lmuelu -lstratimikos >>>> -lstratimikosbelos -lstratimikosaztecoo -lstratimikosamesos >>>> -lstratimikosml -lstratimikosifpack -lModeLaplace -lanasaziepetra >>>> -lanasazi -lmapvarlib -lsuplib_cpp -lsuplib_c -lsuplib -lsupes >>>> -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb -lIogs -lIogn >>>> -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss -lnemesis >>>> -lexoIIv2for32 -lexodus_for -lexodus -lmapvarlib -lsuplib_cpp -lsuplib_c >>>> -lsuplib -lsupes -laprepro_lib -lchaco -lio_info_lib -lIonit -lIotr -lIohb >>>> -lIogs -lIogn -lIovs -lIopg -lIoexo_fac -lIopx -lIofx -lIoex -lIoss >>>> -lnemesis -lexoIIv2for32 -lexodus_for -lexodus -lbelosxpetra -lbelosepetra >>>> -lbelos -lml -lifpack -lpamgen_extras -lpamgen -lamesos -lgaleri-xpetra >>>> -lgaleri-epetra -laztecoo -lisorropia -lxpetra-sup -lxpetra >>>> -lthyraepetraext -lthyraepetra -lthyracore -lthyraepetraext -lthyraepetra >>>> -lthyracore -lepetraext -ltrilinosss -ltriutils -lzoltan -lepetra -lsacado >>>> -lrtop -lkokkoskernels -lteuchoskokkoscomm -lteuchoskokkoscompat >>>> -lteuchosremainder -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist >>>> -lteuchosparser -lteuchoscore -lteuchoskokkoscomm -lteuchoskokkoscompat >>>> -lteuchosremainder -lteuchosnumerics -lteuchoscomm -lteuchosparameterlist >>>> -lteuchosparser -lteuchoscore -lkokkosalgorithms -lkokkoscontainers >>>> 
-lkokkoscore -lkokkosalgorithms -lkokkoscontainers -lkokkoscore -lgtest >>>> -lpthread -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord >>>> -lscalapack -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd >>>> -lamd -lsuitesparseconfig -lsuperlu_dist -lHYPRE -lopenblas >>>> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lparmetis -lmetis -lm >>>> -lz -lstdc++ -ldl -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh >>>> -lmpi -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lpthread -lstdc++ >>>> -ldl >>>> ----------------------------------------- >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>
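As a rough cross-check of the "Reduct" column discussed earlier in the thread, the counts in the quoted event table can be turned into reductions per call. The sketch below is only illustrative: the numbers are copied by hand from the MatMult, BVDotVec, BVOrthogonalizeV and EPSSolve rows, the helper name reductions_per_call is made up for this example, and -log_view prints Reduct rounded to two significant digits, so the results are approximate.

```python
# Count and Reduct values copied by hand from the quoted -log_view table.
# Reduct is printed with two significant digits, so results are approximate.
events = {
    "MatMult":          {"count": 150986, "reduct": 1.2e6},
    "BVDotVec":         {"count": 301925, "reduct": 3.0e5},
    "BVOrthogonalizeV": {"count": 150996, "reduct": 3.0e5},
}
total_reduct = 1.5e6  # Reduct of the EPSSolve row, which encloses the events above


def reductions_per_call(count, reduct):
    """Average number of global reductions issued per call of an event."""
    return reduct / count


for name, e in events.items():
    per_call = reductions_per_call(e["count"], e["reduct"])
    share = 100.0 * e["reduct"] / total_reduct
    print(f"{name:18s} {per_call:5.2f} reductions/call, "
          f"{share:5.1f}% of EPSSolve reductions")
```

With these figures MatMult comes out at roughly eight reductions per call, against about one per BVDotVec call, which matches the 80 / 20 split in the %R column. Note that BVDotVec is nested inside BVOrthogonalizeV, so their shares overlap rather than add.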
