Hi,
I am looking at the single-node performance of MUMPS and SuperLU on a KNL
7230 (on Theta). I am using KSP example ex2
(http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html)
with m x n = 2880 x 2880. The KNL runs in cache and quad modes.
Times in seconds for 24 cores:
mumps: 279
superlu: 326
cg: 116
Times in seconds for 64 cores:
mumps: 316
superlu: 410
cg: 49
The performance on 24 cores is OK: both direct solvers are roughly 3.5
times slower than on 2x E5-2680v3. (According to people from Intel, the
single-core performance of KNL is about 3-4 times lower than that of the
E5-2680v3.) However, strong scalability is really bad.
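To put numbers on the scaling, here is a minimal Python sketch (not part of the attached script) that computes the speedup and parallel efficiency when going from 24 to 64 cores. The times are the ones quoted above; the names `times` and `strong_scaling` are only illustrative.

```python
# Wall-clock times in seconds, copied from the figures quoted above.
times = {
    "mumps":   {24: 279.0, 64: 316.0},
    "superlu": {24: 326.0, 64: 410.0},
    "cg":      {24: 116.0, 64: 49.0},
}


def strong_scaling(t_base, t_new, p_base, p_new):
    """Return (speedup, parallel efficiency) relative to the p_base run."""
    speedup = t_base / t_new
    # Ideal speedup for a fixed problem size is p_new / p_base.
    efficiency = speedup / (p_new / p_base)
    return speedup, efficiency


results = {
    name: strong_scaling(t[24], t[64], 24, 64) for name, t in times.items()
}

for name, (speedup, efficiency) in results.items():
    print(f"{name:8s} speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")
```

The ideal speedup from 24 to 64 cores is 64/24, about 2.67. CG reaches about 2.37, while both direct solvers come out below 1, i.e. they are slower on the full node than on 24 cores.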
I am using the cray-petsc/3.7.6.0 module. I also tried my own PETSc build
with MKL and with MUMPS/SuperLU installed by PETSc configure, but the
results are similar.
Please find attached the Theta submission script and the logs for KNL and Haswell.
Why is the performance of the direct solvers on a full node so bad?
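As an aid for reading the attached 24-core logs, the following Python sketch tabulates how the run time splits across the factorization and solve phases. The numbers are copied from the "Time (sec)" line and from the MatCholFctrSym/MatCholFctrNum, MatLUFactorSym/MatLUFactorNum and MatSolve rows of the logs; the dictionary layout and the function name `phase_fractions` are only illustrative.

```python
# Max times in seconds, copied from the attached 24-core -log_view output.
phases = {
    "mumps": {
        "total": 2.790e+02,     # Time (sec)
        "symbolic": 2.4392e+02,  # MatCholFctrSym
        "numeric": 2.1394e+01,   # MatCholFctrNum
        "solve": 1.0600e+01,     # MatSolve
    },
    "superlu_dist": {
        "total": 3.257e+02,     # Time (sec)
        "symbolic": 8.7190e-04,  # MatLUFactorSym
        "numeric": 3.1822e+02,   # MatLUFactorNum
        "solve": 6.5075e+00,     # MatSolve
    },
}


def phase_fractions(entry):
    """Return each phase's share of the total run time as a fraction."""
    total = entry["total"]
    return {k: v / total for k, v in entry.items() if k != "total"}


fractions = {name: phase_fractions(entry) for name, entry in phases.items()}

for name, frac in fractions.items():
    parts = ", ".join(f"{k} {v:5.1%}" for k, v in frac.items())
    print(f"{name:13s} {parts}")
```

This reproduces the %T column of the logs: with MUMPS about 87% of the time is logged under the symbolic factorization, whereas with SuperLU_DIST about 98% is logged under the numeric factorization.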
Best,
Jakub
#!/bin/bash --login
#COBALT -A ATPESC2017
#COBALT -t 60
#COBALT -n 1
#COBALT --attrs mcdram=cache:numa=quad
#COBALT -q debug-cache-quad
##COBALT --attrs mcdram=cache:numa=snc4
##COBALT -q training
export OMP_NUM_THREADS=1
#source /opt/intel/vtune_amplifier_xe_2017.3.0.510739/amplxe-vars.sh
#MUMPS VTune
#time aprun -e LD_LIBRARY_PATH=/opt/intel/vtune_amplifier_xe_2017.3.0.510739/lib64:$LD_LIBRARY_PATH -e PATH=/opt/intel/vtune_amplifier_xe_2017.3.0.510739/bin64:$PATH -n 1 --cc=depth -d 1 -j 1 amplxe-cl -collect advanced-hotspots -r log ./ex2 -m 362 -n 362 -options_left -ksp_type preonly -pc_type cholesky -pc_factor_mat_solver_package mumps -log_view
#CG VTune
#time aprun -e LD_LIBRARY_PATH=/opt/intel/vtune_amplifier_xe_2017.3.0.510739/lib64:$LD_LIBRARY_PATH -e PATH=/opt/intel/vtune_amplifier_xe_2017.3.0.510739/bin64:$PATH -n 1 --cc=depth -d 1 -j 1 amplxe-cl -collect advanced-hotspots -r logcg ./ex2 -m 362 -n 362 -options_left -ksp_type cg -pc_type none -log_view
#echo 'EXEC mumps 1'
#time aprun -n 1 --cc=depth -d 1 -j 1 ./ex2 -m 120 -n 120 -options_left -ksp_type preonly -pc_type cholesky -pc_factor_mat_solver_package mumps -log_view
#echo 'EXEC superlu 1'
#time aprun -n 1 --cc=depth -d 1 -j 1 ./ex2 -m 120 -n 120 -options_left -ksp_type preonly -pc_type cholesky -pc_factor_mat_solver_package superlu -log_view
#echo 'EXEC cg 1'
#time aprun -n 1 --cc=depth -d 1 -j 1 ./ex2 -m 120 -n 120 -options_left -ksp_type cg -pc_type none -log_view
#
echo 'EXEC mumps 24'
time aprun -n 24 --cc=depth -d 1 -j 1 ./ex2 -m 2880 -n 2880 -options_left -ksp_type preonly -pc_type cholesky -pc_factor_mat_solver_package mumps -log_view
echo 'EXEC superlu 24'
time aprun -n 24 --cc=depth -d 1 -j 1 ./ex2 -m 2880 -n 2880 -options_left -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -log_view
echo 'EXEC cg 24'
time aprun -n 24 --cc=depth -d 1 -j 1 ./ex2 -m 2880 -n 2880 -options_left -ksp_type cg -pc_type none -log_view
##full node
echo 'EXEC mumps 64'
time aprun -n 64 --cc=depth -d 1 -j 1 ./ex2 -m 2880 -n 2880 -options_left -ksp_type preonly -pc_type cholesky -pc_factor_mat_solver_package mumps -log_view
echo 'EXEC superlu 64'
time aprun -n 64 --cc=depth -d 1 -j 1 ./ex2 -m 2880 -n 2880 -options_left -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -log_view
echo 'EXEC cg 64'
time aprun -n 64 --cc=depth -d 1 -j 1 ./ex2 -m 2880 -n 2880 -options_left -ksp_type cg -pc_type none -log_view
EXEC mumps 24
Norm of error 4.84159e-08 iterations 1
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a mic-knl named nid03834 with 24 processors, by kruzik Fri Sep 1 08:45:53 2017
Using Petsc Release Version 3.7.6, Apr, 24, 2017
Max Max/Min Avg Total
Time (sec): 2.790e+02 1.00000 2.790e+02
Objects: 2.900e+01 1.00000 2.900e+01
Flops: 4.492e+06 1.00128 4.492e+06 1.078e+08
Flops/sec: 1.610e+04 1.00129 1.610e+04 3.864e+05
MPI Messages: 2.900e+01 1.93333 2.375e+01 5.700e+02
MPI Message Lengths: 5.220e+06 1.60024 1.757e+05 1.001e+08
MPI Reductions: 3.200e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.7852e+02 99.8% 1.0780e+08 100.0% 4.780e+02 83.9% 1.747e+05 99.5% 2.100e+01 65.6%
1: Assembly: 4.6272e-01 0.2% 0.0000e+00 0.0% 9.200e+01 16.1% 9.300e+02 0.5% 1.000e+01 31.2%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 1 1.0 1.7899e-02 1.1 3.11e+06 1.0 0.0e+00 0.0e+00 1.0e+00 0 69 0 1 3 0 69 0 1 5 4169
MatSolve 1 1.0 1.0600e+01 1.0 0.00e+00 0.0 4.8e+02 2.1e+05 4.0e+00 4 0 84 99 12 4 0100 99 19 0
MatCholFctrSym 1 1.0 2.4392e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 87 0 0 0 16 88 0 0 0 24 0
MatCholFctrNum 1 1.0 2.1394e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0
MatGetRowIJ 1 1.0 4.0531e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.1401e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNorm 1 1.0 9.2468e-0314.3 6.91e+05 1.0 0.0e+00 0.0e+00 1.0e+00 0 15 0 0 3 0 15 0 0 5 1794
VecSet 3 1.0 6.1807e-0280.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 9.7513e-04 1.6 6.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 17012
VecScatterBegin 3 1.0 9.7893e-02 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 67 9 0 0 0 67 14 0
VecScatterEnd 2 1.0 1.9369e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 3.0994e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 2.7602e+02 1.0 0.00e+00 0.0 4.8e+02 2.1e+05 1.6e+01 99 0 84 99 50 99 0100 99 76 0
PCSetUp 1 1.0 2.6542e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 95 0 0 0 38 95 0 0 0 57 0
PCApply 1 1.0 1.0600e+01 1.0 0.00e+00 0.0 4.8e+02 2.1e+05 4.0e+00 4 0 84 99 12 4 0100 99 19 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 8.8351e-03 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 6 2 0 0 0 20 0
MatAssemblyEnd 1 1.0 7.2102e-02 1.0 0.00e+00 0.0 9.2e+01 5.8e+03 8.0e+00 0 0 16 1 25 16 0100100 80 0
VecSet 1 1.0 5.2214e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 6 6 60863356 0.
Vector 6 7 77448616 0.
Vector Scatter 2 3 2800 0.
Index Set 7 7 3466300 0.
Krylov Solver 1 1 1160 0.
Preconditioner 1 1 976 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 2.17915e-05
Average time for zero size MPI_Send(): 2.95043e-06
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package mumps
-pc_type cholesky
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=8 --known-level1-dcache-linesize=64 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-x=0 --with-ssl=0 --with-shared-libraries=0 --with-dependencies=0 --with-mpi-lib="[]" --with-mpi-include="[]" --with-blas-lapack-lib="-L/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/lib -lsci_intel_mp" --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-superlu-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsuperlu" --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-superlu_dist-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsuperlu_dist" --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-parmetis-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lparmetis" --with-metis=1 --with-metis-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-metis-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lmetis" --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include 
--with-ptscotch-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-mumps-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lesmumps -lpord" --with-hdf5=1 --with-hdf5-include=/opt/cray/hdf5-parallel/1.10.0.3/INTEL/16.0/include --with-hdf5-lib="-L/opt/cray/hdf5-parallel/1.10.0.3/INTEL/16.0/lib -lhdf5_parallel -lz -ldl" --CFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --CPPFLAGS= --CXXFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --FFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=mic-knl --prefix=/opt/cray/pe/petsc/3.7.6.0/real/INTEL/16.0/mic_knl --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-hypre-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lHYPRE" --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-sundials-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package mumps
-pc_type cholesky
#End of PETSc Option Table entries
There are no unused options.
Application 3559443 resources: utime ~6459s, stime ~255s, Rss ~1556492, inblocks ~0, outblocks ~8
EXEC superlu 24
Norm of error 3.70407e-08 iterations 1
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a mic-knl named nid03834 with 24 processors, by kruzik Fri Sep 1 08:51:29 2017
Using Petsc Release Version 3.7.6, Apr, 24, 2017
Max Max/Min Avg Total
Time (sec): 3.257e+02 1.00000 3.257e+02
Objects: 2.100e+01 1.00000 2.100e+01
Flops: 4.492e+06 1.00128 4.492e+06 1.078e+08
Flops/sec: 1.379e+04 1.00128 1.379e+04 3.309e+05
MPI Messages: 4.000e+00 2.00000 3.833e+00 9.200e+01
MPI Message Lengths: 4.609e+04 2.00000 1.152e+04 1.060e+06
MPI Reductions: 2.200e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 3.2528e+02 99.9% 1.0780e+08 100.0% 0.000e+00 0.0% 5.760e+03 50.0% 1.100e+01 50.0%
1: Assembly: 4.6172e-01 0.1% 0.0000e+00 0.0% 9.200e+01 100.0% 5.762e+03 50.0% 1.000e+01 45.5%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 1 1.0 1.7704e-02 1.1 3.11e+06 1.0 0.0e+00 0.0e+00 1.0e+00 0 69 0 50 5 0 69 0100 9 4215
MatSolve 1 1.0 6.5075e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatLUFactorSym 1 1.0 8.7190e-04 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 3.1822e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 98 0 0 0 0 98 0 0 0 0 0
MatGetRowIJ 1 1.0 5.0068e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 1.0146e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNorm 1 1.0 3.2341e-03 3.0 6.91e+05 1.0 0.0e+00 0.0e+00 1.0e+00 0 15 0 0 5 0 15 0 0 9 5129
VecCopy 1 1.0 8.6808e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 2 1.0 8.1110e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 1.5659e-03 1.5 6.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 10594
VecScatterBegin 1 1.0 7.6699e-04 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 50 5 0 0 0100 9 0
VecScatterEnd 1 1.0 9.2983e-05 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 3.2457e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00100 0 0 0 27 100 0 0 0 55 0
PCSetUp 1 1.0 3.1826e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 98 0 0 0 27 98 0 0 0 55 0
PCApply 1 1.0 6.5075e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 6.9442e-03 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 9 1 0 0 0 20 0
MatAssemblyEnd 1 1.0 7.2409e-02 1.0 0.00e+00 0.0 9.2e+01 5.8e+03 8.0e+00 0 0100 50 36 16 0100100 80 0
VecSet 1 1.0 6.6996e-05 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 6 6 60855820 0.
Vector 3 4 8323912 0.
Vector Scatter 0 1 1072 0.
Index Set 4 4 1385504 0.
Krylov Solver 1 1 1160 0.
Preconditioner 1 1 992 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 2.77996e-05
Average time for zero size MPI_Send(): 2.87096e-06
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package superlu_dist
-pc_type lu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=8 --known-level1-dcache-linesize=64 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-x=0 --with-ssl=0 --with-shared-libraries=0 --with-dependencies=0 --with-mpi-lib="[]" --with-mpi-include="[]" --with-blas-lapack-lib="-L/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/lib -lsci_intel_mp" --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-superlu-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsuperlu" --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-superlu_dist-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsuperlu_dist" --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-parmetis-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lparmetis" --with-metis=1 --with-metis-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-metis-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lmetis" --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include 
--with-ptscotch-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-mumps-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lesmumps -lpord" --with-hdf5=1 --with-hdf5-include=/opt/cray/hdf5-parallel/1.10.0.3/INTEL/16.0/include --with-hdf5-lib="-L/opt/cray/hdf5-parallel/1.10.0.3/INTEL/16.0/lib -lhdf5_parallel -lz -ldl" --CFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --CPPFLAGS= --CXXFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --FFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=mic-knl --prefix=/opt/cray/pe/petsc/3.7.6.0/real/INTEL/16.0/mic_knl --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-hypre-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lHYPRE" --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-sundials-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package superlu_dist
-pc_type lu
#End of PETSc Option Table entries
There are no unused options.
Application 3559573 resources: utime ~7251s, stime ~641s, Rss ~2935384, inblocks ~0, outblocks ~8
EXEC cg 24
Norm of error 4.42164e-05 iterations 5073
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a mic-knl named nid03834 with 24 processors, by kruzik Fri Sep 1 08:53:39 2017
Using Petsc Release Version 3.7.6, Apr, 24, 2017
Max Max/Min Avg Total
Time (sec): 1.158e+02 1.00000 1.158e+02
Objects: 1.700e+01 1.00000 1.700e+01
Flops: 3.682e+10 1.00079 3.682e+10 8.836e+11
Flops/sec: 3.181e+08 1.00080 3.181e+08 7.634e+09
MPI Messages: 4.000e+00 2.00000 3.833e+00 9.200e+01
MPI Message Lengths: 1.169e+08 2.00000 2.923e+07 2.689e+09
MPI Reductions: 2.031e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.1529e+02 99.6% 8.8362e+11 100.0% 0.000e+00 0.0% 2.923e+07 100.0% 2.030e+04 99.9%
1: Assembly: 4.6260e-01 0.4% 0.0000e+00 0.0% 9.200e+01 100.0% 5.762e+03 0.0% 1.000e+01 0.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 5074 1.0 8.8045e+01 1.1 1.58e+10 1.0 0.0e+00 0.0e+00 5.1e+03 72 43 0100 25 72 43 0100 25 4301
VecTDot 10146 1.0 1.3462e+01 1.9 7.01e+09 1.0 0.0e+00 0.0e+00 1.0e+04 11 19 0 0 50 11 19 0 0 50 12502
VecNorm 5075 1.0 2.6521e+00 1.1 3.51e+09 1.0 0.0e+00 0.0e+00 5.1e+03 2 10 0 0 25 2 10 0 0 25 31744
VecCopy 5076 1.0 2.8991e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
VecSet 2 1.0 9.0909e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 10147 1.0 9.6226e+00 1.0 7.01e+09 1.0 0.0e+00 0.0e+00 0.0e+00 8 19 0 0 0 8 19 0 0 0 17493
VecAYPX 5072 1.0 4.6650e+00 1.0 3.51e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 10 0 0 0 4 10 0 0 0 18036
VecScatterBegin 5074 1.0 5.8432e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.1e+03 0 0 0100 25 0 0 0100 25 0
VecScatterEnd 5074 1.0 1.7175e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 2.5356e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 1.1508e+02 1.0 3.68e+10 1.0 0.0e+00 0.0e+00 2.0e+04 99100 0100100 100100 0100100 7677
PCSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 5074 1.0 2.9219e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 8.0721e-03 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 1 0 0 0 20 0
MatAssemblyEnd 1 1.0 7.1958e-02 1.0 0.00e+00 0.0 9.2e+01 5.8e+03 8.0e+00 0 0100 0 0 16 0100100 80 0
VecSet 1 1.0 5.1022e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 3 3 52551868 0.
Vector 6 7 16623256 0.
Vector Scatter 0 1 1072 0.
Krylov Solver 1 1 1232 0.
Preconditioner 1 1 816 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 4.00543e-06
Average time for zero size MPI_Send(): 1.80403e-05
#PETSc Option Table entries:
-ksp_type cg
-log_view
-m 2880
-n 2880
-options_left
-pc_type none
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=8 --known-level1-dcache-linesize=64 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-x=0 --with-ssl=0 --with-shared-libraries=0 --with-dependencies=0 --with-mpi-lib="[]" --with-mpi-include="[]" --with-blas-lapack-lib="-L/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/lib -lsci_intel_mp" --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-superlu-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsuperlu" --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-superlu_dist-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsuperlu_dist" --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-parmetis-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lparmetis" --with-metis=1 --with-metis-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-metis-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lmetis" --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include 
--with-ptscotch-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/17.06.1.1/INTEL/15.0/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-mumps-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lesmumps -lpord" --with-hdf5=1 --with-hdf5-include=/opt/cray/hdf5-parallel/1.10.0.3/INTEL/16.0/include --with-hdf5-lib="-L/opt/cray/hdf5-parallel/1.10.0.3/INTEL/16.0/lib -lhdf5_parallel -lz -ldl" --CFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --CPPFLAGS= --CXXFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --FFLAGS="-xMIC-AVX512 -qopenmp -O3 -fpic" --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=mic-knl --prefix=/opt/cray/pe/petsc/3.7.6.0/real/INTEL/16.0/mic_knl --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-hypre-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lHYPRE" --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/include --with-sundials-lib="-L/opt/cray/tpsl/17.06.1/INTEL/16.0/mic_knl/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"
#PETSc Option Table entries:
-ksp_type cg
-log_view
-m 2880
-n 2880
-options_left
-pc_type none
#End of PETSc Option Table entries
There are no unused options.
Application 3559576 resources: utime ~2772s, stime ~12s, Rss ~54076, inblocks ~0, outblocks ~8
EXEC mumps 64
Norm of error 4.84002e-08 iterations 1
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a mic-knl named nid03834 with 64 processors, by kruzik Fri Sep 1 08:59:05 2017
Using Petsc Release Version 3.7.6, Apr, 24, 2017
Max Max/Min Avg Total
Time (sec): 3.163e+02 1.00000 3.163e+02
Objects: 2.900e+01 1.00000 2.900e+01
Flops: 1.685e+06 1.00343 1.684e+06 1.078e+08
Flops/sec: 5.327e+03 1.00344 5.326e+03 3.409e+05
MPI Messages: 5.000e+01 2.08333 3.500e+01 2.240e+03
MPI Message Lengths: 2.372e+06 1.95677 4.572e+04 1.024e+08
MPI Reductions: 3.200e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 3.1605e+02 99.9% 1.0780e+08 100.0% 1.988e+03 88.8% 4.507e+04 98.6% 2.100e+01 65.6%
1: Assembly: 2.0319e-01 0.1% 0.0000e+00 0.0% 2.520e+02 11.2% 6.482e+02 1.4% 1.000e+01 31.2%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 1 1.0 1.0411e-02 1.6 1.17e+06 1.0 0.0e+00 0.0e+00 1.0e+00 0 69 0 1 3 0 69 0 1 5 7168
MatSolve 1 1.0 9.0226e+00 1.0 0.00e+00 0.0 2.0e+03 5.0e+04 4.0e+00 3 0 89 97 12 3 0 100 99 19 0
MatCholFctrSym 1 1.0 2.6055e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 82 0 0 0 16 82 0 0 0 24 0
MatCholFctrNum 1 1.0 3.5724e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 11 0 0 0 0 11 0 0 0 0 0
MatGetRowIJ 1 1.0 5.0068e-06 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.3347e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNorm 1 1.0 5.1069e-03 10.3 2.59e+05 1.0 0.0e+00 0.0e+00 1.0e+00 0 15 0 0 3 0 15 0 0 5 3248
VecSet 3 1.0 6.8530e-02 274.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 5.9485e-04 2.3 2.59e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 27887
VecScatterBegin 3 1.0 1.1337e-01 10.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 66 9 0 0 0 67 14 0
VecScatterEnd 2 1.0 9.8991e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 2.7180e-05 28.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 3.0541e+02 1.0 0.00e+00 0.0 2.0e+03 5.0e+04 1.6e+01 97 0 89 97 50 97 0 100 99 76 0
PCSetUp 1 1.0 2.9639e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 94 0 0 0 38 94 0 0 0 57 0
PCApply 1 1.0 9.0227e+00 1.0 0.00e+00 0.0 2.0e+03 5.0e+04 4.0e+00 3 0 89 97 12 3 0 100 99 19 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 1.5130e-02 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 6 7 0 0 0 20 0
MatAssemblyEnd 1 1.0 4.3766e-02 1.0 0.00e+00 0.0 2.5e+02 5.8e+03 8.0e+00 0 0 11 1 25 21 0 100 100 80 0
VecSet 1 1.0 6.5088e-05 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 6 6 22847356 0.
Vector 6 7 70536616 0.
Vector Scatter 2 3 2800 0.
Index Set 7 7 1181164 0.
Krylov Solver 1 1 1160 0.
Preconditioner 1 1 976 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 6.91414e-07
Average time for MPI_Barrier(): 2.47955e-05
Average time for zero size MPI_Send(): 2.86102e-06
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package mumps
-pc_type cholesky
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package mumps
-pc_type cholesky
#End of PETSc Option Table entries
There are no unused options.
Application 3559577 resources: utime ~17242s, stime ~3064s, Rss ~1517632, inblocks ~0, outblocks ~8
EXEC superlu 64
Norm of error 3.70406e-08 iterations 1
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a mic-knl named nid03834 with 64 processors, by kruzik Fri Sep 1 09:06:06 2017
Using Petsc Release Version 3.7.6, Apr, 24, 2017
Max Max/Min Avg Total
Time (sec): 4.096e+02 1.00000 4.096e+02
Objects: 2.100e+01 1.00000 2.100e+01
Flops: 1.685e+06 1.00343 1.684e+06 1.078e+08
Flops/sec: 4.113e+03 1.00343 4.113e+03 2.632e+05
MPI Messages: 4.000e+00 2.00000 3.938e+00 2.520e+02
MPI Message Lengths: 4.609e+04 2.00000 1.152e+04 2.904e+06
MPI Reductions: 2.200e+01 1.00000
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 4.0936e+02 100.0% 1.0780e+08 100.0% 0.000e+00 0.0% 5.760e+03 50.0% 1.100e+01 50.0%
1: Assembly: 2.0069e-01 0.0% 0.0000e+00 0.0% 2.520e+02 100.0% 5.762e+03 50.0% 1.000e+01 45.5%
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 1 1.0 1.0118e-02 1.5 1.17e+06 1.0 0.0e+00 0.0e+00 1.0e+00 0 69 0 50 5 0 69 0 100 9 7376
MatSolve 1 1.0 6.3445e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatLUFactorSym 1 1.0 2.0211e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 4.0098e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 98 0 0 0 0 98 0 0 0 0 0
MatGetRowIJ 1 1.0 5.0068e-06 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.2703e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNorm 1 1.0 3.1960e-03 2.6 2.59e+05 1.0 0.0e+00 0.0e+00 1.0e+00 0 15 0 0 5 0 15 0 0 9 5190
VecCopy 1 1.0 3.4308e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 2 1.0 2.6298e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 1.0440e-03 1.8 2.59e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 15889
VecScatterBegin 1 1.0 3.8829e-03 15.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 50 5 0 0 0 100 9 0
VecScatterEnd 1 1.0 2.1410e-04 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 3.0994e-06 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 4.0713e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 99 0 0 0 27 99 0 0 0 55 0
PCSetUp 1 1.0 4.0108e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 98 0 0 0 27 98 0 0 0 55 0
PCApply 1 1.0 6.3445e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 1.2871e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 9 5 0 0 0 20 0
MatAssemblyEnd 1 1.0 4.3551e-02 1.0 0.00e+00 0.0 2.5e+02 5.8e+03 8.0e+00 0 0 100 50 36 21 0 100 100 80 0
VecSet 1 1.0 8.9884e-05 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 6 6 22839820 0.
Vector 3 4 3139912 0.
Vector Scatter 0 1 1072 0.
Index Set 4 4 521504 0.
Krylov Solver 1 1 1160 0.
Preconditioner 1 1 992 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 2.52247e-05
Average time for zero size MPI_Send(): 4.28036e-06
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package superlu_dist
-pc_type lu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package superlu_dist
-pc_type lu
#End of PETSc Option Table entries
There are no unused options.
Application 3559578 resources: utime ~18785s, stime ~7931s, Rss ~2388096, inblocks ~0, outblocks ~8
EXEC cg 64
Norm of error 4.42164e-05 iterations 5073
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a mic-knl named nid03834 with 64 processors, by kruzik Fri Sep 1 09:07:13 2017
Using Petsc Release Version 3.7.6, Apr, 24, 2017
Max Max/Min Avg Total
Time (sec): 4.898e+01 1.00002 4.898e+01
Objects: 1.700e+01 1.00000 1.700e+01
Flops: 1.381e+10 1.00212 1.381e+10 8.836e+11
Flops/sec: 2.819e+08 1.00214 2.819e+08 1.804e+10
MPI Messages: 4.000e+00 2.00000 3.938e+00 2.520e+02
MPI Message Lengths: 1.169e+08 2.00000 2.923e+07 7.366e+09
MPI Reductions: 2.031e+04 1.00000
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 4.8775e+01 99.6% 8.8362e+11 100.0% 0.000e+00 0.0% 2.923e+07 100.0% 2.030e+04 99.9%
1: Assembly: 2.0200e-01 0.4% 0.0000e+00 0.0% 2.520e+02 100.0% 5.762e+03 0.0% 1.000e+01 0.0%
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 5074 1.0 3.4802e+01 1.1 5.92e+09 1.0 0.0e+00 0.0e+00 5.1e+03 66 43 0 100 25 66 43 0 100 25 10880
VecTDot 10146 1.0 6.0924e+00 1.8 2.63e+09 1.0 0.0e+00 0.0e+00 1.0e+04 12 19 0 0 50 12 19 0 0 50 27626
VecNorm 5075 1.0 1.3568e+00 1.1 1.32e+09 1.0 0.0e+00 0.0e+00 5.1e+03 3 10 0 0 25 3 10 0 0 25 62051
VecCopy 5076 1.0 2.4999e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
VecSet 2 1.0 3.0589e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 10147 1.0 4.3560e+00 1.0 2.63e+09 1.0 0.0e+00 0.0e+00 0.0e+00 9 19 0 0 0 9 19 0 0 0 38642
VecAYPX 5072 1.0 2.0096e+00 1.0 1.31e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 10 0 0 0 4 10 0 0 0 41868
VecScatterBegin 5074 1.0 7.7954e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.1e+03 2 0 0 100 25 2 0 0 100 25 0
VecScatterEnd 5074 1.0 1.7750e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 6.2883e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 4.8374e+01 1.0 1.38e+10 1.0 0.0e+00 0.0e+00 2.0e+04 99 100 0 100 100 99 100 0 100 100 18264
PCSetUp 1 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 5074 1.0 2.5297e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 1.5381e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 7 0 0 0 20 0
MatAssemblyEnd 1 1.0 4.2407e-02 1.0 0.00e+00 0.0 2.5e+02 5.8e+03 8.0e+00 0 0 100 0 0 21 0 100 100 80 0
VecSet 1 1.0 5.2929e-05 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 3 3 19719868 0.
Vector 6 7 6255256 0.
Vector Scatter 0 1 1072 0.
Krylov Solver 1 1 1232 0.
Preconditioner 1 1 816 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 5.96046e-07
Average time for MPI_Barrier(): 6.58035e-06
Average time for zero size MPI_Send(): 2.02991e-05
#PETSc Option Table entries:
-ksp_type cg
-log_view
-m 2880
-n 2880
-options_left
-pc_type none
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
#PETSc Option Table entries:
-ksp_type cg
-log_view
-m 2880
-n 2880
-options_left
-pc_type none
#End of PETSc Option Table entries
There are no unused options.
Application 3559581 resources: utime ~3105s, stime ~46s, Rss ~24944, inblocks ~0, outblocks ~8
EXEC mumps 24
Norm of error 5.01395e-08 iterations 1
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a icc17-impi-mkl-opt named r26u23n707 with 24 processors, by kru0097 Fri Sep 1 10:18:45 2017
Using Petsc Release Version 3.7.4, Oct, 02, 2016
Max Max/Min Avg Total
Time (sec): 8.099e+01 1.00000 8.099e+01
Objects: 2.900e+01 1.00000 2.900e+01
Flops: 4.492e+06 1.00128 4.492e+06 1.078e+08
Flops/sec: 5.547e+04 1.00128 5.546e+04 1.331e+06
MPI Messages: 4.350e+01 1.93333 3.562e+01 8.550e+02
MPI Message Lengths: 6.626e+06 1.49442 1.563e+05 1.336e+08
MPI Reductions: 3.000e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 8.0874e+01 99.9% 1.0780e+08 100.0% 7.630e+02 89.2% 1.556e+05 99.6% 1.900e+01 63.3%
1: Assembly: 1.1931e-01 0.1% 0.0000e+00 0.0% 9.200e+01 10.8% 6.200e+02 0.4% 1.000e+01 33.3%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 1 1.0 6.9609e-03 1.0 3.11e+06 1.0 4.6e+01 2.3e+04 0.0e+00 0 69 5 1 0 0 69 6 1 0 10721
MatSolve 1 1.0 4.6740e+00 1.0 0.00e+00 0.0 7.2e+02 1.8e+05 3.0e+00 6 0 84 99 10 6 0 94 99 16 0
MatCholFctrSym 1 1.0 6.7636e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 84 0 0 0 17 84 0 0 0 26 0
MatCholFctrNum 1 1.0 8.3748e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 10 0 0 0 0 10 0 0 0 0 0
MatGetRowIJ 1 1.0 6.9141e-06 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.1758e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNorm 1 1.0 2.4281e-03 14.8 6.91e+05 1.0 0.0e+00 0.0e+00 1.0e+00 0 15 0 0 3 0 15 0 0 5 6832
VecSet 3 1.0 4.6908e-02 27.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 1.0700e-03 2.1 6.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 15503
VecScatterBegin 3 1.0 1.6796e-02 5.8 0.00e+00 0.0 2.8e+02 3.5e+05 1.0e+00 0 0 33 75 3 0 0 37 75 5 0
VecScatterEnd 2 1.0 5.8331e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 8.0708e+01 1.0 0.00e+00 0.0 7.2e+02 1.8e+05 1.5e+01 100 0 84 99 50 100 0 94 99 79 0
PCSetUp 1 1.0 7.6033e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 94 0 0 0 40 94 0 0 0 63 0
PCApply 1 1.0 4.6741e+00 1.0 0.00e+00 0.0 7.2e+02 1.8e+05 3.0e+00 6 0 84 99 10 6 0 94 99 16 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 1.7312e-03 12.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 7 1 0 0 0 20 0
MatAssemblyEnd 1 1.0 2.8178e-02 1.0 0.00e+00 0.0 9.2e+01 5.8e+03 8.0e+00 0 0 11 0 27 24 0 100 100 80 0
VecSet 1 1.0 2.6941e-05 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 6 6 60862020 0.
Vector 6 7 77448616 0.
Vector Scatter 2 3 2800 0.
Index Set 7 7 3466300 0.
Krylov Solver 1 1 1160 0.
Preconditioner 1 1 976 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 3.38554e-06
Average time for zero size MPI_Send(): 1.70867e-06
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package mumps
-pc_type cholesky
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-level1-dcache-size=32768 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --CFLAGS="-fPIC -O3 -xHost" --CXXFLAGS="-fPIC -O3 -xHost" --FFLAGS="-fPIC -O3 -xHost" --with-blas-lapack-lib="[/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin/libmkl_intel_lp64.a,libmkl_sequential.a,libmkl_core.a]" --with-c++-support --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-gnu-compilers=0 --with-mpi=1 --known-mpi-shared-libraries=0 --with-ar=ar --with-batch=1 --download-metis --download-mumps --download-parmetis --download-superlu_dist --download-superlu --download-suitesparse --with-scalapack-include=/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/include --with-scalapack-lib="[/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin/libmkl_scalapack_lp64.a,libmkl_blacs_intelmpi_lp64.a,libmkl_intel_lp64.a,libmkl_sequential.a,libmkl_core.a]" --with-scalapack=1 --with-shared-libraries=0 --with-windows-graphics=0 --with-x=0 --with-pic=1 PETSC_ARCH=icc17-impi-mkl-opt
-----------------------------------------
Libraries compiled on Tue Dec 6 14:50:11 2016 on login1
Machine characteristics: Linux-2.6.32-573.12.1.el6.noc0w.x86_64-x86_64-with-centos-6.7-Final
Using PETSc directory: /scratch/work/project/permon/petsc/petsc-3.7.4
Using PETSc arch: icc17-impi-mkl-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O3 -xHost ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -O3 -xHost ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -lpetsc -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib64 -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib64 -lsuperlu -Wl,-rpath,/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lhwloc -lssl -lcrypto -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib 
-L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib -lifport -lifcoremt_pic -lm -lmpicxx -ldl -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin 
-L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt 
-L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 
-L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-G
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package mumps
-pc_type cholesky
#End of PETSc Option Table entries
There are no unused options.
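The 24-process MUMPS log above can be read as a time breakdown of KSPSolve. The following is a minimal sketch, not part of the original run: the numbers are copied from the event table, and the helper name `time_shares` is hypothetical. It assumes MatCholFctrSym, MatCholFctrNum and MatSolve together account for essentially all of KSPSolve, which the %T column supports.

```python
# Times (seconds, max over processes) copied from the MUMPS -log_view table above.
events = {
    "MatCholFctrSym": 6.7636e+01,  # symbolic Cholesky factorization
    "MatCholFctrNum": 8.3748e+00,  # numeric Cholesky factorization
    "MatSolve": 4.6740e+00,        # triangular solves
}
ksp_solve = 8.0708e+01  # KSPSolve time, which encloses the three events above


def time_shares(events, total):
    """Return each event's share of the enclosing total time, in percent."""
    return {name: 100.0 * t / total for name, t in events.items()}


shares = time_shares(events, ksp_solve)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {pct:5.1f} %")
```

On these numbers the symbolic factorization takes the large majority of the solve time, and the numeric factorization only about a tenth. That suggests the symbolic (analysis) step is the first place to look when comparing runs with different process counts.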
EXEC superlu 24
Norm of error 3.59803e-08 iterations 1
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a icc17-impi-mkl-opt named r26u23n707 with 24 processors, by kru0097 Fri Sep 1 10:20:12 2017
Using Petsc Release Version 3.7.4, Oct, 02, 2016
Max Max/Min Avg Total
Time (sec): 8.672e+01 1.00000 8.672e+01
Objects: 2.100e+01 1.00000 2.100e+01
Flops: 4.492e+06 1.00128 4.492e+06 1.078e+08
Flops/sec: 5.180e+04 1.00128 5.180e+04 1.243e+06
MPI Messages: 6.000e+00 2.00000 5.750e+00 1.380e+02
MPI Message Lengths: 6.913e+04 2.00000 1.152e+04 1.590e+06
MPI Reductions: 2.100e+01 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 8.6585e+01 99.8% 1.0780e+08 100.0% 4.600e+01 33.3% 7.680e+03 66.7% 1.000e+01 47.6%
1: Assembly: 1.3247e-01 0.2% 0.0000e+00 0.0% 9.200e+01 66.7% 3.841e+03 33.3% 1.000e+01 47.6%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 1 1.0 6.9120e-03 1.0 3.11e+06 1.0 4.6e+01 2.3e+04 0.0e+00 0 69 33 67 0 0 69 100 100 0 10797
MatSolve 1 1.0 1.8347e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatLUFactorSym 1 1.0 1.0290e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 8.4601e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 98 0 0 0 0 98 0 0 0 0 0
MatGetRowIJ 1 1.0 5.0068e-06 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.1930e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNorm 1 1.0 1.4660e-03 3.2 6.91e+05 1.0 0.0e+00 0.0e+00 1.0e+00 0 15 0 0 5 0 15 0 0 10 11315
VecCopy 1 1.0 1.6260e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 2 1.0 1.9021e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 1 1.0 1.7030e-03 1.1 6.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 15 0 0 0 0 15 0 0 0 9741
VecScatterBegin 1 1.0 1.5783e-04 1.8 0.00e+00 0.0 4.6e+01 2.3e+04 0.0e+00 0 0 33 67 0 0 0 100 100 0 0
VecScatterEnd 1 1.0 9.8228e-05 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 8.6448e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 100 0 0 0 29 100 0 0 0 60 0
PCSetUp 1 1.0 8.4612e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 98 0 0 0 29 98 0 0 0 60 0
PCApply 1 1.0 1.8347e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 1.6551e-03 11.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 10 0 0 0 0 20 0
MatAssemblyEnd 1 1.0 4.1639e-02 1.0 0.00e+00 0.0 9.2e+01 5.8e+03 8.0e+00 0 0 67 33 38 31 0 100 100 80 0
VecSet 1 1.0 2.6941e-05 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 6 6 60855828 0.
Vector 3 4 8323912 0.
Vector Scatter 0 1 1072 0.
Index Set 4 4 1385504 0.
Krylov Solver 1 1 1160 0.
Preconditioner 1 1 992 0.
Viewer 1 0 0 0.
--- Event Stage 1: Assembly
Vector 2 1 1648 0.
Vector Scatter 1 0 0 0.
Index Set 2 2 13072 0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 3.00407e-06
Average time for zero size MPI_Send(): 1.41064e-06
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package superlu_dist
-pc_type lu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-level1-dcache-size=32768 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --CFLAGS="-fPIC -O3 -xHost" --CXXFLAGS="-fPIC -O3 -xHost" --FFLAGS="-fPIC -O3 -xHost" --with-blas-lapack-lib="[/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin/libmkl_intel_lp64.a,libmkl_sequential.a,libmkl_core.a]" --with-c++-support --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-gnu-compilers=0 --with-mpi=1 --known-mpi-shared-libraries=0 --with-ar=ar --with-batch=1 --download-metis --download-mumps --download-parmetis --download-superlu_dist --download-superlu --download-suitesparse --with-scalapack-include=/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/include --with-scalapack-lib="[/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin/libmkl_scalapack_lp64.a,libmkl_blacs_intelmpi_lp64.a,libmkl_intel_lp64.a,libmkl_sequential.a,libmkl_core.a]" --with-scalapack=1 --with-shared-libraries=0 --with-windows-graphics=0 --with-x=0 --with-pic=1 PETSC_ARCH=icc17-impi-mkl-opt
-----------------------------------------
Libraries compiled on Tue Dec 6 14:50:11 2016 on login1
Machine characteristics: Linux-2.6.32-573.12.1.el6.noc0w.x86_64-x86_64-with-centos-6.7-Final
Using PETSc directory: /scratch/work/project/permon/petsc/petsc-3.7.4
Using PETSc arch: icc17-impi-mkl-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O3 -xHost ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -O3 -xHost ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -lpetsc -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib64 -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib64 -lsuperlu -Wl,-rpath,/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lhwloc -lssl -lcrypto -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib 
-L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib -lifport -lifcoremt_pic -lm -lmpicxx -ldl -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin 
-L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt 
-L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 
-L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-G
#PETSc Option Table entries:
-ksp_type preonly
-log_view
-m 2880
-n 2880
-options_left
-pc_factor_mat_solver_package superlu_dist
-pc_type lu
#End of PETSc Option Table entries
There are no unused options.
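The "Total Mflop/s" column can be cross-checked against the legend formula. The sketch below is an illustration, not part of the original run: it uses the MatMult row of the 24-process SuperLU_DIST log above, and the helper name `total_mflops` is hypothetical. It assumes the flops are evenly balanced (the Flops ratio is 1.0), so the sum over processes is approximated by the per-process maximum times the process count. Note that the legend prints the factor as 10e-6, while the tabulated values are consistent with 1e-6 (flops per second converted to Mflop/s).

```python
# Values copied from the MatMult row of the SuperLU_DIST -log_view table above.
max_flops = 3.11e+06   # "Flops Max" per process (ratio 1.0, so roughly equal on all ranks)
max_time = 6.9120e-03  # "Time (sec) Max"
nprocs = 24            # number of MPI processes in this run
reported = 10797       # value printed in the "Total Mflop/s" column


def total_mflops(max_flops, nprocs, max_time):
    """Estimate total Mflop/s, approximating the flop sum as nprocs * max_flops."""
    return 1e-6 * (max_flops * nprocs) / max_time


estimate = total_mflops(max_flops, nprocs, max_time)
print(f"estimated {estimate:.0f} Mflop/s, reported {reported} Mflop/s")
```

The estimate agrees with the reported value up to the rounding of the three-digit flop count in the table. A factor of 10e-6 would give a result ten times larger.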
EXEC cg 24
Norm of error 4.42164e-05 iterations 5073
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex2 on a icc17-impi-mkl-opt named r26u23n707 with 24 processors, by kru0097 Fri Sep 1 10:21:33 2017
Using Petsc Release Version 3.7.4, Oct, 02, 2016
Max Max/Min Avg Total
Time (sec): 8.033e+01 1.00000 8.033e+01
Objects: 1.700e+01 1.00000 1.700e+01
Flops: 3.682e+10 1.00079 3.682e+10 8.836e+11
Flops/sec: 4.584e+08 1.00080 4.584e+08 1.100e+10
MPI Messages: 1.015e+04 2.00000 9.729e+03 2.335e+05
MPI Message Lengths: 2.338e+08 2.00000 2.303e+04 5.378e+09
MPI Reductions: 1.524e+04 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 8.0208e+01 99.9% 8.8362e+11 100.0% 2.334e+05 100.0% 2.303e+04 100.0% 1.522e+04 99.9%
1: Assembly: 1.1718e-01 0.1% 0.0000e+00 0.0% 9.200e+01 0.0% 2.270e+00 0.0% 1.000e+01 0.1%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 5074 1.0 3.3480e+01 1.0 1.58e+10 1.0 2.3e+05 2.3e+04 0.0e+00 41 43 100 100 0 41 43 100 100 0 11310
VecTDot 10146 1.0 1.3470e+01 1.1 7.01e+09 1.0 0.0e+00 0.0e+00 1.0e+04 16 19 0 0 67 16 19 0 0 67 12495
VecNorm 5075 1.0 4.5686e+00 1.2 3.51e+09 1.0 0.0e+00 0.0e+00 5.1e+03 5 10 0 0 33 5 10 0 0 33 18428
VecCopy 5076 1.0 7.1750e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0
VecSet 2 1.0 2.2919e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 10147 1.0 1.6098e+01 1.0 7.01e+09 1.0 0.0e+00 0.0e+00 0.0e+00 20 19 0 0 0 20 19 0 0 0 10456
VecAYPX 5072 1.0 7.2772e+00 1.0 3.51e+09 1.0 0.0e+00 0.0e+00 0.0e+00 9 10 0 0 0 9 10 0 0 0 11562
VecScatterBegin 5074 1.0 2.7446e-01 2.0 0.00e+00 0.0 2.3e+05 2.3e+04 0.0e+00 0 0100100 0 0 0100100 0 0
VecScatterEnd 5074 1.0 4.0531e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetUp 1 1.0 4.7140e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 8.0175e+01 1.0 3.68e+10 1.0 2.3e+05 2.3e+04 1.5e+04100100100100100 100100100100100 11020
PCSetUp 1 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 5074 1.0 7.1815e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0
--- Event Stage 1: Assembly
MatAssemblyBegin 1 1.0 1.3180e-03 9.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 1 0 0 0 20 0
MatAssemblyEnd 1 1.0 2.8013e-02 1.0 0.00e+00 0.0 9.2e+01 5.8e+03 8.0e+00 0 0 0 0 0 24 0100100 80 0
VecSet 1 1.0 2.5034e-05 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
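A quick cross-check of the Total Mflop/s column, added as a reading aid and not part of the PETSc output. The minimal Python sketch below recomputes the KSPSolve rate from two numbers in this log. It assumes the Main Stage flops figure (8.8362e+11) is the sum over all processes, which is consistent with the per-process maximum of 3.68e+10 on 24 ranks. It also uses a scale factor of 1e-6 rather than the "10e-6" printed in the legend, because only 1e-6 reproduces the reported values. The variable and function names are illustrative only.

```python
# Values copied from the log: Main Stage summary row and KSPSolve event row.
total_flops = 8.8362e11        # flops summed over all processes (assumed to be a sum)
max_time_s = 8.0175e+01        # KSPSolve max time over all processes, in seconds
max_flops_per_rank = 3.68e10   # KSPSolve max flops on a single process


def total_mflops(flops_sum, time_max):
    """Aggregate rate in Mflop/s: 1e-6 * sum(flops) / max(time)."""
    return 1e-6 * flops_sum / time_max


rate = total_mflops(total_flops, max_time_s)

# With a flops ratio of 1.0 (perfect balance), sum / max estimates the rank count.
ranks_estimate = total_flops / max_flops_per_rank

print(f"KSPSolve: {rate:.0f} Mflop/s on about {ranks_estimate:.0f} ranks")
```

The result agrees with the 11020 Mflop/s reported for KSPSolve up to rounding of the printed inputs, and the rank estimate matches a 24-process run.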
Memory usage is given in bytes:
Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
              Matrix     3              3     52551868     0.
              Vector     6              7     16623256     0.
      Vector Scatter     0              1         1072     0.
       Krylov Solver     1              1         1232     0.
      Preconditioner     1              1          816     0.
              Viewer     1              0            0     0.
--- Event Stage 1: Assembly
              Vector     2              1         1648     0.
      Vector Scatter     1              0            0     0.
           Index Set     2              2        13072     0.
========================================================================================================================
Average time to get PetscTime(): 0.
Average time for MPI_Barrier(): 9.58443e-06
Average time for zero size MPI_Send(): 4.91738e-06
#PETSc Option Table entries:
-ksp_type cg
-log_view
-m 2880
-n 2880
-options_left
-pc_type none
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-level1-dcache-size=32768 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --CFLAGS="-fPIC -O3 -xHost" --CXXFLAGS="-fPIC -O3 -xHost" --FFLAGS="-fPIC -O3 -xHost" --with-blas-lapack-lib="[/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin/libmkl_intel_lp64.a,libmkl_sequential.a,libmkl_core.a]" --with-c++-support --with-cc=mpicc --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-gnu-compilers=0 --with-mpi=1 --known-mpi-shared-libraries=0 --with-ar=ar --with-batch=1 --download-metis --download-mumps --download-parmetis --download-superlu_dist --download-superlu --download-suitesparse --with-scalapack-include=/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/include --with-scalapack-lib="[/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin/libmkl_scalapack_lp64.a,libmkl_blacs_intelmpi_lp64.a,libmkl_intel_lp64.a,libmkl_sequential.a,libmkl_core.a]" --with-scalapack=1 --with-shared-libraries=0 --with-windows-graphics=0 --with-x=0 --with-pic=1 PETSC_ARCH=icc17-impi-mkl-opt
-----------------------------------------
Libraries compiled on Tue Dec 6 14:50:11 2016 on login1
Machine characteristics: Linux-2.6.32-573.12.1.el6.noc0w.x86_64-x86_64-with-centos-6.7-Final
Using PETSc directory: /scratch/work/project/permon/petsc/petsc-3.7.4
Using PETSc arch: icc17-impi-mkl-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O3 -xHost ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -O3 -xHost ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/include -I/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -lpetsc -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib -lsuperlu_dist -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -Wl,-rpath,/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib64 -L/scratch/work/project/permon/petsc/petsc-3.7.4/icc17-impi-mkl-opt/lib64 -lsuperlu -Wl,-rpath,/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64_lin -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lumfpack -lklu -lcholmod -lbtf -lccolamd -lcolamd -lcamd -lamd -lsuitesparseconfig -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lhwloc -lssl -lcrypto -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib 
-L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib -lifport -lifcoremt_pic -lm -lmpicxx -ldl -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin 
-L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt -Wl,-rpath,/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/2017.0.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib/release_mt 
-L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/intel64/lib -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/binutils/2.26-GCCcore-5.4.0/lib -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64_lin -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/GCCcore/5.4.0/lib64 -L/apps/all/ncurses/6.0-intel-2017.00/lib -L/apps/all/imkl/2017.0.098-iimpi-2017.00-GCC-5.4.0-2.26/mkl/lib/intel64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/impi/2017.0.098-iccifort-2017.0.098-GCC-5.4.0-2.26/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/lib64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/mpi/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 
-L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/ifort/2017.0.098-GCC-5.4.0-2.26/lib -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/compilers_and_libraries_2017.0.098/linux/compiler/lib/intel64 -L/apps/all/icc/2017.0.098-GCC-5.4.0-2.26/lib/intel64 -L/apps/all/icc/2017.0.098-G
#PETSc Option Table entries:
-ksp_type cg
-log_view
-m 2880
-n 2880
-options_left
-pc_type none
#End of PETSc Option Table entries
There are no unused options.