Thank you all for your replies!

> Are you using a KSP/PC configuration which should weak scale?
Yes, the system is solved with KSPSolve. There is no preconditioner yet, but I fixed the number of CG iterations to 3 to ensure an apples-to-apples comparison during the scaling measurements.
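For concreteness, the solver setup is essentially the following (a minimal sketch, not the exact code from our application; it just mirrors the options --emg_solver_type cg, --emg_preconditioner_type none and --emg_solver_maxit 3 that appear in the attached logs):

/* Sketch: unpreconditioned CG, limited to 3 iterations per solve. */
#include <petscksp.h>

PetscErrorCode SolveActivation(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCNONE);CHKERRQ(ierr);                /* no preconditioner yet */
  ierr = KSPSetTolerances(ksp, PETSC_DEFAULT, PETSC_DEFAULT,
                          PETSC_DEFAULT, 3);CHKERRQ(ierr);   /* cap at 3 CG iterations */
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}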

> VecScatter has been greatly refactored (and the default implementation is entirely new) since 3.7.

I have now tried PETSc 3.11 and the code runs fine. The communication now seems to show better weak-scaling behavior.

I'll see if we can just upgrade to 3.11.



> Anyway, I'm curious about your configuration and how you determine that MPI_Alltoallv/MPI_Alltoallw is being used.
I used the Extrae profiler, which intercepts all MPI calls and logs them to a file. This showed that Alltoall calls were indeed being used for the communication, which I found surprising. With PETSc 3.11 the Alltoall calls are replaced by MPI_Start/MPI_Startall and MPI_Wait/MPI_Waitall, which sounds more reasonable to me.
> This has never been a default code path, so I suspect something in your environment or code making this happen.

I attached log files for PETSc 3.7 runs on 1, 19, and 115 nodes (24 cores each), which suggest polynomial rather than logarithmic scaling. Could it be an installation setting of this PETSc version? (I use a preinstalled PETSc.)

> Can you please send representative log files which characterize the lack of scaling (include the full log_view)?

"Stage 1: activation" is the stage of interest, as it wraps the KSPSolve. The number of unkowns per rank is very small in the measurement, so most of the time should be communication. However, I just noticed, that the stage also contains an additional setup step which might be the reason why the MatMul takes longer than the KSPSolve. I can repeat the measurements if necessary. I should add, that I put a MPI_Barrier before the KSPSolve, to avoid any previous work imbalance to effect the KSPSolve call.


Best regards,
Felix

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

... on a haswell named nid04236 with 18 processors, by me Fri Jan 24 14:00:09 2020
Using Petsc Release Version 3.7.6, Apr, 24, 2017 

                         Max       Max/Min        Avg      Total 
Time (sec):           1.955e+01      1.00000   1.955e+01
Objects:              2.677e+04      1.26112   2.300e+04
Flops:                2.304e+08      1.55728   1.977e+08  3.559e+09
Flops/sec:            1.179e+07      1.55728   1.012e+07  1.821e+08
MPI Messages:         3.953e+03      1.28469   3.426e+03  6.168e+04
MPI Message Lengths:  7.619e+06      1.76512   1.608e+03  9.915e+07
MPI Reductions:       4.489e+04      1.16111

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.0610e+01  54.3%  0.0000e+00   0.0%  5.456e+03   8.8%  1.421e+00        0.1%  2.540e+03   5.7% 
 1:      activation: 3.8488e+00  19.7%  1.6746e+06   0.0%  0.000e+00   0.0%  1.497e+00        0.1%  2.200e+01   0.0% 
 2:  activation_rhs: 6.9804e-06   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0% 
 3:             run: 5.0884e+00  26.0%  3.5578e+09 100.0%  5.622e+04  91.2%  1.605e+03       99.8%  3.811e+04  84.9% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: activation

VecTDot                7 1.0 5.9128e-05 1.4 5.59e+03 1.7 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  0   0  5  0  0 32  1437
VecNorm                5 1.0 7.2002e-05 2.2 4.00e+03 1.7 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0   0  4  0  0 23   844
VecCopy                6 1.0 2.8610e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                6 1.0 6.9141e-06 1.8 4.80e+03 1.7 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  4  0  0  0 10545
VecAYPX                3 1.0 2.1458e-06 0.0 2.00e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  2  0  0  0 14158
VecScatterBegin        5 1.0 6.2842e+00104594.2 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 18  0  0  0  0  90  0  0100 23     0
VecScatterEnd          5 1.0 6.9141e-06 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult                5 1.0 6.2843e+0034957.9 9.01e+04 1.4 0.0e+00 0.0e+00 5.0e+00 18  0  0  0  0  90 85  0100 23     0
KSPSetUp               1 1.0 2.7895e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 3.8600e-04 1.0 8.85e+04 1.5 0.0e+00 0.0e+00 2.1e+01  0  0  0  0  0   0 83  0 80 95  3600
PCSetUp                1 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply                5 1.0 5.9605e-06 6.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 2: activation_rhs


--- Event Stage 3: run

VecMDot             4876 1.0 6.8255e-02 1.5 6.03e+07 1.7 0.0e+00 0.0e+00 4.9e+03  0 26  0  0 11   1 26  0  0 13 13416
VecTDot              336 1.3 6.2203e-04 1.5 4.97e+05 1.3 0.0e+00 0.0e+00 2.9e+02  0  0  0  0  1   0  0  0  0  1 12288
VecNorm             5369 1.0 4.0783e-02 1.2 4.52e+06 1.6 0.0e+00 0.0e+00 5.3e+03  0  2  0  0 12   1  2  0  0 14  1686
VecScale            5039 1.0 4.8268e-03 2.9 2.02e+06 1.7 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  6343
VecCopy             6019 1.0 1.4820e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet              3361 1.2 2.5723e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1798 1.2 4.0171e-03 1.1 6.75e+06 1.3 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0 25923
VecAYPX               12 0.0 7.8678e-06 0.0 1.78e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 15801
VecMAXPY            5039 1.0 9.2657e-03 1.6 6.43e+07 1.7 0.0e+00 0.0e+00 0.0e+00  0 27  0  0  0   0 27  0  0  0 105369
VecScatterBegin     5808 1.0 5.4529e+00146.7 0.00e+00 0.0 0.0e+00 0.0e+00 5.7e+03  4  0  0 94 13  17  0  0 94 15     0
VecScatterEnd       5808 1.0 1.8566e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize        5039 1.0 4.0384e-02 1.1 6.05e+06 1.7 0.0e+00 0.0e+00 5.0e+03  0  3  0  0 11   1  3  0  0 13  2274
MatMult             5206 1.0 1.1557e-01 1.2 9.14e+07 1.4 0.0e+00 0.0e+00 5.2e+03  1 41  0 94 12   2 41  0 94 14 12512
MatScale              82 1.3 2.6333e-03 1.1 1.89e+05 1.3 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1109
MatAssemblyBegin    1147 1.3 1.6677e-0116.5 0.00e+00 0.0 8.5e+03 6.0e+02 1.6e+03  1  0 14  5  3   2  0 15  5  4     0
MatAssemblyEnd      1147 1.3 3.7627e-02 1.3 0.00e+00 0.0 8.6e+03 1.0e+01 2.2e+03  0  0 14  0  5   1  0 15  0  6     0
MatGetValues          36 1.0 6.4850e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRow          59940 1.3 4.7204e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult            81 1.3 2.7820e-02 1.3 3.59e+05 1.3 2.5e+03 4.0e+00 1.1e+03  0  0  4  0  2   1  0  4  0  3   199
MatMatMultSym         81 1.3 2.4759e-02 1.3 0.00e+00 0.0 2.5e+03 4.0e+00 9.7e+02  0  0  4  0  2   0  0  4  0  3     0
MatMatMultNum         81 1.3 2.9891e-03 1.3 3.59e+05 1.3 0.0e+00 0.0e+00 1.4e+02  0  0  0  0  0   0  0  0  0  0  1856
MatGetLocalMat       162 1.3 6.6185e-03 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol        162 1.3 1.1911e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog      4876 1.0 7.8934e-02 1.5 1.21e+08 1.7 0.0e+00 0.0e+00 4.9e+03  0 51  0  0 11   1 52  0  0 13 23219
KSPSetUp               2 1.0 8.1062e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve             163 1.3 2.2359e-01 1.0 2.24e+08 1.6 0.0e+00 0.0e+00 1.6e+04  1 97  0 94 35   4 97  0 94 41 15460
PCSetUp                2 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             5369 1.0 2.6193e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetGraph            84 1.3 3.3445e-0332.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceBegin         84 1.3 4.1623e-03 4.7 0.00e+00 0.0 4.5e+01 7.0e+01 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceEnd           84 1.3 7.8917e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSided         84 1.3 3.7937e-03 7.3 0.00e+00 0.0 9.0e+00 4.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Index Set  1066            656       513568     0.
   IS L to G Mapping   328              0            0     0.
              Vector   738              0            0     0.
      Vector Scatter   328              0            0     0.
              Viewer     1              0            0     0.

--- Event Stage 1: activation

              Vector     8              2         7784     0.

--- Event Stage 2: activation_rhs


--- Event Stage 3: run

           Index Set  8417           5402      4224128     0.
   IS L to G Mapping  4015            661      2363088     0.
              Vector  6616            817      2765416     0.
      Vector Scatter  2700              0            0     0.
              Matrix  2458            661      1824360     0.
   Matrix Null Space     1              0            0     0.
       Krylov Solver     3              0            0     0.
      Preconditioner     3              0            0     0.
Star Forest Bipartite Graph    84             84        71232     0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 6.19888e-06
Average time for zero size MPI_Send(): 5.54985e-06
#PETSc Option Table entries:
--cellml_file ../../../input/hodgkin_huxley_1952.c
--diffusion_solver_maxit 5
--disable_firing_output
--dt_0D 1e-3
--dt_1D 2e-3
--dt_3D 4e-3
--dt_splitting 2e-3
--emg_initial_guess_nonzero
--emg_preconditioner_type none
--emg_solver_maxit 3
--emg_solver_type cg
--end_time 4e-3
--fiber_distribution_file ../../../input/MU_fibre_distribution_3780.txt
--fiber_file ../../../input/25x25fibers.bin
--firing_times_file ../../../input/MU_firing_times_real.txt
--n_subdomains 3
--scenario_name weak_scaling_3_3_2_
-on_error_abort
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-x=0 --with-ssl=0 --with-shared-libraries=0 --with-dependencies=0 --with-mpi-lib="[]" --with-mpi-include="[]" --with-blas-lapack-lib="-L/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/lib -lsci_gnu_mp" --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-superlu-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsuperlu" --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-superlu_dist-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsuperlu_dist" --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-parmetis-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lparmetis" --with-metis=1 --with-metis-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-metis-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lmetis" --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-ptscotch-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/lib -lsci_gnu_mpi_mp -lsci_gnu_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-mumps-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lesmumps -lpord" --with-hdf5=1 --with-hdf5-include=/opt/cray/hdf5-parallel/1.10.1.1/GNU/5.1/include --with-hdf5-lib="-L/opt/cray/hdf5-parallel/1.10.1.1/GNU/5.1/lib -lhdf5_parallel -lz -ldl" --CFLAGS="-march=haswell -fopenmp -O3 -ffast-math  -fPIC" --CPPFLAGS= --CXXFLAGS="-march=haswell -fopenmp -O3 -ffast-math   -fPIC" --FFLAGS="-march=haswell -fopenmp -O3 -ffast-math   -fPIC" --LIBS= --CXX_LINKER_FLAGS= --PETSC_ARCH=haswell --prefix=/opt/cray/pe/petsc/3.7.6.2/real/GNU/5.3/haswell --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-hypre-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lHYPRE" --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-sundials-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"








---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

... on a haswell named nid04454 with 450 processors, by me Fri Jan 24 14:10:16 2020
Using Petsc Release Version 3.7.6, Apr, 24, 2017 

                         Max       Max/Min        Avg      Total 
Time (sec):           3.901e+01      1.00001   3.901e+01
Objects:              2.122e+04      1.29938   1.757e+04
Flops:                3.160e+08      1.15372   3.020e+08  1.359e+11
Flops/sec:            8.100e+06      1.15373   7.742e+06  3.484e+09
MPI Messages:         3.364e+03      1.40029   2.820e+03  1.269e+06
MPI Message Lengths:  1.516e+07      2.22510   4.830e+03  6.131e+09
MPI Reductions:       5.445e+04      1.11343

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.8516e+01  47.5%  0.0000e+00   0.0%  1.116e+05   8.8%  1.842e+00        0.0%  1.941e+03   3.6% 
 1:      activation: 7.6666e-03   0.0%  3.2847e+07   0.0%  0.000e+00   0.0%  2.268e+00        0.0%  2.200e+01   0.0% 
 2:  activation_rhs: 7.2331e-06   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0% 
 3:             run: 2.0490e+01  52.5%  1.3588e+11 100.0%  1.158e+06  91.2%  4.826e+03       99.9%  4.834e+04  88.8% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: activation

VecTDot                7 1.0 1.6212e-04 1.3 3.58e+03 1.1 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  0   2  5  0  0 32  9618
VecNorm                5 1.0 1.7476e-04 2.0 2.56e+03 1.1 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0   2  3  0  0 23  6386
VecCopy                6 1.0 5.2452e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                6 1.0 1.0252e-05 2.1 3.07e+03 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  4  0  0  0 130628
VecAYPX                3 1.0 3.3379e-06 0.0 1.28e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  2  0  0  0 167173
VecScatterBegin        5 1.0 6.8860e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0  90  0  0100 23     0
VecScatterEnd          5 1.0 1.3351e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult                5 1.0 7.0133e-03 1.0 6.64e+04 1.3 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0  91 86  0100 23  4032
KSPSetUp               1 1.0 3.0994e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 1.8902e-03 1.0 6.36e+04 1.2 0.0e+00 0.0e+00 2.1e+01  0  0  0  0  0  24 83  0 80 95 14386
PCSetUp                1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply                5 1.0 7.1526e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 2: activation_rhs


--- Event Stage 3: run

VecMDot            10000 1.0 3.0873e-01 1.3 7.92e+07 1.1 0.0e+00 0.0e+00 1.0e+04  1 25  0  0 18   1 25  0  0 21 111760
VecTDot              292 1.5 5.1212e-04 1.7 4.32e+05 1.5 0.0e+00 0.0e+00 2.2e+02  0  0  0  0  0   0  0  0  0  0 281566
VecNorm            10608 1.0 3.9872e-01 1.8 5.70e+06 1.1 0.0e+00 0.0e+00 1.1e+04  1  2  0  0 19   1  2  0  0 22  6142
VecScale           10334 1.0 1.3559e-02 2.0 2.65e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 85057
VecCopy            11327 1.0 3.6891e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet              2869 1.3 2.0351e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1858 1.2 5.2643e-03 1.6 5.51e+06 1.3 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 389840
VecAYPX               18 0.0 8.8215e-06 0.0 2.66e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 205353
VecMAXPY           10334 1.0 1.4979e-02 1.2 8.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0 27  0  0  0   0 27  0  0  0 2457210
VecScatterBegin    10962 1.0 4.9950e+00 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+04 12  0  0 97 20  23  0  0 97 22     0
VecScatterEnd      10962 1.0 8.7821e-03 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 2.4845e-01 1.1 7.94e+06 1.1 0.0e+00 0.0e+00 1.0e+04  1  3  0  0 19   1  3  0  0 21 13925
MatMult            10479 1.0 2.6717e+00 1.0 1.38e+08 1.3 0.0e+00 0.0e+00 1.0e+04  7 43  0 97 19  13 43  0 97 22 21938
MatScale              65 1.3 2.6226e-03 1.2 1.49e+05 1.3 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 21205
MatAssemblyBegin     909 1.3 1.4655e-01 9.1 0.00e+00 0.0 1.8e+05 8.5e+02 1.2e+03  0  0 14  3  2   0  0 16  3  2     0
MatAssemblyEnd       909 1.3 3.5992e-02 1.7 0.00e+00 0.0 1.8e+05 1.3e+01 1.7e+03  0  0 15  0  3   0  0 16  0  4     0
MatGetValues          36 1.4 9.4175e-05 3.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRow          47360 1.3 4.0329e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult            64 1.3 2.5717e-02 1.6 2.84e+05 1.3 4.8e+04 4.0e+00 8.4e+02  0  0  4  0  2   0  0  4  0  2  4101
MatMatMultSym         64 1.3 2.2983e-02 1.6 0.00e+00 0.0 4.8e+04 4.0e+00 7.4e+02  0  0  4  0  1   0  0  4  0  2     0
MatMatMultNum         64 1.3 2.5036e-03 1.4 2.84e+05 1.3 0.0e+00 0.0e+00 1.1e+02  0  0  0  0  0   0  0  0  0  0 42121
MatGetLocalMat       128 1.3 6.2647e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol        128 1.3 9.9993e-04 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 3.2691e-01 1.2 1.58e+08 1.1 0.0e+00 0.0e+00 1.0e+04  1 51  0  0 18   1 51  0  0 21 211305
KSPSetUp               2 1.0 6.5250e-03181.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve             129 1.3 3.3797e+00 1.1 3.11e+08 1.2 0.0e+00 0.0e+00 3.1e+04  8 99  0 97 57  16 99  0 97 65 39637
PCSetUp                2 1.0 2.1458e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10608 1.0 6.7325e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetGraph            67 1.3 1.2636e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceBegin         67 1.3 1.2133e-03 1.6 0.00e+00 0.0 1.1e+03 5.2e+01 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceEnd           67 1.3 8.5115e-05 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSided         67 1.3 8.2541e-04 1.7 0.00e+00 0.0 2.2e+02 4.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Index Set   845            520       407104     0.
   IS L to G Mapping   260              0            0     0.
              Vector   585              0            0     0.
      Vector Scatter   260              0            0     0.
              Viewer     1              0            0     0.

--- Event Stage 1: activation

              Vector     8              2         6376     0.

--- Event Stage 2: activation_rhs


--- Event Stage 3: run

           Index Set  6666           4280      3346816     0.
   IS L to G Mapping  3182            525      1865968     0.
              Vector  5256            647      2184824     0.
      Vector Scatter  2139              0            0     0.
              Matrix  1948            525      1449000     0.
   Matrix Null Space     1              0            0     0.
       Krylov Solver     3              0            0     0.
      Preconditioner     3              0            0     0.
Star Forest Bipartite Graph    67             67        56816     0.
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 1.67847e-05
Average time for zero size MPI_Send(): 5.97318e-06
#PETSc Option Table entries:
--cellml_file ../../../input/hodgkin_huxley_1952.c
--diffusion_solver_maxit 5
--disable_firing_output
--dt_0D 1e-3
--dt_1D 2e-3
--dt_3D 4e-3
--dt_splitting 2e-3
--emg_initial_guess_nonzero
--emg_preconditioner_type none
--emg_solver_maxit 3
--emg_solver_type cg
--end_time 4e-3
--fiber_distribution_file ../../../input/MU_fibre_distribution_3780.txt
--fiber_file ../../../input/109x109fibers.bin
--firing_times_file ../../../input/MU_firing_times_real.txt
--n_subdomains 15
--scenario_name weak_scaling_15_15_2
-on_error_abort
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-x=0 --with-ssl=0 --with-shared-libraries=0 --with-dependencies=0 --with-mpi-lib="[]" --with-mpi-include="[]" --with-blas-lapack-lib="-L/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/lib -lsci_gnu_mp" --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-superlu-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsuperlu" --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-superlu_dist-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsuperlu_dist" --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-parmetis-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lparmetis" --with-metis=1 --with-metis-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-metis-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lmetis" --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-ptscotch-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/lib -lsci_gnu_mpi_mp -lsci_gnu_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-mumps-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lesmumps -lpord" --with-hdf5=1 --with-hdf5-include=/opt/cray/hdf5-parallel/1.10.1.1/GNU/5.1/include --with-hdf5-lib="-L/opt/cray/hdf5-parallel/1.10.1.1/GNU/5.1/lib -lhdf5_parallel -lz -ldl" --CFLAGS="-march=haswell -fopenmp -O3 -ffast-math  -fPIC" --CPPFLAGS= --CXXFLAGS="-march=haswell -fopenmp -O3 -ffast-math   -fPIC" --FFLAGS="-march=haswell -fopenmp -O3 -ffast-math   -fPIC" --LIBS= --CXX_LINKER_FLAGS= --PETSC_ARCH=haswell --prefix=/opt/cray/pe/petsc/3.7.6.2/real/GNU/5.3/haswell --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-hypre-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lHYPRE" --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-sundials-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"








---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

... on a haswell named nid03252 with 2738 processors, by me Fri Jan 24 15:32:45 2020
Using Petsc Release Version 3.7.6, Apr, 24, 2017 

                         Max       Max/Min        Avg      Total 
Time (sec):           1.215e+02      1.00001   1.215e+02
Objects:              2.122e+04      1.29938   1.863e+04
Flops:                3.162e+08      1.15455   3.042e+08  8.328e+11
Flops/sec:            2.603e+06      1.15455   2.504e+06  6.856e+09
MPI Messages:         3.364e+03      1.40029   2.990e+03  8.185e+06
MPI Message Lengths:  1.516e+07      2.22511   4.777e+03  3.910e+10
MPI Reductions:       5.447e+04      1.11392

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 6.8289e+01  56.2%  0.0000e+00   0.0%  7.198e+05   8.8%  1.822e+00        0.0%  2.058e+03   3.8% 
 1:      activation: 1.1864e-02   0.0%  2.0221e+08   0.0%  0.000e+00   0.0%  2.243e+00        0.0%  2.200e+01   0.0% 
 2:  activation_rhs: 7.2035e-06   0.0%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0% 
 3:             run: 5.3171e+01  43.8%  8.3258e+11 100.0%  7.466e+06  91.2%  4.773e+03       99.9%  4.941e+04  90.8% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage


--- Event Stage 1: activation

VecTDot                7 1.0 3.8028e-04 1.3 3.58e+03 1.1 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  0   3  5  0  0 32 24948
VecNorm                5 1.0 3.2425e-04 1.6 2.56e+03 1.1 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0   2  3  0  0 23 20941
VecCopy                6 1.0 1.0967e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                6 1.0 1.4067e-05 3.7 3.07e+03 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  4  0  0  0 579261
VecAYPX                3 1.0 8.8215e-06 0.0 1.28e+03 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  2  0  0  0 384869
VecScatterBegin        5 1.0 1.0569e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0  89  0  0100 23     0
VecScatterEnd          5 1.0 2.9087e-0530.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult                5 1.0 1.0730e-02 1.0 6.64e+04 1.3 0.0e+00 0.0e+00 5.0e+00  0  0  0  0  0  90 86  0100 23 16252
KSPSetUp               1 1.0 5.1975e-05 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 5.2700e-03 1.0 6.36e+04 1.2 0.0e+00 0.0e+00 2.1e+01  0  0  0  0  0  44 83  0 80 95 31751
PCSetUp                1 1.0 1.1921e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply                5 1.0 1.1921e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 2: activation_rhs


--- Event Stage 3: run

VecMDot            10000 1.0 6.0100e-01 1.1 7.92e+07 1.1 0.0e+00 0.0e+00 1.0e+04  0 25  0  0 18   1 25  0  0 20 349316
VecTDot              304 1.6 5.8770e-04 3.2 4.50e+05 1.6 0.0e+00 0.0e+00 2.3e+02  0  0  0  0  0   0  0  0  0  0 1584199
VecNorm            10614 1.0 6.8425e-01 1.5 5.71e+06 1.1 0.0e+00 0.0e+00 1.1e+04  0  2  0  0 19   1  2  0  0 21 21855
VecScale           10334 1.0 1.4122e-02 2.1 2.65e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 496878
VecCopy            11333 1.0 4.0164e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet              2869 1.3 2.1954e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1870 1.2 6.5804e-03 1.7 5.53e+06 1.3 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 2005584
VecAYPX               24 0.0 1.5497e-05 0.0 3.55e+04 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 748346
VecMAXPY           10334 1.0 1.5738e-02 1.3 8.44e+07 1.1 0.0e+00 0.0e+00 0.0e+00  0 27  0  0  0   0 27  0  0  0 14229163
VecScatterBegin    10968 1.0 2.1788e+01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+04 17  0  0 97 20  40  0  0 97 22     0
VecScatterEnd      10968 1.0 9.3315e-03 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 4.8330e-01 1.0 7.94e+06 1.1 0.0e+00 0.0e+00 1.0e+04  0  3  0  0 19   1  3  0  0 21 43557
MatMult            10485 1.0 1.0264e+01 1.0 1.38e+08 1.3 0.0e+00 0.0e+00 1.0e+04  8 43  0 97 19  19 43  0 97 21 35225
MatScale              65 1.3 2.8410e-03 1.3 1.49e+05 1.3 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 126118
MatAssemblyBegin     909 1.3 1.6064e-0114.3 0.00e+00 0.0 1.2e+06 8.4e+02 1.3e+03  0  0 14  3  2   0  0 16  3  3     0
MatAssemblyEnd       909 1.3 3.2645e-02 1.5 0.00e+00 0.0 1.2e+06 1.3e+01 1.8e+03  0  0 15  0  3   0  0 16  0  4     0
MatGetValues          36 1.4 1.0395e-04 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRow          47360 1.3 4.7610e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult            64 1.3 2.9266e-02 1.9 2.84e+05 1.3 3.1e+05 4.0e+00 9.0e+02  0  0  4  0  2   0  0  4  0  2 23271
MatMatMultSym         64 1.3 2.6498e-02 2.0 0.00e+00 0.0 3.1e+05 4.0e+00 7.8e+02  0  0  4  0  1   0  0  4  0  2     0
MatMatMultNum         64 1.3 2.5842e-03 1.7 2.84e+05 1.3 0.0e+00 0.0e+00 1.1e+02  0  0  0  0  0   0  0  0  0  0 263541
MatGetLocalMat       128 1.3 5.5459e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol        128 1.3 1.0481e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 6.1988e-01 1.1 1.58e+08 1.1 0.0e+00 0.0e+00 1.0e+04  0 50  0  0 18   1 50  0  0 20 678034
KSPSetUp               2 1.0 9.9897e-05 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve             129 1.3 1.1538e+01 1.0 3.11e+08 1.2 0.0e+00 0.0e+00 3.1e+04  9 98  0 97 57  21 99  0 97 63 71082
PCSetUp                2 1.0 7.8678e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10614 1.0 6.8228e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFSetGraph            67 1.3 1.9526e-04 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceBegin         67 1.3 1.5805e-03 1.8 0.00e+00 0.0 6.8e+03 5.2e+01 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFReduceEnd           67 1.3 9.5844e-05 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
BuildTwoSided         67 1.3 1.1051e-03 1.8 0.00e+00 0.0 1.4e+03 4.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Index Set   845            520       407104     0.
   IS L to G Mapping   260              0            0     0.
              Vector   585              0            0     0.
      Vector Scatter   260              0            0     0.
              Viewer     1              0            0     0.

--- Event Stage 1: activation

              Vector     8              2         6376     0.

--- Event Stage 2: activation_rhs


--- Event Stage 3: run

           Index Set  6666           4280      3346816     0.
   IS L to G Mapping  3182            525      1865968     0.
              Vector  5256            647      2184824     0.
      Vector Scatter  2139              0            0     0.
              Matrix  1948            525      1449000     0.
   Matrix Null Space     1              0            0     0.
       Krylov Solver     3              0            0     0.
      Preconditioner     3              0            0     0.
Star Forest Bipartite Graph    67             67        56816     0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 3.70026e-05
Average time for zero size MPI_Send(): 6.16989e-06
#PETSc Option Table entries:
--cellml_file ../../../input/hodgkin_huxley_1952.c
--diffusion_solver_maxit 5
--disable_firing_output
--dt_0D 1e-3
--dt_1D 2e-3
--dt_3D 4e-3
--dt_splitting 2e-3
--emg_initial_guess_nonzero
--emg_preconditioner_type none
--emg_solver_maxit 3
--emg_solver_type cg
--end_time 4e-3
--fiber_distribution_file ../../../input/MU_fibre_distribution_3780.txt
--fiber_file ../../../input/277x277fibers.bin
--firing_times_file ../../../input/MU_firing_times_real.txt
--n_subdomains 37
--scenario_name weak_scaling_37_37_2
-on_error_abort
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-has-attribute-aligned=1 --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mpi-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with-fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 --with-dependencies=0 --with-x=0 --with-ssl=0 --with-shared-libraries=0 --with-dependencies=0 --with-mpi-lib="[]" --with-mpi-include="[]" --with-blas-lapack-lib="-L/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/lib -lsci_gnu_mp" --with-superlu=1 --with-superlu-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-superlu-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsuperlu" --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-superlu_dist-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsuperlu_dist" --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-parmetis-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lparmetis" --with-metis=1 --with-metis-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-metis-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lmetis" --with-ptscotch=1 --with-ptscotch-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-ptscotch-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.2.0/GNU/5.1/x86_64/lib -lsci_gnu_mpi_mp -lsci_gnu_mp" --with-mumps=1 --with-mumps-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-mumps-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lesmumps -lpord" --with-hdf5=1 --with-hdf5-include=/opt/cray/hdf5-parallel/1.10.1.1/GNU/5.1/include --with-hdf5-lib="-L/opt/cray/hdf5-parallel/1.10.1.1/GNU/5.1/lib -lhdf5_parallel -lz -ldl" --CFLAGS="-march=haswell -fopenmp -O3 -ffast-math  -fPIC" --CPPFLAGS= --CXXFLAGS="-march=haswell -fopenmp -O3 -ffast-math   -fPIC" --FFLAGS="-march=haswell -fopenmp -O3 -ffast-math   -fPIC" --LIBS= --CXX_LINKER_FLAGS= --PETSC_ARCH=haswell --prefix=/opt/cray/pe/petsc/3.7.6.2/real/GNU/5.3/haswell --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-hypre-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lHYPRE" --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/include --with-sundials-lib="-L/opt/cray/tpsl/17.11.1/GNU/5.1/haswell/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial"
  
