Re: [petsc-users] Prometheus vs GAMG for elasticity/plasticity problems

Thomas Gross Mon, 27 Jan 2014 08:41:06 -0800

Please find enclosed the log summary for the runs.

I used the following options for Prometheus and GAMG:
Prometheus:
-ksp_type cg -pc_type prometheus -log_summary -ksp_monitor -ksp_view -aggmg_smooths 1 -options_left
GAMG:
-ksp_type cg -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -log_summary -ksp_monitor -ksp_view -options_left


> Make sure to use the same smoother you used with Prometheus.
How can I make sure that the same smoother is applied in both cases?

Many thanks and best regards,
Thomas
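
As a rough way to compare the two attached logs, the -ksp_monitor output can be reduced to an iteration count and an average residual reduction factor per iteration. The following is only a sketch: the helper name summarize_ksp_monitor and the sample strings are illustrative (they are not part of FEAP or PETSc), and it assumes the text passed in holds the monitor lines of a single solve, formatted as in the logs. The sample strings contain the first and last monitor lines of the first solve in each log.

```python
import re

# Matches -ksp_monitor lines such as "  3 KSP Residual norm 3.704652302293e-04".
MONITOR_LINE = re.compile(r"^\s*(\d+)\s+KSP Residual norm\s+([0-9.eE+-]+)\s*$")


def summarize_ksp_monitor(log_text):
    """Return (iterations, mean residual reduction factor per iteration).

    Hypothetical helper: assumes log_text holds the monitor lines of ONE solve.
    """
    history = []
    for line in log_text.splitlines():
        match = MONITOR_LINE.match(line)
        if match:
            history.append((int(match.group(1)), float(match.group(2))))
    if len(history) < 2:
        raise ValueError("need at least two monitor lines")
    (first_it, first_norm), (last_it, last_norm) = history[0], history[-1]
    iterations = last_it - first_it
    if iterations <= 0:
        raise ValueError("iteration numbers must increase within one solve")
    # Geometric mean of the per-iteration reduction in the residual norm.
    factor = (last_norm / first_norm) ** (1.0 / iterations)
    return iterations, factor


# First and last monitor lines of the first solve in each log.
PROMETHEUS = """  0 KSP Residual norm 1.444126847260e-01
 13 KSP Residual norm 1.048189477307e-09
"""
GAMG = """  0 KSP Residual norm 1.174597003965e-01
  8 KSP Residual norm 8.185630477236e-10
"""

for name, text in (("prometheus", PROMETHEUS), ("gamg", GAMG)):
    its, factor = summarize_ksp_monitor(text)
    print(f"{name}: {its} iterations, mean reduction factor {factor:.3f}")
```

This compares only the Krylov convergence of the two preconditioners; setup and solve times still have to be read from the -log_summary tables.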


    F I N I T E   E L E M E N T   A N A L Y S I S   P R O G R A M

           FEAP (C) Regents of the University of California
                         All Rights Reserved.
                       VERSION: Release 8.3.19      
                          DATE: 29 March 2011       

         Files are set as:   Status    Filename

           Input   (read ) : Exists  Icube_0001                      
           Output  (write) : Exists  Ocube_0001                      
           Restart (read ) : New     Rcube_0001                      
           Restart (write) : New     Rcube_0001                      
           Plots   (write) : New     Pcube_0001                      

         Caution, existing write files will be overwritten.

         Are filenames correct? ( y or n; s = stop) :
         R U N N I N G    F E A P    P R O B L E M    N O W

          --> Please report errors by e-mail to:
              [email protected] 

  0 KSP Residual norm 1.444126847260e-01 
  1 KSP Residual norm 5.357525404213e-03 
  2 KSP Residual norm 1.471040678379e-03 
  3 KSP Residual norm 3.704652302293e-04 
  4 KSP Residual norm 9.809180893460e-05 
  5 KSP Residual norm 3.175497350277e-05 
  6 KSP Residual norm 8.859979496890e-06 
  7 KSP Residual norm 2.071384344082e-06 
  8 KSP Residual norm 5.035483717523e-07 
  9 KSP Residual norm 1.516500637412e-07 
 10 KSP Residual norm 5.134577847338e-08 
 11 KSP Residual norm 1.270806138401e-08 
 12 KSP Residual norm 3.074793756862e-09 
 13 KSP Residual norm 1.048189477307e-09 
KSP Object: 2 MPI processes
  type: cg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-08, absolute=1e-16, divergence=1e+16
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: prometheus
  linear system matrix = precond matrix:
  Matrix Object:   2 MPI processes  
    type: mpibaij
    rows=2013, cols=2013
    total: nonzeros=100899, allocated nonzeros=100899
    total number of mallocs used during MatSetValues calls =0
        block size is 3
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr2/tgross/parFEAP/parFEAP83_mod/ver83/parfeap/feap on a linux-gnu named ilfb35.ilsb.tuwien.ac.at with 2 processors, by tgross Mon Jan 27 17:11:10 2014
Using Petsc Release Version 3.2.0, Patch 7, Thu Mar 15 09:30:51 CDT 2012 

                         Max       Max/Min        Avg      Total 
Time (sec):           1.379e-01      1.00662   1.374e-01
Objects:              6.800e+01      1.00000   6.800e+01
Flops:                4.104e+07      1.16967   3.807e+07  7.613e+07
Flops/sec:            2.997e+08      1.17742   2.771e+08  5.542e+08
MPI Messages:         1.515e+02      1.00664   1.510e+02  3.020e+02
MPI Message Lengths:  4.012e+05      1.02186   2.628e+03  7.938e+05
MPI Reductions:       2.170e+02      1.01402

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.3742e-01 100.0%  7.6132e+07 100.0%  3.020e+02 100.0%  2.628e+03      100.0%  2.145e+02  98.8% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult               50 1.0 4.6628e-03 1.6 5.02e+06 1.0 1.0e+02 1.4e+03 0.0e+00  3 13 33 18  0   3 13 33 18  0  2142
MatMultAdd            14 1.0 1.0583e-03 2.3 4.16e+05 1.1 1.4e+01 1.4e+03 0.0e+00  1  1  5  2  0   1  1  5  2  0   762
MatMultTranspose      14 1.0 5.5742e-04 1.2 4.16e+05 1.1 1.4e+01 1.4e+03 0.0e+00  0  1  5  2  0   0  1  5  2  0  1446
MatSolve              14 0.0 3.0231e-04 0.0 9.14e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  3023
MatLUFactorSym         1 1.0 4.2915e-05 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  1   0  0  0  0  1     0
MatLUFactorNum         1 1.0 1.9600e-03483.6 3.25e+06 0.0 0.0e+00 0.0e+00 0.0e+00  1  4  0  0  0   1  4  0  0  0  1659
MatAssemblyBegin       9 1.0 3.3593e-04 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+01  0  0  0  0 10   0  0  0  0 10     0
MatAssemblyEnd         9 1.0 8.5688e-04 1.0 0.00e+00 0.0 6.0e+00 2.3e+02 2.6e+01  1  0  2  0 12   1  0  2  0 12     0
MatGetRow            336 1.0 1.1373e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            1 0.0 7.1526e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice       1 1.0 1.3590e-04 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  3   0  0  0  0  3     0
MatGetOrdering         1 0.0 3.0041e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         2 1.0 4.2915e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView                1 1.0 4.0054e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot                19 1.0 1.2422e-04 1.8 3.83e+04 1.0 0.0e+00 0.0e+00 1.9e+01  0  0  0  0  9   0  0  0  0  9   616
VecTDot               26 1.0 2.5082e-04 2.7 5.24e+04 1.0 0.0e+00 0.0e+00 2.6e+01  0  0  0  0 12   0  0  0  0 12   417
VecNorm               18 1.0 3.7885e-04 1.3 3.02e+04 1.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  8   0  0  0  0  8   159
VecScale              28 1.0 4.0770e-05 1.2 2.82e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1383
VecCopy               88 1.0 5.5552e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               145 1.0 6.1274e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               58 1.0 7.9393e-05 1.1 1.17e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2941
VecAYPX               48 1.0 8.5354e-05 1.0 6.85e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1604
VecAssemblyBegin       3 1.0 2.4080e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  4   0  0  0  0  4     0
VecAssemblyEnd         3 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      106 1.0 1.4353e-04 1.3 0.00e+00 0.0 1.3e+02 1.4e+03 0.0e+00  0  0 42 22  0   0  0 42 22  0     0
VecScatterEnd        106 1.0 2.8653e-0320.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
KSPSetup               2 1.0 2.0027e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 4.6679e-02 1.0 4.10e+07 1.2 3.0e+02 2.7e+03 1.9e+02 34100 99100 86  34100 99100 87  1631
PCSetUp                1 1.0 4.1773e-02 1.1 3.50e+07 1.2 1.9e+02 3.4e+03 1.5e+02 30 85 62 81 68  30 85 62 81 69  1554
PCSetUpOnBlocks        1 1.0 2.0559e-0351.3 3.25e+06 0.0 0.0e+00 0.0e+00 5.0e+00  1  4  0  0  2   1  4  0  0  2  1582
PCApply               14 1.0 5.5327e-03 1.5 4.56e+06 1.2 8.4e+01 1.4e+03 0.0e+00  3 11 28 15  0   3 11 28 15  0  1498
FEI: init. str.        1 1.0 1.9560e-03 1.0 0.00e+00 0.0 4.0e+00 4.7e+03 9.0e+00  1  0  1  2  4   1  0  1  2  4     0
FEI: Prom setup        4 1.0 1.1263e-02 1.0 0.00e+00 0.0 1.6e+02 1.8e+03 2.2e+01  8  0 53 36 10   8  0 53 36 10     0
FEI: solv.setup        3 1.0 2.8233e-02 1.1 3.50e+07 1.2 2.3e+01 1.4e+04 1.1e+02 20 85  8 42 51  20 85  8 42 52  2300
 FEI: BCs & reg.       1 1.0 6.1989e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  3   0  0  0  0  3     0
 FEI: MakeRest.        1 1.0 2.5464e-02 1.0 3.18e+07 1.1 2.3e+01 1.4e+04 6.7e+01 19 81  8 42 31  19 81  8 42 31  2422
FEI: SLESSolve*3      14 1.0 5.5134e-03 1.6 4.56e+06 1.2 8.4e+01 1.4e+03 0.0e+00  3 11 28 15  0   3 11 28 15  0  1503
Fine grid      1       1 1.0 1.5974e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  1   0  0  0  0  1     0
MG++(new grid) 1       1 1.0 3.8149e-03 1.0 0.00e+00 0.0 1.1e+01 1.4e+04 1.0e+00  3  0  4 19  0   3  0  4 19  0     0
RAP            2       1 1.0 2.1625e-02 1.0 2.99e+07 1.1 1.0e+00 1.9e+05 1.6e+01 16 76  0 24  7  16 76  0 24  7  2682
SLES setup    *2       1 1.0 2.5468e-02 1.0 3.18e+07 1.1 2.3e+01 1.4e+04 6.7e+01 19 81  8 42 31  19 81  8 42 31  2421
Prometheus    *1       5 1.0 1.3446e-02 1.0 0.00e+00 0.0 1.6e+02 1.9e+03 3.2e+01 10  0 54 39 15  10  0 54 39 15     0
Coarse Grid Solv      14 1.0 4.7636e-04 3.7 9.14e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1918
Grid coarsen   1       1 1.0 4.2310e-03 1.0 0.00e+00 0.0 1.2e+02 7.5e+02 1.1e+01  3  0 41 12  5   3  0 41 12  5     0
New Coarse     1       1 1.0 3.0611e-03 1.0 0.00e+00 0.0 2.3e+01 1.7e+03 6.5e+00  2  0  8  5  3   2  0  8  5  3     0
SmoothP        2       1 1.0 2.2371e-03 1.0 1.83e+06 1.0 2.0e+01 7.2e+03 2.8e+01  2  5  7 18 13   2  5  7 18 13  1634
CG Est. lamb.1 2       1 1.0 1.0400e-03 1.0 1.00e+06 1.0 1.8e+01 1.4e+03 2.8e+01  1  3  6  3 13   1  3  6  3 13  1915
AP_0           2       1 1.0 9.2292e-04 1.3 7.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00  1  2  0  0  0   1  2  0  0  0  1714
 RAP core      2       1 1.0 2.1017e-02 1.2 2.99e+07 1.1 0.0e+00 0.0e+00 0.0e+00 14 76  0  0  0  14 76  0  0  0  2760
 RAP my assem. 2       1 1.0 4.2980e-0310.2 0.00e+00 0.0 1.0e+00 1.9e+05 1.0e+00  2  0  0 24  0   2  0  0 24  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix    11              8      1241380     0
              Vector    33             32       228544     0
      Vector Scatter     4              3         3180     0
           Index Set    14             14        11872     0
       Krylov Solver     2              2         2224     0
      Preconditioner     3              3         2896     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 4.29153e-07
Average time for zero size MPI_Send(): 1.07288e-06
#PETSc Option Table entries:
-aggmg_smooths 1
-ksp_monitor
-ksp_type cg
-ksp_view
-log_summary
-mat_inode_limit 3
-options_left
-pc_type prometheus
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Tue Feb 12 09:44:26 2013
Configure options: --with-mpi-dir=/usr/local/openmpi/1.5.4/gcc/x86_64 
--download-f-blas-lapack=1 --download-parmetis=1 --download-prometheus=1 
--with-debugging=0 --with-shared-libraries=0 --download-spooles=1 
--download-hypre=1 --download-superlu_dist=1
-----------------------------------------
Libraries compiled on Tue Feb 12 09:44:26 2013 on ilfb46.ilsb.tuwien.ac.at 
Machine characteristics: Linux-2.6.32-279.5.1.el6.x86_64-x86_64-with-redhat-6.3-Carbon
Using PETSc directory: /usr2/pahr/software/feap/petsc-3.2-p7
Using PETSc arch: linux-gnu-c
-----------------------------------------

Using C compiler: /usr/local/openmpi/1.5.4/gcc/x86_64/bin/mpicc  -Wall 
-Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O  ${COPTFLAGS} 
${CFLAGS}
Using Fortran compiler: /usr/local/openmpi/1.5.4/gcc/x86_64/bin/mpif90  -Wall 
-Wno-unused-variable -O   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: 
-I/usr2/pahr/software/feap/petsc-3.2-p7/linux-gnu-c/include 
-I/usr2/pahr/software/feap/petsc-3.2-p7/include 
-I/usr2/pahr/software/feap/petsc-3.2-p7/include 
-I/usr2/pahr/software/feap/petsc-3.2-p7/linux-gnu-c/include 
-I/usr/local/openmpi/1.5.4/gcc/x86_64/include -I/usr/local/include
-----------------------------------------

Using C linker: /usr/local/openmpi/1.5.4/gcc/x86_64/bin/mpicc
Using Fortran linker: /usr/local/openmpi/1.5.4/gcc/x86_64/bin/mpif90
Using libraries: 
-Wl,-rpath,/usr2/pahr/software/feap/petsc-3.2-p7/linux-gnu-c/lib 
-L/usr2/pahr/software/feap/petsc-3.2-p7/linux-gnu-c/lib -lpetsc -lX11 -lpthread 
-Wl,-rpath,/usr2/pahr/software/feap/petsc-3.2-p7/linux-gnu-c/lib 
-L/usr2/pahr/software/feap/petsc-3.2-p7/linux-gnu-c/lib -lpromfei -lprometheus 
-lmpi_cxx -lstdc++ -lsuperlu_dist_2.5 -lparmetis -lmetis -lHYPRE -lmpi_cxx 
-lstdc++ -lspooles -lflapack -lfblas -L/usr/local/lib64 
-L/usr/local/lib64/openmpi -L/usr/local/openmpi/1.5.4/gcc/x86_64/lib64 
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -ldl -lmpi -lnsl -lutil -lgcc_s 
-lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ 
-lmpi_cxx -lstdc++ -ldl -lmpi -lnsl -lutil -lgcc_s -lpthread -ldl 
-----------------------------------------

#PETSc Option Table entries:
-aggmg_smooths 1
-ksp_monitor
-ksp_type cg
-ksp_view
-log_summary
-mat_inode_limit 3
-options_left
-pc_type prometheus
#End of PETSc Option Table entries
There are no unused options.


    F I N I T E   E L E M E N T   A N A L Y S I S   P R O G R A M

           FEAP (C) Regents of the University of California
                         All Rights Reserved.
                       VERSION: Release 8.4.1d      
                          DATE: 01 January 2014     

         Files are set as:   Status    Filename

           Input   (read ) : Exists  Icube_0001                      
           Output  (write) : Exists  Ocube_0001                      
           Restart (read ) : New     Rcube_0001                      
           Restart (write) : New     Rcube_0001                      
           Plots   (write) : New     Pcube_0001                      

         Caution, existing write files will be overwritten.

         Are filenames correct?( y or n; r = redefine all, s = stop) :
         R U N N I N G    F E A P    P R O B L E M    N O W

          --> Please report errors by e-mail to:
              [email protected] 

  0 KSP Residual norm 1.174597003965e-01 
  1 KSP Residual norm 1.703870321834e-02 
  2 KSP Residual norm 1.021316498741e-03 
  3 KSP Residual norm 1.918005367975e-04 
  4 KSP Residual norm 1.957699357627e-05 
  5 KSP Residual norm 1.055648440464e-06 
  6 KSP Residual norm 1.802665374208e-07 
  7 KSP Residual norm 1.676093936891e-08 
  8 KSP Residual norm 8.185630477236e-10 
KSP Object: 2 MPI processes
  type: cg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-08, absolute=1e-16, divergence=1e+16
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: gamg
    MG: type is MULTIPLICATIVE, levels=2 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     2 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     2 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 2
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_coarse_sub_)       1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_coarse_sub_)       1 MPI processes
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 5, needed 1.08247
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=96, cols=96, bs=6
                package used to perform factorization: petsc
                total: nonzeros=7560, allocated nonzeros=7560
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 27 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=96, cols=96, bs=6
          total: nonzeros=6984, allocated nonzeros=6984
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 32 nodes, limit used is 5
      linear system matrix = precond matrix:
      Matrix Object:       2 MPI processes
        type: mpiaij
        rows=96, cols=96, bs=6
        total: nonzeros=6984, allocated nonzeros=6984
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 32 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     2 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.170852, max = 3.58789
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     2 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object:       2 MPI processes
        type: mpiaij
        rows=2013, cols=2013, bs=3
        total: nonzeros=100899, allocated nonzeros=100899
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 336 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   2 MPI processes
    type: mpiaij
    rows=2013, cols=2013, bs=3
    total: nonzeros=100899, allocated nonzeros=100899
    total number of mallocs used during MatSetValues calls =0
      using I-node (on process 0) routines: found 336 nodes, limit used is 5
  0 KSP Residual norm 2.786433242749e-05 
  1 KSP Residual norm 2.879155786719e-06 
  2 KSP Residual norm 2.468572476389e-07 
  3 KSP Residual norm 2.819800213284e-08 
  4 KSP Residual norm 2.628207284894e-09 
  5 KSP Residual norm 2.537967773642e-10 
  6 KSP Residual norm 2.702578211041e-11 
  7 KSP Residual norm 2.705257315894e-12 
  8 KSP Residual norm 2.588375739486e-13 
KSP Object: 2 MPI processes
  type: cg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-08, absolute=1e-16, divergence=1e+16
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: gamg
    MG: type is MULTIPLICATIVE, levels=2 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     2 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     2 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 2
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_coarse_sub_)       1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_coarse_sub_)       1 MPI processes
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 5, needed 1.08247
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=96, cols=96, bs=6
                package used to perform factorization: petsc
                total: nonzeros=7560, allocated nonzeros=7560
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 27 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=96, cols=96, bs=6
          total: nonzeros=6984, allocated nonzeros=6984
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 32 nodes, limit used is 5
      linear system matrix = precond matrix:
      Matrix Object:       2 MPI processes
        type: mpiaij
        rows=96, cols=96, bs=6
        total: nonzeros=6984, allocated nonzeros=6984
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 32 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     2 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.170984, max = 3.59065
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     2 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object:       2 MPI processes
        type: mpiaij
        rows=2013, cols=2013, bs=3
        total: nonzeros=100899, allocated nonzeros=100899
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 336 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   2 MPI processes
    type: mpiaij
    rows=2013, cols=2013, bs=3
    total: nonzeros=100899, allocated nonzeros=100899
    total number of mallocs used during MatSetValues calls =0
      using I-node (on process 0) routines: found 336 nodes, limit used is 5
  0 KSP Residual norm 1.567241860739e-10 
  1 KSP Residual norm 1.949797962509e-11 
  2 KSP Residual norm 1.882291311511e-12 
  3 KSP Residual norm 1.985926088767e-13 
  4 KSP Residual norm 1.628297318609e-14 
  5 KSP Residual norm 1.556553520720e-15 
  6 KSP Residual norm 1.773769704074e-16 
  7 KSP Residual norm 1.554166465541e-17 
KSP Object: 2 MPI processes
  type: cg
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-08, absolute=1e-16, divergence=1e+16
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 2 MPI processes
  type: gamg
    MG: type is MULTIPLICATIVE, levels=2 cycles=v
      Cycles per PCApply=1
      Using Galerkin computed coarse grid matrices
  Coarse grid solver -- level -------------------------------
    KSP Object:    (mg_coarse_)     2 MPI processes
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using NONE norm type for convergence test
    PC Object:    (mg_coarse_)     2 MPI processes
      type: bjacobi
        block Jacobi: number of blocks = 2
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object:      (mg_coarse_sub_)       1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
        left preconditioning
        using NONE norm type for convergence test
      PC Object:      (mg_coarse_sub_)       1 MPI processes
        type: lu
          LU: out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 5, needed 1.08247
            Factored matrix follows:
              Matrix Object:               1 MPI processes
                type: seqaij
                rows=96, cols=96, bs=6
                package used to perform factorization: petsc
                total: nonzeros=7560, allocated nonzeros=7560
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 27 nodes, limit used is 5
        linear system matrix = precond matrix:
        Matrix Object:         1 MPI processes
          type: seqaij
          rows=96, cols=96, bs=6
          total: nonzeros=6984, allocated nonzeros=6984
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 32 nodes, limit used is 5
      linear system matrix = precond matrix:
      Matrix Object:       2 MPI processes
        type: mpiaij
        rows=96, cols=96, bs=6
        total: nonzeros=6984, allocated nonzeros=6984
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 32 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object:    (mg_levels_1_)     2 MPI processes
      type: chebyshev
        Chebyshev: eigenvalue estimates:  min = 0.170983, max = 3.59065
      maximum iterations=2
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
      using nonzero initial guess
      using NONE norm type for convergence test
    PC Object:    (mg_levels_1_)     2 MPI processes
      type: jacobi
      linear system matrix = precond matrix:
      Matrix Object:       2 MPI processes
        type: mpiaij
        rows=2013, cols=2013, bs=3
        total: nonzeros=100899, allocated nonzeros=100899
        total number of mallocs used during MatSetValues calls =0
          using I-node (on process 0) routines: found 336 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Matrix Object:   2 MPI processes
    type: mpiaij
    rows=2013, cols=2013, bs=3
    total: nonzeros=100899, allocated nonzeros=100899
    total number of mallocs used during MatSetValues calls =0
      using I-node (on process 0) routines: found 336 nodes, limit used is 5
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr2/tgross/parFEAP/parFEAP84_mod/FEAP84/ver84/parfeap/feap on a linux-gnu-c named ilfb35.ilsb.tuwien.ac.at with 2 processors, by tgross Mon Jan 27 17:10:03 2014
Using Petsc Release Version 3.4.3, Oct, 15, 2013 

                         Max       Max/Min        Avg      Total 
Time (sec):           3.675e-01      1.00024   3.675e-01
Objects:              5.360e+02      1.01132   5.330e+02
Flops:                3.356e+07      1.00065   3.355e+07  6.709e+07
Flops/sec:            9.134e+07      1.00089   9.129e+07  1.826e+08
MPI Messages:         4.320e+02      1.00000   4.320e+02  8.640e+02
MPI Message Lengths:  9.964e+05      1.00000   2.307e+03  1.993e+06
MPI Reductions:       1.269e+03      1.00475

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 3.6745e-01 100.0%  6.7093e+07 100.0%  8.640e+02 100.0%  2.307e+03      100.0%  1.265e+03  99.7% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              209 1.0 2.0997e-02 1.0 2.12e+07 1.0 4.2e+02 1.9e+03 0.0e+00  6 62 48 40  0   6 62 48 40  0  1989
MatMultAdd            26 1.0 3.6705e-03 3.3 7.11e+05 1.0 2.6e+01 6.7e+02 0.0e+00  1  2  3  1  0   1  2  3  1  0   381
MatMultTranspose      26 1.0 1.7848e-03 1.1 7.11e+05 1.0 2.6e+01 6.7e+02 0.0e+00  0  2  3  1  0   0  2  3  1  0   784
MatSolve              26 0.0 2.5368e-04 0.0 3.91e+05 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1540
MatLUFactorSym         3 1.0 6.2466e-04 29.4 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  1   0  0  0  0  1     0
MatLUFactorNum         3 1.0 9.3007e-04 156.0 1.12e+06 0.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  1199
MatScale               9 1.0 3.0088e-04 1.1 1.16e+05 1.0 6.0e+00 6.3e+02 0.0e+00  0  0  1  0  0   0  0  1  0  0   760
MatAssemblyBegin      60 1.0 3.3300e-03 1.0 0.00e+00 0.0 1.8e+01 2.1e+03 6.6e+01  1  0  2  2  5   1  0  2  2  5     0
MatAssemblyEnd        60 1.0 1.2232e-02 1.0 0.00e+00 0.0 8.2e+01 1.4e+02 2.0e+02  3  0  9  1 16   3  0  9  1 16     0
MatGetRow          11088 1.0 1.4744e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRowIJ            3 0.0 2.7895e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         3 0.0 1.1182e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCoarsen             3 1.0 1.6336e-03 1.0 0.00e+00 0.0 2.4e+01 1.0e+03 5.1e+01  0  0  3  1  4   0  0  3  1  4     0
MatZeroEntries         3 1.0 2.1291e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatView               15 1.7 9.4581e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  1   0  0  0  0  1     0
MatAXPY                3 1.0 1.0014e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMatMult             3 1.0 9.5661e-03 1.0 1.16e+06 1.0 3.6e+01 3.9e+03 7.2e+01  3  3  4  7  6   3  3  4  7  6   237
MatMatMultSym          3 1.0 6.6750e-03 1.0 0.00e+00 0.0 3.0e+01 2.9e+03 6.6e+01  2  0  3  4  5   2  0  3  4  5     0
MatMatMultNum          3 1.0 2.9039e-03 1.0 1.16e+06 1.0 6.0e+00 9.0e+03 6.0e+00  1  3  1  3  0   1  3  1  3  0   780
MatPtAP                3 1.0 3.1274e-02 1.0 7.26e+06 1.1 5.4e+01 8.9e+03 7.5e+01  9 21  6 24  6   9 21  6 24  6   440
MatPtAPSymbolic        3 1.0 1.5709e-02 1.0 0.00e+00 0.0 3.6e+01 1.0e+04 4.5e+01  4  0  4 19  4   4  0  4 19  4     0
MatPtAPNumeric         3 1.0 1.5562e-02 1.0 7.26e+06 1.1 1.8e+01 5.8e+03 3.0e+01  4 21  2  5  2   4 21  2  5  2   884
MatTrnMatMult          3 1.0 7.8683e-03 1.0 3.04e+05 1.0 3.6e+01 4.3e+03 8.7e+01  2  1  4  8  7   2  1  4  8  7    77
MatGetLocalMat        15 1.0 7.1907e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01  0  0  0  0  1   0  0  0  0  1     0
MatGetBrAoCol          9 1.0 1.2493e-03 1.0 0.00e+00 0.0 4.2e+01 1.1e+04 1.2e+01  0  0  5 23  1   0  0  5 23  1     0
MatGetSymTrans         6 1.0 1.8501e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot                 3 1.0 8.7976e-05 1.0 6.04e+03 1.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0   137
VecMDot               30 1.0 1.8568e-03 1.1 3.32e+05 1.0 0.0e+00 0.0e+00 3.0e+01  0  1  0  0  2   0  1  0  0  2   358
VecTDot               46 1.0 1.3773e-03 1.2 9.27e+04 1.0 0.0e+00 0.0e+00 4.6e+01  0  0  0  0  4   0  0  0  0  4   134
VecNorm               59 1.0 1.8394e-03 1.1 1.19e+05 1.0 0.0e+00 0.0e+00 5.9e+01  0  0  0  0  5   0  0  0  0  5   129
VecScale             137 1.0 2.1148e-04 1.1 1.38e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1304
VecCopy               35 1.0 3.0518e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               220 1.0 7.9632e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              257 1.0 3.8981e-04 1.1 5.18e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  2654
VecAYPX              228 1.0 4.2653e-04 1.0 3.02e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1416
VecMAXPY              33 1.0 1.9550e-04 1.0 3.93e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  4016
VecAssemblyBegin      90 1.0 6.1271e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.6e+02  2  0  0  0 21   2  0  0  0 21     0
VecAssemblyEnd        90 1.0 6.1750e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecPointwiseMult     189 1.0 4.0197e-04 1.0 1.91e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   946
VecScatterBegin      351 1.0 3.0189e-03 1.2 0.00e+00 0.0 6.5e+02 1.5e+03 0.0e+00  1  0 75 50  0   1  0 75 50  0     0
VecScatterEnd        351 1.0 1.1168e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
VecSetRandom           3 1.0 5.1022e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          33 1.0 9.9349e-04 1.0 9.98e+04 1.0 0.0e+00 0.0e+00 3.3e+01  0  0  0  0  3   0  0  0  0  3   201
KSPGMRESOrthog        30 1.0 2.0542e-03 1.1 6.65e+05 1.0 0.0e+00 0.0e+00 3.0e+01  1  2  0  0  2   1  2  0  0  2   647
KSPSetUp              18 1.0 5.8818e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  1   0  0  0  0  1     0
KSPSolve               3 1.0 1.3235e-01 1.0 3.36e+07 1.0 8.5e+02 2.3e+03 1.2e+03 36 100 99 100 96  36 100 99 100 96   507
PCSetUp                6 1.0 1.0533e-01 1.0 1.30e+07 1.0 4.4e+02 2.9e+03 1.1e+03 28 38 51 64 90  28 38 51 64 90   244
PCSetUpOnBlocks       26 1.0 1.7312e-03 13.4 1.12e+06 0.0 0.0e+00 0.0e+00 1.8e+01  0  2  0  0  1   0  2  0  0  1   644
PCApply               26 1.0 2.3993e-02 1.0 1.92e+07 1.1 3.6e+02 1.7e+03 2.4e+01  7 56 42 31  2   7 56 42 31  2  1557
PCGAMGgraph_AGG        1 1.0 7.0369e-03 1.0 1.14e+04 1.0 1.0e+01 2.4e+02 3.6e+01  2  0  1  0  3   2  0  1  0  3     3
PCGAMGcoarse_AGG       1 1.0 4.0741e-03 1.0 1.01e+05 1.0 3.0e+01 2.3e+03 6.6e+01  1  0  3  3  5   1  0  3  3  5    50
PCGAMGProl_AGG         1 1.0 4.3368e-03 1.0 0.00e+00 0.0 4.8e+01 1.1e+03 1.1e+02  1  0  6  3  9   1  0  6  3  9     0
PCGAMGPOpt_AGG         1 1.0 6.1359e-03 1.0 1.71e+06 1.0 3.2e+01 2.7e+03 5.6e+01  2  5  4  4  4   2  5  4  4  4   551
PCGAMGgraph_AGG        1 1.0 5.7518e-03 1.0 1.14e+04 1.0 1.0e+01 2.4e+02 3.6e+01  2  0  1  0  3   2  0  1  0  3     4
PCGAMGcoarse_AGG       1 1.0 3.7661e-03 1.0 1.01e+05 1.0 3.0e+01 2.3e+03 6.6e+01  1  0  3  3  5   1  0  3  3  5    54
PCGAMGProl_AGG         1 1.0 3.9451e-03 1.0 0.00e+00 0.0 4.8e+01 1.1e+03 1.1e+02  1  0  6  3  9   1  0  6  3  9     0
PCGAMGPOpt_AGG         1 1.0 6.0630e-03 1.0 1.71e+06 1.0 3.2e+01 2.7e+03 5.6e+01  2  5  4  4  4   2  5  4  4  4   558
PCGAMGgraph_AGG        1 1.0 5.6109e-03 1.0 1.14e+04 1.0 1.0e+01 2.4e+02 3.6e+01  2  0  1  0  3   2  0  1  0  3     4
PCGAMGcoarse_AGG       1 1.0 3.6681e-03 1.0 1.01e+05 1.0 3.0e+01 2.3e+03 6.6e+01  1  0  3  3  5   1  0  3  3  5    55
PCGAMGProl_AGG         1 1.0 3.9270e-03 1.0 0.00e+00 0.0 4.8e+01 1.1e+03 1.1e+02  1  0  6  3  9   1  0  6  3  9     0
PCGAMGPOpt_AGG         1 1.0 5.8789e-03 1.0 1.71e+06 1.0 3.2e+01 2.7e+03 5.6e+01  2  5  4  4  4   2  5  4  4  4   575
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix   105            105      6082820     0
      Matrix Coarsen     3              3         1884     0
              Vector   259            259      1362280     0
      Vector Scatter    28             28        29456     0
           Index Set    98             98        86424     0
       Krylov Solver    18             18       160152     0
      Preconditioner    18             18        18036     0
              Viewer     4              3         2184     0
         PetscRandom     3              3         1872     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 6.56128e-05
Average time for zero size MPI_Send(): 2.14577e-05
#PETSc Option Table entries:
-ksp_monitor
-ksp_type cg
-ksp_view
-log_summary
-options_left
-pc_gamg_agg_nsmooths 1
-pc_gamg_type agg
-pc_type gamg
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu Jan 23 19:10:29 2014
Configure options: --download-parmetis --download-superlu_dist --download-mpich --download-hypre --download-metis --download-ml --download-mumps --download-scalapack --download-blacs --download-cmake --download-f-blas-lapack=1 --with-debugging=0
-----------------------------------------
Libraries compiled on Thu Jan 23 19:10:29 2014 on ilfb35.ilsb.tuwien.ac.at 
Machine characteristics: Linux-2.6.32-358.2.1.el6.x86_64-x86_64-with-redhat-6.4-Carbon
Using PETSc directory: /usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3
Using PETSc arch: linux-gnu-c
-----------------------------------------

Using C compiler: /usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/bin/mpicc  -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/bin/mpif90  -fPIC  -Wall -Wno-unused-variable -O  ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/include -I/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/include -I/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/include -I/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/include
-----------------------------------------

Using C linker: /usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/bin/mpicc
Using Fortran linker: /usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/bin/mpif90
Using libraries: -Wl,-rpath,/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/lib -L/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/lib -lpetsc -Wl,-rpath,/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/lib -L/usr2/tgross/parFEAP/parFEAP84_mod/petsc-3.4.3/linux-gnu-c/lib -lHYPRE -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpichcxx -lstdc++ -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lml -lmpichcxx -lstdc++ -lsuperlu_dist_3.3 -lflapack -lfblas -lX11 -lparmetis -lmetis -lpthread -lmpichf90 -lgfortran -lm -lm -lmpichcxx -lstdc++ -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl
-----------------------------------------

#PETSc Option Table entries:
-ksp_monitor
-ksp_type cg
-ksp_view
-log_summary
-options_left
-pc_gamg_agg_nsmooths 1
-pc_gamg_type agg
-pc_type gamg
#End of PETSc Option Table entries
There are no unused options.
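
As a cross-check on how to read the summary above, here is a minimal Python sketch that recomputes two of the reported rates from the printed values. It assumes the "10e-6" factor in the legend is meant as 1e-6 (the printed numbers are consistent with that reading), and that KSPSolve accounts for essentially all flops, as its %f of 100 suggests. The function name mflops is made up for illustration; nothing here calls PETSc.

```python
# Values copied from the -log_summary output (2 processes).
TOTAL_FLOPS = 6.709e7       # "Flops:" row, Total column (sum over processes)
MAX_TIME = 3.675e-1         # "Time (sec):" row, Max column
STAGE_TIME = 3.6745e-1      # "Main Stage" row, average time
KSPSOLVE_TIME = 1.3235e-1   # KSPSolve event, max time


def mflops(sum_of_flops, max_seconds):
    """Total Mflop/s as defined in the legend, assuming a factor of 1e-6."""
    return 1e-6 * sum_of_flops / max_seconds


# Whole run: should agree with the Flops/sec row, Total column (1.826e+08).
overall = mflops(TOTAL_FLOPS, MAX_TIME)

# KSPSolve: %f is 100, so the total flop count is used as an approximation.
ksp_solve = mflops(TOTAL_FLOPS, KSPSOLVE_TIME)

# Share of the stage time spent in KSPSolve (the %T column).
ksp_share = 100.0 * KSPSOLVE_TIME / STAGE_TIME

print(f"overall  : {overall:.1f} Mflop/s")
print(f"KSPSolve : {ksp_solve:.0f} Mflop/s, {ksp_share:.0f}% of stage time")
```

The overall rate agrees with the Total column of the Flops/sec row, and the KSPSolve figures agree with the 507 Mflop/s and 36 %T printed for that event.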




On Jan 27, 2014, at 4:34 PM, Jed Brown <[email protected]> wrote:

> Thomas Gross <[email protected]> writes:
> 
>> Dear petsc users/developers,
>> 
>> I am using petsc in a parallel FEAP framework 
>> (http://www.ce.berkeley.edu/projects/feap/) for mechanics oriented problems 
>> (mainly linear elasticity or plasticity).
>> In earlier petsc versions (3.2.7) I always used the petsc options:
>> -ksp_type cg -pc_type prometheus (with standard settings) and achieved a 
>> great level of performance.
>> 
>> After updating to petsc 3.4.3, where the “Prometheus” preconditioner was 
>> replaced by “GAMG”, I cannot reach the same level of performance. My 
>> settings are:
>> -ksp_type cg -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 
>> (settings recommended for FEAP).
>> 
>> Is there a set of GAMG settings that is equivalent to the standard 
>> Prometheus settings?
> 
> Please send -log_summary -ksp_monitor -ksp_view output for both cases.
> Make sure to use the same smoother you used with Prometheus.
