On 6/6/2012 10:31 PM, Barry Smith wrote:
> On Jun 6, 2012, at 3:04 PM, TAY wee-beng wrote:
>
>> Hi,
>>
>> I have used 3 KSP, 2 to solve momentum eqns and 1 for the multigrid. I have 
>> used
>>
>> call KSPSetOptionsPrefix(ksp,"mg_",ierr) for the multigrid.
>>
>> I run with :
>>
>> -log_summary -mg_ksp_view so as to single out the multigrid ksp, but I'm not 
>> sure if it's really working...
>     Are you sure the KSPSetOptionsPrefix() is called before the 
> KSPSetFromOptions()? It appears the prefix is not being set correctly.
>
>     From the limited data below it is running multigrid with one level (hence 
> you cannot expect great performance). You need to at least provide MG with a 
> bit more information like how many levels you would like it to use.
>
>     Barry

Yes, I had called KSPSetOptionsPrefix after KSPSetFromOptions; I've now 
corrected that. I've included the output below. By the way, my code 
follows the example in ex29.c, which uses:

call KSPCreate(MPI_COMM_WORLD,ksp,ierr)
call DMDACreate2d(MPI_COMM_WORLD,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_STAR, &
     size_x,size_y,1,num_procs,i1,i1,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da,ierr)
call DMSetFunction(da,ComputeRHS,ierr)
call DMSetJacobian(da,ComputeMatrix,ierr)
call KSPSetDM(ksp,da,ierr)
call KSPSetOptionsPrefix(ksp,"mg_",ierr)
call KSPSetFromOptions(ksp,ierr)
tol=1.e-5
call KSPSetTolerances(ksp,tol,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_INTEGER,ierr)
call KSPSetUp(ksp,ierr)
call KSPSolve(ksp,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr)

I just read the manual, and it says that for multigrid I have to use:

KSPGetPC(KSP ksp,PC *pc);
PCSetType(PC pc,PCMG);
PCMGSetLevels(PC pc,int levels,MPI_Comm *comms);

I added the following after KSPCreate:

call KSPGetPC(ksp,pc,ierr)
call PCSetType(pc_uv,PCMG,ierr)
mg_lvl = 2
call PCMGSetLevels(pc,mg_lvl,MPI_COMM_WORLD,ierr)

However, I get the error:

Caught signal number 11 SEGV: Segmentation Violation, probably memory 
access out of range

after calling PCMGSetLevels.

What's the problem? Are there any examples I can follow?
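
For reference, a sketch of the run-time route that Barry's reply points toward. Since the KSP carries the prefix mg_ and has a DMDA attached through KSPSetDM, the preconditioner type and the number of levels can in principle be requested with prefixed options instead of a PCMGSetLevels call in the source. The level count of 2 simply mirrors mg_lvl above; whether size_x and size_y allow that many coarsenings is an assumption that has not been checked here. Two details in the snippet above may also be worth checking: PCSetType is passed pc_uv while the other calls use pc, and the manual's signature takes an array of communicators (MPI_Comm *comms) where the call passes MPI_COMM_WORLD.

```
# PETSc options (on the command line or in an options file);
# the mg_ prefix matches KSPSetOptionsPrefix(ksp,"mg_",ierr)
-mg_pc_type mg
-mg_pc_mg_levels 2
-mg_ksp_view
-log_summary
```

With these set, the -mg_ksp_view output should report the number of levels actually used, which is the information Barry asked for.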

Thanks!

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a petsc-3.2 named n12-50 with 4 processors, by wtay Wed Jun  6 22:45:37 2012
Using Petsc Development HG revision: c76fb3cac2a4ad0dfc9436df80f678898c867e86  HG Date: Thu May 31 00:33:26 2012 -0500

                          Max       Max/Min        Avg      Total
Time (sec):           1.062e+01      1.00001   1.062e+01
Objects:              2.700e+01      1.00000   2.700e+01
Flops:                4.756e+08      1.00811   4.744e+08  1.897e+09
Flops/sec:            4.477e+07      1.00811   4.466e+07  1.786e+08
MPI Messages:         4.080e+02      2.00000   3.060e+02  1.224e+03
MPI Message Lengths:  2.328e+06      2.00000   5.706e+03  6.984e+06
MPI Reductions:       8.750e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                             e.g., VecAXPY() for real vectors of length N --> 2N flops
                             and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                         Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
  0:      Main Stage: 1.0623e+01 100.0%  1.8975e+09 100.0%  1.224e+03 100.0%  5.706e+03      100.0%  8.740e+02  99.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
    Count: number of times phase was executed
    Time and Flops: Max - maximum over all processors
                    Ratio - ratio of maximum to minimum over all processors
    Mess: number of messages sent
    Avg. len: average message length
    Reduct: number of global reductions
    Global: entire computation
    Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
       %T - percent time in this phase         %f - percent flops in this phase
       %M - percent messages in this phase     %L - percent message lengths in this phase
       %R - percent reductions in this phase
    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage

MatMult              202 1.0 5.5212e-01 1.0 1.38e+08 1.0 1.2e+03 5.7e+03 0.0e+00  5 29 99100  0   5 29 99100  0   996
MatSolve             252 1.0 6.8899e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00  6 36  0  0  0   6 36  0  0  0   989
MatLUFactorNum        50 1.0 4.5529e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 0.0e+00  4 15  0  0  0   4 15  0  0  0   640
MatILUFactorSym        1 1.0 9.7420e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin      50 1.0 1.7412e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+02  0  0  0  0 11   0  0  0  0 11     0
MatAssemblyEnd        50 1.0 1.0649e-01 1.0 0.00e+00 0.0 1.2e+01 1.4e+03 8.0e+00  1  0  1  0  1   1  0  1  0  1     0
MatGetRowIJ            1 1.0 4.0531e-06 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 7.0190e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp             100 1.0 2.9013e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              50 1.0 2.0438e+00 1.0 4.76e+08 1.0 1.2e+03 5.7e+03 4.6e+02 19100 99100 52  19100 99100 53   928
VecDot               202 1.0 6.9886e-02 1.4 1.63e+07 1.0 0.0e+00 0.0e+00 2.0e+02  1  3  0  0 23   1  3  0  0 23   932
VecDotNorm2          101 1.0 4.0677e-02 2.1 1.63e+07 1.0 0.0e+00 0.0e+00 1.0e+02  0  3  0  0 12   0  3  0  0 12  1602
VecNorm              151 1.0 3.5888e-02 1.4 1.22e+07 1.0 0.0e+00 0.0e+00 1.5e+02  0  3  0  0 17   0  3  0  0 17  1357
VecCopy              100 1.0 2.2957e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               403 1.0 6.1034e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecAXPBYCZ           202 1.0 6.6927e-02 1.0 3.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  7  0  0  0  1947
VecWAXPY             202 1.0 7.0219e-02 1.1 1.63e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  3  0  0  0   1  3  0  0  0   928
VecAssemblyBegin     100 1.0 4.0812e-0213.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+02  0  0  0  0 34   0  0  0  0 34     0
VecAssemblyEnd       100 1.0 4.8542e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      202 1.0 7.2360e-03 1.7 0.00e+00 0.0 1.2e+03 5.7e+03 0.0e+00  0  0 99100  0   0  0 99100  0     0
VecScatterEnd        202 1.0 2.7255e-02 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCSetUp              100 1.0 4.6843e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 5.0e+00  4 15  0  0  1   4 15  0  0  1   622
PCSetUpOnBlocks       50 1.0 4.6814e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 3.0e+00  4 15  0  0  0   4 15  0  0  0   623
PCApply              252 1.0 7.3618e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7 36  0  0  0   7 36  0  0  0   926
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

               Matrix     4              4     16900896     0
        Krylov Solver     2              2         2168     0
               Vector    12             12      2604080     0
       Vector Scatter     1              1         1060     0
            Index Set     5              5       167904     0
       Preconditioner     2              2         1800     0
               Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 1.19209e-06
Average time for MPI_Barrier(): 6.62804e-06
Average time for zero size MPI_Send(): 2.02656e-05
#PETSc Option Table entries:
-log_summary
-mg_ksp_view
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 
sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Thu May 31 09:53:43 2012
Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/ 
--with-blas-lapack-dir=/opt/intelcpro-11.1.059/mkl/lib/em64t/ 
--with-debugging=0 --download-hypre=1 
--prefix=/home/wtay/Lib/petsc-3.2-dev_shared_rel --known-mpi-shared=1 
--with-shared-libraries
-----------------------------------------
Libraries compiled on Thu May 31 09:53:43 2012 on hpc12
Machine characteristics: 
Linux-2.6.32-220.2.1.el6.x86_64-x86_64-with-centos-6.2-Final
Using PETSc directory: /home/wtay/Codes/petsc-dev
Using PETSc arch: petsc-3.2-dev_shared_rel
-----------------------------------------

Using C compiler: /opt/openmpi-1.5.3/bin/mpicc  -fPIC -wd1572 
-Qoption,cpp,--extended_float_type -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/openmpi-1.5.3/bin/mpif90  -fPIC -O3   
${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: 
-I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include 
-I/home/wtay/Codes/petsc-dev/include 
-I/home/wtay/Codes/petsc-dev/include 
-I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include 
-I/opt/openmpi-1.5.3/include
-----------------------------------------

Using C linker: /opt/openmpi-1.5.3/bin/mpicc
Using Fortran linker: /opt/openmpi-1.5.3/bin/mpif90
Using libraries: 
-Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib 
-L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lpetsc -lX11 
-lpthread 
-Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib 
-L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lHYPRE 
-lmpi_cxx -Wl,-rpath,/opt/openmpi-1.5.3/lib 
-Wl,-rpath,/opt/intelcpro-11.1.059/lib/intel64 
-Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lstdc++ 
-Wl,-rpath,/opt/intelcpro-11.1.059/mkl/lib/em64t 
-L/opt/intelcpro-11.1.059/mkl/lib/em64t -lmkl_intel_lp64 
-lmkl_intel_thread -lmkl_core -liomp5 -lpthread -ldl 
-L/opt/openmpi-1.5.3/lib -lmpi -lnsl -lutil 
-L/opt/intelcpro-11.1.059/lib/intel64 -limf 
-L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lsvml -lipgo -ldecimal -lgcc_s 
-lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lm -lm -lifport -lifcore 
-lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lnsl 
-lutil -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -ldl
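
As a cross-check on how to read the event table in this log, the Total Mflop/s column can be reproduced from the legend's formula. This is a minimal Python sketch, not part of the run: it takes the total flop count (1.897e+09) and the KSPSolve maximum time (2.0438e+00 s) from the log, and assumes that the scale factor is 1e-6 (the legend prints 10e-6) and that essentially all flops fall inside KSPSolve, as its %f of 100 suggests. The function name total_mflops is made up for illustration.

```python
def total_mflops(total_flops, max_time_sec):
    """Mflop/s = 1e-6 * (sum of flops over all processes) / (max time over all processes)."""
    return 1e-6 * total_flops / max_time_sec


# Values copied from the log: the "Flops" Total column and the KSPSolve max time.
total_flops = 1.897e9
ksp_solve_max_time = 2.0438

rate = total_mflops(total_flops, ksp_solve_max_time)
print(round(rate))  # prints 928
```

The result agrees with the 928 reported in the KSPSolve row.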

>> Here's the output:
>>
>> ---------------------------------------------- PETSc Performance Summary: 
>> ----------------------------------------------
>>
>> ./a.out on a petsc-3.2 named n12-50 with 4 processors, by wtay Wed Jun  6 
>> 21:57:33 2012
>> Using Petsc Development HG revision: 
>> c76fb3cac2a4ad0dfc9436df80f678898c867e86  HG Date: Thu May 31 00:33:26 2012 
>> -0500
>>
>>                          Max       Max/Min        Avg      Total
>> Time (sec):           1.064e+01      1.00000   1.064e+01
>> Objects:              2.700e+01      1.00000   2.700e+01
>> Flops:                4.756e+08      1.00811   4.744e+08  1.897e+09
>> Flops/sec:            4.468e+07      1.00811   4.457e+07  1.783e+08
>> MPI Messages:         4.080e+02      2.00000   3.060e+02  1.224e+03
>> MPI Message Lengths:  2.328e+06      2.00000   5.706e+03  6.984e+06
>> MPI Reductions:       8.750e+02      1.00000
>>
>> Flop counting convention: 1 flop = 1 real number operation of type 
>> (multiply/divide/add/subtract)
>>                             e.g., VecAXPY() for real vectors of length N --> 
>>  2N flops
>>                             and VecAXPY() for complex vectors of length N 
>> -->  8N flops
>>
>> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  
>> -- Message Lengths --  -- Reductions --
>>                         Avg     %Total     Avg     %Total   counts   %Total  
>>    Avg         %Total   counts   %Total
>> 0:      Main Stage: 1.0644e+01 100.0%  1.8975e+09 100.0%  1.224e+03 100.0%  
>> 5.706e+03      100.0%  8.740e+02  99.9%
>>
>> ------------------------------------------------------------------------------------------------------------------------
>> See the 'Profiling' chapter of the users' manual for details on interpreting 
>> output.
>> Phase summary info:
>>    Count: number of times phase was executed
>>    Time and Flops: Max - maximum over all processors
>>                    Ratio - ratio of maximum to minimum over all processors
>>    Mess: number of messages sent
>>    Avg. len: average message length
>>    Reduct: number of global reductions
>>    Global: entire computation
>>    Stage: stages of a computation. Set stages with PetscLogStagePush() and 
>> PetscLogStagePop().
>>       %T - percent time in this phase         %f - percent flops in this 
>> phase
>>       %M - percent messages in this phase     %L - percent message lengths 
>> in this phase
>>       %R - percent reductions in this phase
>>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over 
>> all processors)
>> ------------------------------------------------------------------------------------------------------------------------
>> Event                Count      Time (sec)     Flops                         
>>     --- Global ---  --- Stage ---   Total
>>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len 
>> Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
>> ------------------------------------------------------------------------------------------------------------------------
>> --- Event Stage 0: Main Stage
>>
>> MatMult              202 1.0 5.5096e-01 1.0 1.38e+08 1.0 1.2e+03 5.7e+03 
>> 0.0e+00  5 29 99100  0   5 29 99100  0   998
>> MatSolve             252 1.0 6.9136e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 
>> 0.0e+00  6 36  0  0  0   6 36  0  0  0   986
>> MatLUFactorNum        50 1.0 4.6002e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 
>> 0.0e+00  4 15  0  0  0   4 15  0  0  0   634
>> MatILUFactorSym        1 1.0 9.5899e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatAssemblyBegin      50 1.0 1.6270e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 1.0e+02  0  0  0  0 11   0  0  0  0 11     0
>> MatAssemblyEnd        50 1.0 1.0896e-01 1.0 0.00e+00 0.0 1.2e+01 1.4e+03 
>> 8.0e+00  1  0  1  0  1   1  0  1  0  1     0
>> MatGetRowIJ            1 1.0 2.8610e-06 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> MatGetOrdering         1 1.0 7.2002e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> KSPSetUp             100 1.0 2.9130e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> KSPSolve              50 1.0 2.0737e+00 1.0 4.76e+08 1.0 1.2e+03 5.7e+03 
>> 4.6e+02 19100 99100 52  19100 99100 53   915
>> VecDot               202 1.0 7.3588e-02 1.1 1.63e+07 1.0 0.0e+00 0.0e+00 
>> 2.0e+02  1  3  0  0 23   1  3  0  0 23   885
>> VecDotNorm2          101 1.0 3.9155e-02 1.7 1.63e+07 1.0 0.0e+00 0.0e+00 
>> 1.0e+02  0  3  0  0 12   0  3  0  0 12  1664
>> VecNorm              151 1.0 5.8769e-02 1.7 1.22e+07 1.0 0.0e+00 0.0e+00 
>> 1.5e+02  0  3  0  0 17   0  3  0  0 17   829
>> VecCopy              100 1.0 2.3459e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecSet               403 1.0 5.9994e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
>> VecAXPBYCZ           202 1.0 6.6376e-02 1.0 3.26e+07 1.0 0.0e+00 0.0e+00 
>> 0.0e+00  1  7  0  0  0   1  7  0  0  0  1963
>> VecWAXPY             202 1.0 6.9311e-02 1.0 1.63e+07 1.0 0.0e+00 0.0e+00 
>> 0.0e+00  1  3  0  0  0   1  3  0  0  0   940
>> VecAssemblyBegin     100 1.0 4.0355e-0214.1 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 3.0e+02  0  0  0  0 34   0  0  0  0 34     0
>> VecAssemblyEnd       100 1.0 5.0378e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> VecScatterBegin      202 1.0 6.2275e-03 1.5 0.00e+00 0.0 1.2e+03 5.7e+03 
>> 0.0e+00  0  0 99100  0   0  0 99100  0     0
>> VecScatterEnd        202 1.0 2.0878e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 
>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>> PCSetUp              100 1.0 4.7225e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 
>> 5.0e+00  4 15  0  0  1   4 15  0  0  1   617
>> PCSetUpOnBlocks       50 1.0 4.7191e-01 1.0 7.31e+07 1.0 0.0e+00 0.0e+00 
>> 3.0e+00  4 15  0  0  0   4 15  0  0  0   618
>> PCApply              252 1.0 7.3425e-01 1.1 1.71e+08 1.0 0.0e+00 0.0e+00 
>> 0.0e+00  7 36  0  0  0   7 36  0  0  0   928
>> ------------------------------------------------------------------------------------------------------------------------
>>
>> Memory usage is given in bytes:
>>
>> Object Type          Creations   Destructions     Memory  Descendants' Mem.
>> Reports information only for process 0.
>>
>> --- Event Stage 0: Main Stage
>>
>>               Matrix     4              4     16900896     0
>>        Krylov Solver     2              2         2168     0
>>               Vector    12             12      2604080     0
>>       Vector Scatter     1              1         1060     0
>>            Index Set     5              5       167904     0
>>       Preconditioner     2              2         1800     0
>>               Viewer     1              0            0     0
>> ========================================================================================================================
>> Average time to get PetscTime(): 1.09673e-06
>> Average time for MPI_Barrier(): 4.00543e-06
>> Average time for zero size MPI_Send(): 1.22786e-05
>> #PETSc Option Table entries:
>>
>> -log_summary
>> -mg_ksp_view
>> #End of PETSc Option Table entries
>> Compiled without FORTRAN kernels
>> Compiled with full precision matrices (default)
>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 
>> sizeof(PetscScalar) 8 sizeof(PetscInt) 4
>> Configure run at: Thu May 31 09:53:43 2012
>> Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/ 
>> --with-blas-lapack-dir=/opt/intelcpro-11.1.059/mkl/lib/em64t/ 
>> --with-debugging=0 --download-hypre=1 
>> --prefix=/home/wtay/Lib/petsc-3.2-dev_shared_rel --known-mpi-shared=1 
>> --with-shared-libraries
>> -----------------------------------------
>> Libraries compiled on Thu May 31 09:53:43 2012 on hpc12
>> Machine characteristics: 
>> Linux-2.6.32-220.2.1.el6.x86_64-x86_64-with-centos-6.2-Final
>> Using PETSc directory: /home/wtay/Codes/petsc-dev
>> Using PETSc arch: petsc-3.2-dev_shared_rel
>> -----------------------------------------
>>
>> Using C compiler: /opt/openmpi-1.5.3/bin/mpicc  -fPIC -wd1572 
>> -Qoption,cpp,--extended_float_type -O3  ${COPTFLAGS} ${CFLAGS}
>> Using Fortran compiler: /opt/openmpi-1.5.3/bin/mpif90  -fPIC -O3   
>> ${FOPTFLAGS} ${FFLAGS}
>> -----------------------------------------
>>
>> Using include paths: 
>> -I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include 
>> -I/home/wtay/Codes/petsc-dev/include -I/home/wtay/Codes/petsc-dev/include 
>> -I/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/include 
>> -I/opt/openmpi-1.5.3/include
>> -----------------------------------------
>>
>> Using C linker: /opt/openmpi-1.5.3/bin/mpicc
>> Using Fortran linker: /opt/openmpi-1.5.3/bin/mpif90
>> Using libraries: 
>> -Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib 
>> -L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lpetsc -lX11 
>> -lpthread -Wl,-rpath,/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib 
>> -L/home/wtay/Codes/petsc-dev/petsc-3.2-dev_shared_rel/lib -lHYPRE -lmpi_cxx 
>> -Wl,-rpath,/opt/openmpi-1.5.3/lib 
>> -Wl,-rpath,/opt/intelcpro-11.1.059/lib/intel64 
>> -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lstdc++ 
>> -Wl,-rpath,/opt/intelcpro-11.1.059/mkl/lib/em64t 
>> -L/opt/intelcpro-11.1.059/mkl/lib/em64t -lmkl_intel_lp64 -lmkl_intel_thread 
>> -lmkl_core -liomp5 -lpthread -ldl -L/opt/openmpi-1.5.3/lib -lmpi -lnsl 
>> -lutil -L/opt/intelcpro-11.1.059/lib/intel64 -limf 
>> -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lsvml -lipgo -ldecimal -lgcc_s 
>> -lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lm -lm -lifport -lifcore -lm 
>> -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lnsl -lutil -limf 
>> -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -ldl
>> -----------------------------------------
>>
>> Yours sincerely,
>>
>> TAY wee-beng
>>
>>
>> On 5/6/2012 1:34 AM, Barry Smith wrote:
>>>     Also run with -ksp_view to see exactly what solver options it is 
>>> using. For example the number of levels, smoother on each level etc. My 
>>> guess is that the below is running on one level (because I don't see you 
>>> supplying options to control the number of levels etc).
>>>
>>>     Barry
>>>
>>> On Jun 4, 2012, at 4:15 PM, Jed Brown wrote:
>>>
>>>> Always send -log_summary when asking about performance.
>>>>
>>>> On Mon, Jun 4, 2012 at 4:11 PM, TAY wee-beng<zonexo at gmail.com>   wrote:
>>>> Hi,
>>>>
>>>> I tried using PETSc multigrid on my 2D CFD code. I had converted KSP 
>>>> example ex29 to Fortran and then added it to my code to solve the Poisson 
>>>> equation.
>>>>
>>>> The main subroutines are:
>>>>
>>>> call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
>>>>
>>>> call DMDACreate2d(PETSC_COMM_WORLD,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_STENCIL_STAR,i3,i3,PETSC_DECIDE,PETSC_DECIDE,i1,i1,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da,ierr)
>>>> call DMSetFunction(da,ComputeRHS,ierr)
>>>> call DMSetJacobian(da,ComputeMatrix,ierr)
>>>> call KSPSetDM(ksp,da,ierr)
>>>>
>>>> call KSPSetFromOptions(ksp,ierr)
>>>> call KSPSetUp(ksp,ierr)
>>>> call KSPSolve(ksp,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr)
>>>> call KSPGetSolution(ksp,x,ierr)
>>>> call VecView(x,PETSC_VIEWER_STDOUT_WORLD,ierr)
>>>> call KSPDestroy(ksp,ierr)
>>>> call DMDestroy(da,ierr)
>>>> call PetscFinalize(ierr)
>>>>
>>>>
>>>> Since the LHS matrix doesn't change, I only set up at the 1st time step, 
>>>> thereafter I only called ComputeRHS every time step.
>>>>
>>>> I was using HYPRE's geometric multigrid and the speed was much faster.
>>>>
>>>> What other options can I tweak to improve the speed? Or should I call the 
>>>> subroutines above at every timestep?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> -- 
>>>> Yours sincerely,
>>>>
>>>> TAY wee-beng
>>>>
>>>>