Thank you very much for your reply. The -log_summary output is as follows:

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************


---------------------------------------------- PETSc Performance Summary: ----------------------------------------------


./linearElasticity on a linux-gnu named c0409 with 64 processors, by fdkong Sat Mar 26 12:44:53 2011
Using Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010


                         Max       Max/Min        Avg      Total 
Time (sec):           3.459e+02      1.00012   3.459e+02
Objects:              2.120e+02      1.00000   2.120e+02
Flops:                3.205e+08      1.40451   2.697e+08  1.726e+10
Flops/sec:            9.268e+05      1.40451   7.799e+05  4.991e+07
Memory:               3.065e+06      1.25099              1.721e+08
MPI Messages:         1.182e+04      3.60692   8.252e+03  5.281e+05
MPI Message Lengths:  1.066e+07      5.78633   4.086e+02  2.158e+08
MPI Reductions:       4.365e+03      1.00000


Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops


Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 3.4587e+02 100.0%  1.7263e+10 100.0%  5.281e+05 100.0%  4.086e+02      100.0%  4.278e+03  98.0%


------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------




      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run config/configure.py        #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################




Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------


--- Event Stage 0: Main Stage


VecMDot             3195 1.0 8.1583e-01 2.7 2.23e+07 1.4 0.0e+00 0.0e+00 3.6e+02  0  7  0  0  8   0  7  0  0  8  1480
VecNorm             4335 1.0 1.2828e+00 2.0 1.13e+07 1.4 0.0e+00 0.0e+00 7.8e+02  0  4  0  0 18   0  4  0  0 18   475
VecScale            4192 1.0 2.3778e-02 1.3 5.47e+06 1.4 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 12415
VecCopy              995 1.0 5.0113e-03 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet              2986 1.0 6.1189e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             1140 1.0 8.8003e-03 1.3 2.78e+06 1.3 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 17310
VecAYPX               71 1.0 8.8720e-04 1.7 8.09e+04 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  5286
VecWAXPY               2 1.0 2.3029e-05 1.7 2.28e+03 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  5736
VecMAXPY            4192 1.0 5.6833e-02 1.4 3.08e+07 1.4 0.0e+00 0.0e+00 0.0e+00  0 10  0  0  0   0 10  0  0  0 29299
VecAssemblyBegin       3 1.0 5.8301e-03 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         3 1.0 1.3023e-05 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin     2279 1.0 5.6063e-02 3.6 0.00e+00 0.0 5.1e+05 3.7e+02 0.0e+00  0  0 96 88  0   0  0 96 88  0     0
VecScatterEnd       2279 1.0 4.8437e-0122.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize        4118 1.0 8.3514e-01 1.9 1.61e+07 1.4 0.0e+00 0.0e+00 5.7e+02  0  5  0  0 13   0  5  0  0 13  1043
MatMult             3482 1.0 1.0133e+00 2.1 1.18e+08 1.4 2.1e+05 2.8e+02 0.0e+00  0 37 39 27  0   0 37 39 27  0  6271
MatMultAdd            71 1.0 9.5340e-02 3.9 1.04e+06 1.3 2.3e+04 1.8e+02 0.0e+00  0  0  4  2  0   0  0  4  2  0   611
MatMultTranspose     142 1.0 2.2453e-01 1.6 2.09e+06 1.3 4.6e+04 1.8e+02 2.8e+02  0  1  9  4  7   0  1  9  4  7   519
MatSolve            3550 1.0 5.7862e-01 1.4 1.26e+08 1.4 0.0e+00 0.0e+00 0.0e+00  0 39  0  0  0   0 39  0  0  0 11693
MatLUFactorNum         2 1.0 4.7321e-03 1.5 3.25e+05 1.5 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3655
MatILUFactorSym        2 1.0 1.1258e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       5 1.0 1.6813e-0120.3 0.00e+00 0.0 1.4e+03 2.3e+03 6.0e+00  0  0  0  2  0   0  0  0  2  0     0
MatAssemblyEnd         5 1.0 2.3137e-02 1.3 0.00e+00 0.0 1.9e+03 6.1e+01 2.8e+01  0  0  0  0  1   0  0  0  0  1     0
MatGetRowIJ            2 1.0 4.9662e-06 4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice       2 1.0 2.5637e-0132.3 0.00e+00 0.0 3.2e+03 2.1e+03 1.0e+01  0  0  1  3  0   0  0  1  3  0     0
MatGetOrdering         2 1.0 1.2449e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatIncreaseOvrlp       2 1.0 9.9950e-03 1.1 0.00e+00 0.0 1.3e+03 1.8e+02 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         2 1.0 8.5980e-05 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MeshView               4 1.0 5.9615e+00 1.0 0.00e+00 0.0 1.8e+03 3.1e+03 0.0e+00  2  0  0  3  0   2  0  0  3  0     0
MeshGetGlobalScatter       3 1.0 4.1654e-02 1.2 0.00e+00 0.0 9.7e+02 6.0e+01 1.8e+01  0  0  0  0  0   0  0  0  0  0     0
MeshAssembleMatrix    1606 1.1 6.7121e-02 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MeshUpdateOperator    2168 1.1 2.7389e-01 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
SectionRealView        2 1.0 5.9061e-01199.5 0.00e+00 0.0 2.5e+02 4.1e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCSetUp                5 1.0 2.8859e-01 7.4 3.25e+05 1.5 5.8e+03 1.2e+03 4.6e+01  0  0  1  3  1   0  0  1  3  1    60
PCSetUpOnBlocks      284 1.0 8.3234e-03 1.3 3.25e+05 1.5 0.0e+00 0.0e+00 2.6e+01  0  0  0  0  1   0  0  0  0  1  2078
PCApply               71 1.0 4.8040e+00 1.0 3.13e+08 1.4 4.8e+05 3.8e+02 4.0e+03  1 97 91 84 92   1 97 91 84 94  3503
KSPGMRESOrthog      3195 1.0 8.5857e-01 2.5 4.46e+07 1.4 0.0e+00 0.0e+00 3.6e+02  0 14  0  0  8   0 14  0  0  8  2814
KSPSetup               6 1.0 2.9785e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 5.0004e+00 1.0 3.20e+08 1.4 5.1e+05 3.7e+02 4.2e+03  1100 96 87 95   1100 96 87 97  3449
MeshDestroy            5 1.0 3.1958e-011357.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
DistributeMesh         1 1.0 4.5183e+00 1.1 0.00e+00 0.0 5.0e+02 9.5e+03 0.0e+00  1  0  0  2  0   1  0  0  2  0     0
PartitionCreate        2 1.0 3.5427e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PartitionClosure       2 1.0 1.2162e+0011594.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
DistributeCoords       2 1.0 8.2849e-01 2.8 0.00e+00 0.0 5.0e+02 3.0e+03 0.0e+00  0  0  0  1  0   0  0  0  1  0     0
DistributeLabels       2 1.0 1.6425e+00 1.0 0.00e+00 0.0 3.8e+02 6.9e+02 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
CreateOverlap          2 1.0 1.2166e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  1  0   0  0  0  1  0     0
------------------------------------------------------------------------------------------------------------------------


Memory usage is given in bytes:


Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.


--- Event Stage 0: Main Stage


              Viewer     4              4         1344     0
           Index Set    29             29        89664     0
                 Vec   132            131      1098884     0
         Vec Scatter     8              8         4320     0
              Matrix    13             13      1315884     0
                Mesh     5              5         1680     0
         SectionReal     7              5         1320     0
      Preconditioner     7              7         3132     0
       Krylov Solver     7              7        88364     0
========================================================================================================================
Average time to get PetscTime(): 1.90735e-07
Average time for MPI_Barrier(): 0.000211
Average time for zero size MPI_Send(): 1.4998e-05
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8
Configure run at: Wed Mar  9 20:22:08 2011
Configure options: --with-clanguage=cxx --with-shared=1 --with-dynamic=1 --download-f-blas-lapack=1 --with-mpi-dir=/bwfs/software/ictce3.2/impi/3.2.0.011 --download-boost=1 --download-fiat=1 --download-generator=1 --download-triangle=1 --download-tetgen=1 --download-chaco=1 --download-parmetis=1 --download-zoltan=1 --with-sieve=1 --with-opt-sieve=1 --with-exodusii-dir=/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/exodusii-4.75 --with-netcdf-dir=/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/netcdf-4.1.1
-----------------------------------------
Libraries compiled on Wed Mar  9 20:22:27 CST 2011 on console 
Machine characteristics: Linux console 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /bwfs/home/fdkong/petsc/petsc-3.1-p7
Using PETSc arch: linux-gnu-c-debug
-----------------------------------------
Using C compiler: /bwfs/software/ictce3.2/impi/3.2.0.011/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g -fPIC
Using Fortran compiler: /bwfs/software/ictce3.2/impi/3.2.0.011/bin/mpif90 -fPIC -Wall -Wno-unused-variable -g
-----------------------------------------
Using include paths: 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/linux-gnu-c-debug/include 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/include 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/linux-gnu-c-debug/include 
-I/export/ictce3.2/impi/3.2.0.011/include/gfortran/4.1.0 
-I/export/ictce3.2/impi/3.2.0.011/include 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/include/sieve 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/Boost/ 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/exodusii-4.75/include 
-I/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/netcdf-4.1.1/include 
-I/bwfs/software/ictce3.2/impi/3.2.0.011/include  
------------------------------------------
Using C linker: /bwfs/software/ictce3.2/impi/3.2.0.011/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g
Using Fortran linker: /bwfs/software/ictce3.2/impi/3.2.0.011/bin/mpif90 -fPIC -Wall -Wno-unused-variable -g
Using libraries: 
-Wl,-rpath,/bwfs/home/fdkong/petsc/petsc-3.1-p7/linux-gnu-c-debug/lib 
-L/bwfs/home/fdkong/petsc/petsc-3.1-p7/linux-gnu-c-debug/lib -lpetsc       
-Wl,-rpath,/bwfs/home/fdkong/petsc/petsc-3.1-p7/linux-gnu-c-debug/lib 
-L/bwfs/home/fdkong/petsc/petsc-3.1-p7/linux-gnu-c-debug/lib -lzoltan 
-ltriangle -lX11 -lchaco -lparmetis -lmetis -ltetgen -lflapack -lfblas 
-Wl,-rpath,/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/exodusii-4.75/lib
 -L/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/exodusii-4.75/lib 
-lexoIIv2for -lexoIIv2c 
-Wl,-rpath,/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/netcdf-4.1.1/lib
 -L/bwfs/home/fdkong/petsc/petsc-3.1-p7/externalpackages/netcdf-4.1.1/lib 
-lnetcdf -Wl,-rpath,/export/ictce3.2/impi/3.2.0.011/lib 
-L/export/ictce3.2/impi/3.2.0.011/lib 
-Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.1.2/32 
-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/32 -ldl -lmpi -lmpigf -lmpigi -lrt 
-lpthread -lgcc_s -Wl,-rpath,/bwfs/home/fdkong/petsc/petsc-3.1-p7/-Xlinker 
-lmpi_dbg -lgfortran -lm -Wl,-rpath,/opt/intel/mpi-rt/3.2 -lm -lmpigc4 
-lmpi_dbg -lstdc++ -lmpigc4 -lmpi_dbg -lstdc++ -ldl -lmpi -lmpigf -lmpigi -lrt 
-lpthread -lgcc_s -ldl  
------------------------------------------
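
As a side note on the "Summary of Stages" section above: the mesh setup and the linear solve can be logged as separate stages with PetscLogStageRegister()/PetscLogStagePush()/PetscLogStagePop(), so that -log_summary reports them independently and shows where the roughly 340 seconds outside KSPSolve are spent. A minimal sketch (the stage names and the code placed inside each stage are only placeholders):

    #include <petscsys.h>

    PetscLogStage  meshStage, solveStage;
    PetscErrorCode ierr;

    ierr = PetscLogStageRegister("Mesh Setup", &meshStage);CHKERRQ(ierr);
    ierr = PetscLogStageRegister("Solve", &solveStage);CHKERRQ(ierr);

    ierr = PetscLogStagePush(meshStage);CHKERRQ(ierr);
    /* ... mesh generation, refinement, and distribution go here ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

    ierr = PetscLogStagePush(solveStage);CHKERRQ(ierr);
    /* ... KSPSolve() and related calls go here ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);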







------------------
Fande Kong
ShenZhen Institutes of Advanced Technology
Chinese Academy of Sciences
 

 
 
 
------------------ Original ------------------
From:  "knepley"<[email protected]>;
Date:  Mon, Mar 28, 2011 02:19 PM
To:  "PETSc users list"<petsc-users at mcs.anl.gov>; 
Cc:  "fdkong"<fd.kong at siat.ac.cn>; 
Subject:  Re: [petsc-users] Generation, refinement of the mesh (Sieve mesh) is very slow!

 
 1) Always send the output of -log_summary when asking a performance question

2) There are implementations that are optimized for different things. It's possible to
   optimize mesh handling for a cells-vertices mesh, but not if you need edges and
   faces generated.


3) I am out of the country. I can look at the performance when I get back.


   Matt

On Mon, Mar 28, 2011 at 1:06 AM, fdkong <fd.kong at siat.ac.cn> wrote:
 Hi everyone
I have developed my application based on the Sieve mesh object in PETSc, and now I have encountered some serious problems.

1. The generation of the mesh takes a lot of time; it runs very slowly. The following code is used:
       double lower[2] = {-1.0, -1.0};
       double upper[2] = {1.0, 1.0};
       int    edges[2] = {256, 256};
       mB = ALE::MeshBuilder<ALE::Mesh>::createSquareBoundary(comm, lower, upper, edges, debug);
       ALE::ISieveConverter::convertMesh(*mB, *meshBd, renumbering, false);
       ierr = PetscPrintf(PETSC_COMM_WORLD, " End build convertMesh \n");CHKERRQ(ierr);
       ierr = MeshSetMesh(boundary, meshBd);CHKERRQ(ierr);
       ierr = PetscPrintf(PETSC_COMM_WORLD, " Begin build MeshGenerate \n");CHKERRQ(ierr);

       ierr = MeshGenerate(boundary, interpolate, &mesh);CHKERRQ(ierr);
2. The refinement of the mesh is also very slow. The code:

       refinementLimit = 0.0001;
       if (refinementLimit > 0.0)
       {
         Mesh refinedMesh;

         ierr = MeshRefine(mesh, refinementLimit, interpolate, &refinedMesh);CHKERRQ(ierr);
         ierr = MeshDestroy(mesh);CHKERRQ(ierr);
         mesh = refinedMesh;
       }


3. The distribution of the mesh is also very slow. The code:

       if (size > 1)
       {
         Mesh parallelMesh;

         //ierr = DistributeMeshnew(mesh, "chaco", &parallelMesh);CHKERRQ(ierr);
         ierr = DistributeMeshnew(mesh, "parmetis", &parallelMesh);CHKERRQ(ierr);
         ierr = MeshDestroy(mesh);CHKERRQ(ierr);
         mesh = parallelMesh;
       }
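
A minimal consolidated sketch of steps 1-3, assuming the same variables as above (boundary, refinementLimit, size, and the user-defined DistributeMeshnew wrapper) and the PETSc 3.1 Sieve calls already used here; interpolate is forced to PETSC_FALSE so that, per point 2 in the reply above, no edges or faces are generated when only cells and vertices are needed:

       PetscTruth interpolate = PETSC_FALSE;  /* skip edge/face generation if the discretization does not need it */
       Mesh       mesh;

       /* 1. Generate the mesh from the boundary built by createSquareBoundary() */
       ierr = MeshGenerate(boundary, interpolate, &mesh);CHKERRQ(ierr);

       /* 2. Refine if a positive refinement limit is given */
       if (refinementLimit > 0.0) {
         Mesh refinedMesh;
         ierr = MeshRefine(mesh, refinementLimit, interpolate, &refinedMesh);CHKERRQ(ierr);
         ierr = MeshDestroy(mesh);CHKERRQ(ierr);
         mesh = refinedMesh;
       }

       /* 3. Distribute across processes with ParMETIS (or "chaco"), using the local wrapper */
       if (size > 1) {
         Mesh parallelMesh;
         ierr = DistributeMeshnew(mesh, "parmetis", &parallelMesh);CHKERRQ(ierr);
         ierr = MeshDestroy(mesh);CHKERRQ(ierr);
         mesh = parallelMesh;
       }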
Has anyone encountered similar problems? If anyone can help, thank you very much!

I would also like to ask which parallel mesh works well with PETSc when developing complex problems.
------------------
 Fande Kong
ShenZhen Institutes of Advanced Technology
Chinese Academy of Sciences
 





-- 
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
 -- Norbert Wiener
