Dear Libmesh developers,

In my previous emails I described a problem that looked related to data
communication between nodes. However, the problem is still there when I run
the code on the master node alone with 2 CPUs, where there is no data
communication between nodes. The timing table follows. Note that
"find_global_indices()" takes a very long time (about 81% of the active time).

I am using AMD x86_64, Red Hat Enterprise Linux, GCC 4.0, and MPICH127p1.
Could you give me some advice? Thanks a lot.
 
 -------------------------------------------------------------------------------------------------------------
| libMesh Performance: Alive time=3921.43, Active time=3895.64                                                |
 -------------------------------------------------------------------------------------------------------------
| Event                           nCalls    Total Time  Avg Time    Total Time  Avg Time    % of Active Time  |
|                                           w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S   |
|-------------------------------------------------------------------------------------------------------------|
|                                                                                                             |
| DofMap                                                                                                      |
|   add_neighbors_to_send_list()  3         1.8758      0.625277    2.0094      0.669790    0.05     0.05     |
|   build_constraint_matrix()     36160     1.1394      0.000032    1.1394      0.000032    0.03     0.03     |
|   cnstrn_elem_mat_vec()         36160     1.0369      0.000029    1.0369      0.000029    0.03     0.03     |
|   compute_sparsity()            3         66.9598     22.319928   69.6673     23.222438   1.72     1.79     |
|   create_dof_constraints()      3         1.9375      0.645828    2.7731      0.924369    0.05     0.07     |
|   distribute_dofs()             3         5.9749      1.991641    11.1858     3.728586    0.15     0.29     |
|   dof_indices()                 451121    10.4046     0.000023    10.4046     0.000023    0.27     0.27     |
|   enforce_constraints_exactly() 2         0.0860      0.043013    0.0860      0.043013    0.00     0.00     |
|   old_dof_indices()             72320     1.6502      0.000023    1.6502      0.000023    0.04     0.04     |
|   prepare_send_list()           3         1.3888      0.462939    1.3888      0.462939    0.04     0.04     |
|   reinit()                      3         4.4227      1.474239    4.4227      1.474239    0.11     0.11     |
|                                                                                                             |
| FE                                                                                                          |
|   compute_affine_map()          166087    28.4779     0.000171    28.4779     0.000171    0.73     0.73     |
|   compute_face_map()            65290     12.9293     0.000198    12.9293     0.000198    0.33     0.33     |
|   compute_shape_functions()     166087    53.0771     0.000320    53.0771     0.000320    1.36     1.36     |
|   init_face_shape_functions()   54525     6.8743      0.000126    6.8743      0.000126    0.18     0.18     |
|   init_shape_functions()        116731    41.2603     0.000353    41.2603     0.000353    1.06     1.06     |
|   inverse_map()                 528671    15.4390     0.000029    15.4390     0.000029    0.40     0.40     |
|                                                                                                             |
| GMVIO                                                                                                       |
|   write_nodal_data()            1         2.0390      2.038986    2.0390      2.038986    0.05     0.05     |
|                                                                                                             |
| JumpErrorEstimator                                                                                          |
|   estimate_error()              2         20.5754     10.287681   126.0162    63.008090   0.53     3.23     |
|                                                                                                             |
| LocationMap                                                                                                 |
|   find()                        69104     0.9536      0.000014    0.9536      0.000014    0.02     0.02     |
|   init()                        4         0.4922      0.123059    0.4922      0.123059    0.01     0.01     |
|                                                                                                             |
| Mesh                                                                                                        |
|   contract()                    2         0.6653      0.332647    1.1356      0.567810    0.02     0.03     |
|   find_neighbors()              3         30.7003     10.233423   30.8006     10.266880   0.79     0.79     |
|   read()                        1         5.7922      5.792197    5.7922      5.792197    0.15     0.15     |
|   renumber_nodes_and_elem()     8         1.9422      0.242779    1.9422      0.242779    0.05     0.05     |
|                                                                                                             |
| MeshCommunication                                                                                           |
|   broadcast_bcs()               1         0.0604      0.060440    0.0743      0.074266    0.00     0.00     |
|   broadcast_mesh()              1         1.0069      1.006910    1.0271      1.027131    0.03     0.03     |
|   compute_hilbert_indices()     4         4.1264      1.031604    4.1264      1.031604    0.11     0.11     |
|   find_global_indices()         4         3172.3789   793.094713  3412.5255   853.131373  81.43    87.60    |
|   parallel_sort()               4         158.8466    39.711649   161.0895    40.272380   4.08     4.14     |
|                                                                                                             |
| MeshRefinement                                                                                              |
|   _coarsen_elements()           4         0.4822      0.120546    0.4828      0.120710    0.01     0.01     |
|   _refine_elements()            4         2.7347      0.683675    5.5430      1.385758    0.07     0.14     |
|   add_point()                   69104     1.3920      0.000020    2.5913      0.000037    0.04     0.07     |
|   make_coarsening_compatible()  12        7.9254      0.660450    7.9254      0.660450    0.20     0.20     |
|   make_refinement_compatible()  12        1.3526      0.112718    1.3618      0.113480    0.03     0.03     |
|                                                                                                             |
| MetisPartitioner                                                                                            |
|   partition()                   3         9.6725      3.224183    2854.3422   951.447412  0.25     73.27    |
|                                                                                                             |
| Parallel                                                                                                    |
|   allgather()                   16        0.4336      0.027100    0.4336      0.027100    0.01     0.01     |
|   broadcast()                   13        0.0327      0.002513    0.0327      0.002513    0.00     0.00     |
|   gather()                      3         0.0007      0.000229    0.0007      0.000229    0.00     0.00     |
|   max()                         275       0.5796      0.002108    0.5796      0.002108    0.01     0.01     |
|   min()                         482       38.8301     0.080560    38.8301     0.080560    1.00     1.00     |
|   probe()                       26        56.6142     2.177470    56.6142     2.177470    1.45     1.45     |
|   receive()                     26        0.0334      0.001284    56.6479     2.178767    0.00     1.45     |
|   send()                        26        18.5027     0.711642    18.5027     0.711642    0.47     0.47     |
|   send_receive()                34        0.0077      0.000225    75.1605     2.210604    0.00     1.93     |
|   sum()                         20        2.8493      0.142467    2.8493      0.142467    0.07     0.07     |
|   wait()                        26        0.0016      0.000061    0.0016      0.000061    0.00     0.00     |
|                                                                                                             |
| Partitioner                                                                                                 |
|   set_node_processor_ids()      3         3.7781      1.259356    4.5113      1.503763    0.10     0.12     |
|   set_parent_processor_ids()    3         0.5565      0.185501    0.5565      0.185501    0.01     0.01     |
|                                                                                                             |
| PetscLinearSolver                                                                                           |
|   solve()                       3         27.1758     9.058608    27.1827     9.060906    0.70     0.70     |
|                                                                                                             |
| ProjectVector                                                                                               |
|   operator()                    2         2.2219      1.110940    4.1955      2.097752    0.06     0.11     |
|                                                                                                             |
| System                                                                                                      |
|   assemble()                    3         57.2052     19.068412   125.2392    41.746413   1.47     3.21     |
|   project_vector()              2         8.7408      4.370384    13.9501     6.975064    0.22     0.36     |
 -------------------------------------------------------------------------------------------------------------
| Totals:                         1832413   3895.6373                                       100.00            |
 -------------------------------------------------------------------------------------------------------------

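
As a quick sanity check on the table, the "% of Active Time" columns can be
reproduced from the raw timings. The following is only a minimal Python
sketch: the numbers are copied from the log above, and the helper name
share_of_active is illustrative, not part of libMesh.

```python
# Timings in seconds, copied from the performance log above.
ACTIVE_TIME = 3895.6373  # "Totals" row

# event -> (time without sub-events, time with sub-events)
events = {
    "MeshCommunication::find_global_indices()": (3172.3789, 3412.5255),
    "MeshCommunication::parallel_sort()": (158.8466, 161.0895),
    "MetisPartitioner::partition()": (9.6725, 2854.3422),
}


def share_of_active(seconds, active=ACTIVE_TIME):
    """Return `seconds` as a percentage of the active time, to two decimals."""
    return round(100.0 * seconds / active, 2)


for name, (self_time, total_time) in events.items():
    print(f"{name}: {share_of_active(self_time)}% w/o sub, "
          f"{share_of_active(total_time)}% with sub")
```

The large gap between the two columns for partition() (0.25% on its own,
73.27% with sub-events) suggests that most of the find_global_indices() time
is spent inside the partitioning step.
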
Regards,
Yujie
_______________________________________________
Libmesh-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libmesh-users