Dear libMesh developers,

In my previous emails I described a problem that I suspected was related to data communication between nodes. However, the problem is still there when I run the code on the master node alone with 2 CPUs, which means there is no data communication between nodes. The timing table below shows where the time goes: find_global_indices() takes a very long time (3172.38 s without sub-events, 81.43% of the active time).
I am using AMD x86_64, Red Hat Enterprise, GCC 4.0 and MPICH 1.2.7p1. Could you give me some advice? Thanks a lot.

---------------------------------------------------------------------------------------------------------
libMesh Performance: Alive time=3921.43, Active time=3895.64
---------------------------------------------------------------------------------------------------------
Event                           nCalls    Total Time  Avg Time    Total Time  Avg Time    % of Active Time
                                          w/o Sub     w/o Sub     With Sub    With Sub    w/o S    With S
---------------------------------------------------------------------------------------------------------

DofMap
  add_neighbors_to_send_list()  3         1.8758      0.625277    2.0094      0.669790    0.05     0.05
  build_constraint_matrix()     36160     1.1394      0.000032    1.1394      0.000032    0.03     0.03
  cnstrn_elem_mat_vec()         36160     1.0369      0.000029    1.0369      0.000029    0.03     0.03
  compute_sparsity()            3         66.9598     22.319928   69.6673     23.222438   1.72     1.79
  create_dof_constraints()      3         1.9375      0.645828    2.7731      0.924369    0.05     0.07
  distribute_dofs()             3         5.9749      1.991641    11.1858     3.728586    0.15     0.29
  dof_indices()                 451121    10.4046     0.000023    10.4046     0.000023    0.27     0.27
  enforce_constraints_exactly() 2         0.0860      0.043013    0.0860      0.043013    0.00     0.00
  old_dof_indices()             72320     1.6502      0.000023    1.6502      0.000023    0.04     0.04
  prepare_send_list()           3         1.3888      0.462939    1.3888      0.462939    0.04     0.04
  reinit()                      3         4.4227      1.474239    4.4227      1.474239    0.11     0.11

FE
  compute_affine_map()          166087    28.4779     0.000171    28.4779     0.000171    0.73     0.73
  compute_face_map()            65290     12.9293     0.000198    12.9293     0.000198    0.33     0.33
  compute_shape_functions()     166087    53.0771     0.000320    53.0771     0.000320    1.36     1.36
  init_face_shape_functions()   54525     6.8743      0.000126    6.8743      0.000126    0.18     0.18
  init_shape_functions()        116731    41.2603     0.000353    41.2603     0.000353    1.06     1.06
  inverse_map()                 528671    15.4390     0.000029    15.4390     0.000029    0.40     0.40

GMVIO
  write_nodal_data()            1         2.0390      2.038986    2.0390      2.038986    0.05     0.05

JumpErrorEstimator
  estimate_error()              2         20.5754     10.287681   126.0162    63.008090   0.53     3.23

LocationMap
  find()                        69104     0.9536      0.000014    0.9536      0.000014    0.02     0.02
  init()                        4         0.4922      0.123059    0.4922      0.123059    0.01     0.01

Mesh
  contract()                    2         0.6653      0.332647    1.1356      0.567810    0.02     0.03
  find_neighbors()              3         30.7003     10.233423   30.8006     10.266880   0.79     0.79
  read()                        1         5.7922      5.792197    5.7922      5.792197    0.15     0.15
  renumber_nodes_and_elem()     8         1.9422      0.242779    1.9422      0.242779    0.05     0.05

MeshCommunication
  broadcast_bcs()               1         0.0604      0.060440    0.0743      0.074266    0.00     0.00
  broadcast_mesh()              1         1.0069      1.006910    1.0271      1.027131    0.03     0.03
  compute_hilbert_indices()     4         4.1264      1.031604    4.1264      1.031604    0.11     0.11
  find_global_indices()         4         3172.3789   793.094713  3412.5255   853.131373  81.43    87.60
  parallel_sort()               4         158.8466    39.711649   161.0895    40.272380   4.08     4.14

MeshRefinement
  _coarsen_elements()           4         0.4822      0.120546    0.4828      0.120710    0.01     0.01
  _refine_elements()            4         2.7347      0.683675    5.5430      1.385758    0.07     0.14
  add_point()                   69104     1.3920      0.000020    2.5913      0.000037    0.04     0.07
  make_coarsening_compatible()  12        7.9254      0.660450    7.9254      0.660450    0.20     0.20
  make_refinement_compatible()  12        1.3526      0.112718    1.3618      0.113480    0.03     0.03

MetisPartitioner
  partition()                   3         9.6725      3.224183    2854.3422   951.447412  0.25     73.27

Parallel
  allgather()                   16        0.4336      0.027100    0.4336      0.027100    0.01     0.01
  broadcast()                   13        0.0327      0.002513    0.0327      0.002513    0.00     0.00
  gather()                      3         0.0007      0.000229    0.0007      0.000229    0.00     0.00
  max()                         275       0.5796      0.002108    0.5796      0.002108    0.01     0.01
  min()                         482       38.8301     0.080560    38.8301     0.080560    1.00     1.00
  probe()                       26        56.6142     2.177470    56.6142     2.177470    1.45     1.45
  receive()                     26        0.0334      0.001284    56.6479     2.178767    0.00     1.45
  send()                        26        18.5027     0.711642    18.5027     0.711642    0.47     0.47
  send_receive()                34        0.0077      0.000225    75.1605     2.210604    0.00     1.93
  sum()                         20        2.8493      0.142467    2.8493      0.142467    0.07     0.07
  wait()                        26        0.0016      0.000061    0.0016      0.000061    0.00     0.00

Partitioner
  set_node_processor_ids()      3         3.7781      1.259356    4.5113      1.503763    0.10     0.12
  set_parent_processor_ids()    3         0.5565      0.185501    0.5565      0.185501    0.01     0.01

PetscLinearSolver
  solve()                       3         27.1758     9.058608    27.1827     9.060906    0.70     0.70

ProjectVector
  operator()                    2         2.2219      1.110940    4.1955      2.097752    0.06     0.11

System
  assemble()                    3         57.2052     19.068412   125.2392    41.746413   1.47     3.21
  project_vector()              2         8.7408      4.370384    13.9501     6.975064    0.22     0.36
---------------------------------------------------------------------------------------------------------
Totals:                         1832413   3895.6373                                       100.00
---------------------------------------------------------------------------------------------------------

Regards,
Yujie
