Okay, one problem is fairly clear. As Terry indicated, you have to tell Open MPI to bind your processes, or you lose a lot of performance. Set -mca opal_paffinity_alone 1 on your command line and it should make a significant difference.
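For example, assuming your wrapper passes extra mpirun options through unchanged, the command line from your mail would become something like:

  /software/mpi/openmpi/1.3b2/i101017/bin/mpirun -np 144 -npernode 8 \
      -mca opal_paffinity_alone 1 \
      -mca mpi_show_mca_params env,file \
      /nobackup/rossby11/faxen/RCO_scobi/src_161.openmpi/rco2.24pe

If the wrapper does not forward extra options, the same parameter can instead be set in the environment (OMPI_MCA_opal_paffinity_alone=1) or in an MCA parameter file.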
On Wed, Aug 5, 2009 at 8:10 AM, Torgny Faxen <fa...@nsc.liu.se> wrote:

> Ralph,
> I am running through a locally provided wrapper, but it translates to:
>
> /software/mpi/openmpi/1.3b2/i101017/bin/mpirun -np 144 -npernode 8 -mca mpi_show_mca_params env,file /nobackup/rossby11/faxen/RCO_scobi/src_161.openmpi/rco2.24pe
>
> a) Upgrade.. This will take some time; it will have to go through the administrator, as this is a production cluster.
> b) -mca .. see output below.
> c) I used exactly the same optimization flags for all three versions (ScaliMPI, OpenMPI and IntelMPI), and this is Fortran, so I am using mpif90 :-)
>
> Regards / Torgny
>
> [n70:30299] ess=env (environment)
> [n70:30299] orte_ess_jobid=482607105 (environment)
> [n70:30299] orte_ess_vpid=0 (environment)
> [n70:30299] mpi_yield_when_idle=0 (environment)
> [n70:30299] mpi_show_mca_params=env,file (environment)
>
> Ralph Castain wrote:
>
>> Could you send us the mpirun cmd line? I wonder if you are missing some options that could help. Also, you might:
>>
>> (a) upgrade to 1.3.3 - it looks like you are using some kind of pre-release version
>>
>> (b) add -mca mpi_show_mca_params env,file - this will cause rank=0 to output what mca params it sees, and where they came from
>>
>> (c) check that you built a non-debug version, and remembered to compile your application with a -O3 flag - i.e., "mpicc -O3 ...". Remember, OMPI does not automatically add optimization flags to mpicc!
>>
>> Thanks
>> Ralph
>>
>> On Wed, Aug 5, 2009 at 7:15 AM, Torgny Faxen <fa...@nsc.liu.se> wrote:
>>
>> Pasha,
>> no collectives are being used.
>>
>> A simple grep in the code reveals the following MPI functions being used:
>> MPI_Init
>> MPI_wtime
>> MPI_COMM_RANK
>> MPI_COMM_SIZE
>> MPI_BUFFER_ATTACH
>> MPI_BSEND
>> MPI_PACK
>> MPI_UNPACK
>> MPI_PROBE
>> MPI_GET_COUNT
>> MPI_RECV
>> MPI_IPROBE
>> MPI_FINALIZE
>>
>> where MPI_IPROBE is the clear winner in terms of number of calls.
>>
>> /Torgny
>>
>> Pavel Shamis (Pasha) wrote:
>>
>> Do you know if the application uses some collective operations?
>>
>> Thanks
>> Pasha
>>
>> Torgny Faxen wrote:
>>
>> Hello,
>> we are seeing a large difference in performance for some applications depending on what MPI is being used.
>>
>> Attached are performance numbers and oprofile output (first 30 lines) from one out of 14 nodes from one application run using OpenMPI, IntelMPI and Scali MPI respectively.
>>
>> Scali MPI is faster than the other two MPIs by factors of 1.6 and 1.75:
>>
>> ScaliMPI: walltime for the whole application is 214 seconds
>> OpenMPI: walltime for the whole application is 376 seconds
>> Intel MPI: walltime for the whole application is 346 seconds.
>>
>> The application is running with the main send/receive commands being:
>> MPI_Bsend
>> MPI_Iprobe followed by MPI_Recv (in case there is a message). Quite often MPI_Iprobe is being called just to check whether a certain message is pending.
>>
>> Any idea on tuning tips, performance analysis, or code modifications to improve the OpenMPI performance? A lot of time is being spent in "mca_btl_sm_component_progress", "btl_openib_component_progress" and other internal routines.
>>
>> The code is running on a cluster with 140 HP ProLiant DL160 G5 compute servers. Infiniband interconnect. Intel Xeon E5462 processors. The profiled application is using 144 cores on 18 nodes over Infiniband.
>>
>> Regards / Torgny
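For reference, the MPI_Bsend / MPI_Iprobe / MPI_Recv pattern described above boils down to roughly the following Fortran sketch on the receiving side (the subroutine name, argument list and datatype are illustrative assumptions, not taken from the actual application):

  ! Illustrative only: poll with MPI_Iprobe and call MPI_Recv only when
  ! a matching message is actually pending.
  subroutine poll_for_message(comm, tag, buf, maxlen, got_one)
    use mpi
    implicit none
    integer, intent(in)           :: comm, tag, maxlen
    double precision, intent(out) :: buf(maxlen)
    logical, intent(out)          :: got_one
    integer :: status(MPI_STATUS_SIZE), nrecv, ierr

    ! Non-blocking check: is a matching message waiting?
    call MPI_Iprobe(MPI_ANY_SOURCE, tag, comm, got_one, status, ierr)
    if (got_one) then
       ! A message is pending: find its size, then receive it.
       call MPI_Get_count(status, MPI_DOUBLE_PRECISION, nrecv, ierr)
       call MPI_Recv(buf, nrecv, MPI_DOUBLE_PRECISION, status(MPI_SOURCE), &
                     status(MPI_TAG), comm, status, ierr)
    end if
  end subroutine poll_for_message

Each call to MPI_Iprobe drives the MPI progress engine even when no message is waiting, which is consistent with the large share of time the OpenMPI profile below attributes to mca_btl_sm_component_progress and btl_openib_component_progress.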
>>
>> ===========================================================================
>> OpenMPI 1.3b2
>> ===========================================================================
>>
>> Walltime: 376 seconds
>>
>> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
>> Profiling through timer interrupt
>> samples  %        image name            app name    symbol name
>> 668288   22.2113  mca_btl_sm.so         rco2.24pe   mca_btl_sm_component_progress
>> 441828   14.6846  rco2.24pe             rco2.24pe   step_
>> 335929   11.1650  libmlx4-rdmav2.so     rco2.24pe   (no symbols)
>> 301446   10.0189  mca_btl_openib.so     rco2.24pe   btl_openib_component_progress
>> 161033    5.3521  libopen-pal.so.0.0.0  rco2.24pe   opal_progress
>> 157024    5.2189  libpthread-2.5.so     rco2.24pe   pthread_spin_lock
>>  99526    3.3079  no-vmlinux            no-vmlinux  (no symbols)
>>  93887    3.1204  mca_btl_sm.so         rco2.24pe   opal_using_threads
>>  69979    2.3258  mca_pml_ob1.so        rco2.24pe   mca_pml_ob1_iprobe
>>  58895    1.9574  mca_bml_r2.so         rco2.24pe   mca_bml_r2_progress
>>  55095    1.8311  mca_pml_ob1.so        rco2.24pe   mca_pml_ob1_recv_request_match_wild
>>  49286    1.6381  rco2.24pe             rco2.24pe   tracer_
>>  41946    1.3941  libintlc.so.5         rco2.24pe   __intel_new_memcpy
>>  40730    1.3537  rco2.24pe             rco2.24pe   scobi_
>>  36586    1.2160  rco2.24pe             rco2.24pe   state_
>>  20986    0.6975  rco2.24pe             rco2.24pe   diag_
>>  19321    0.6422  libmpi.so.0.0.0       rco2.24pe   PMPI_Unpack
>>  18552    0.6166  libmpi.so.0.0.0       rco2.24pe   PMPI_Iprobe
>>  17323    0.5757  rco2.24pe             rco2.24pe   clinic_
>>  16194    0.5382  rco2.24pe             rco2.24pe   k_epsi_
>>  15330    0.5095  libmpi.so.0.0.0       rco2.24pe   PMPI_Comm_f2c
>>  13778    0.4579  libmpi_f77.so.0.0.0   rco2.24pe   mpi_iprobe_f
>>  13241    0.4401  rco2.24pe             rco2.24pe   s_recv_
>>  12386    0.4117  rco2.24pe             rco2.24pe   growth_
>>  11699    0.3888  rco2.24pe             rco2.24pe   testnrecv_
>>  11268    0.3745  libmpi.so.0.0.0       rco2.24pe   mca_pml_base_recv_request_construct
>>  10971    0.3646  libmpi.so.0.0.0       rco2.24pe   ompi_convertor_unpack
>>  10034    0.3335  mca_pml_ob1.so        rco2.24pe   mca_pml_ob1_recv_request_match_specific
>>  10003    0.3325  libimf.so             rco2.24pe   exp.L
>>   9375    0.3116  rco2.24pe             rco2.24pe   subbasin_
>>   8912    0.2962  libmpi_f77.so.0.0.0   rco2.24pe   mpi_unpack_f
>>
>> ===========================================================================
>> Intel MPI, version 3.2.0.011
>> ===========================================================================
>>
>> Walltime: 346 seconds
>>
>> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
>> Profiling through timer interrupt
>> samples  %        image name            app name    symbol name
>> 486712   17.7537  rco2                  rco2        step_
>> 431941   15.7558  no-vmlinux            no-vmlinux  (no symbols)
>> 212425    7.7486  libmpi.so.3.2         rco2        MPIDI_CH3U_Recvq_FU
>> 188975    6.8932  libmpi.so.3.2         rco2        MPIDI_CH3I_RDSSM_Progress
>> 172855    6.3052  libmpi.so.3.2         rco2        MPIDI_CH3I_read_progress
>> 121472    4.4309  libmpi.so.3.2         rco2        MPIDI_CH3I_SHM_read_progress
>>  64492    2.3525  libc-2.5.so           rco2        sched_yield
>>  52372    1.9104  rco2                  rco2        tracer_
>>  48621    1.7735  libmpi.so.3.2         rco2        .plt
>>  45475    1.6588  libmpiif.so.3.2       rco2        pmpi_iprobe__
>>  44082    1.6080  libmpi.so.3.2         rco2        MPID_Iprobe
>>  42788    1.5608  libmpi.so.3.2         rco2        MPIDI_CH3_Stop_recv
>>  42754    1.5595  libpthread-2.5.so     rco2        pthread_mutex_lock
>>  42190    1.5390  libmpi.so.3.2         rco2        PMPI_Iprobe
>>  41577    1.5166  rco2                  rco2        scobi_
>>  40356    1.4721  libmpi.so.3.2         rco2        MPIDI_CH3_Start_recv
>>  38582    1.4073  libdaplcma.so.1.0.2   rco2        (no symbols)
>>  37545    1.3695  rco2                  rco2        state_
>>  35597    1.2985  libc-2.5.so           rco2        free
>>  34019    1.2409  libc-2.5.so           rco2        malloc
>>  31841    1.1615  rco2                  rco2        s_recv_
>>  30955    1.1291  libmpi.so.3.2         rco2        __I_MPI___intel_new_memcpy
>>  27876    1.0168  libc-2.5.so           rco2        _int_malloc
>>  26963    0.9835  rco2                  rco2        testnrecv_
>>  23005    0.8391  libpthread-2.5.so     rco2        __pthread_mutex_unlock_usercnt
>>  22290    0.8131  libmpi.so.3.2         rco2        MPID_Segment_manipulate
>>  22086    0.8056  libmpi.so.3.2         rco2        MPIDI_CH3I_read_progress_expected
>>  19146    0.6984  rco2                  rco2        diag_
>>  18250    0.6657  rco2                  rco2        clinic_
>>
>> ===========================================================================
>> Scali MPI, version 3.13.10-59413
>> ===========================================================================
>>
>> Walltime:
>>
>> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
>> Profiling through timer interrupt
>> samples  %        image name            app name    symbol name
>> 484267   30.0664  rco2.24pe             rco2.24pe   step_
>> 111949    6.9505  libmlx4-rdmav2.so     rco2.24pe   (no symbols)
>>  73930    4.5900  libmpi.so             rco2.24pe   scafun_rq_handle_body
>>  57846    3.5914  libmpi.so             rco2.24pe   invert_decode_header
>>  55836    3.4667  libpthread-2.5.so     rco2.24pe   pthread_spin_lock
>>  53703    3.3342  rco2.24pe             rco2.24pe   tracer_
>>  40934    2.5414  rco2.24pe             rco2.24pe   scobi_
>>  40244    2.4986  libmpi.so             rco2.24pe   scafun_request_probe_handler
>>  37399    2.3220  rco2.24pe             rco2.24pe   state_
>>  30455    1.8908  libmpi.so             rco2.24pe   invert_matchandprobe
>>  29707    1.8444  no-vmlinux            no-vmlinux  (no symbols)
>>  29147    1.8096  libmpi.so             rco2.24pe   FMPI_scafun_Iprobe
>>  27969    1.7365  libmpi.so             rco2.24pe   decode_8_u_64
>>  27475    1.7058  libmpi.so             rco2.24pe   scafun_rq_anysrc_fair_one
>>  25966    1.6121  libmpi.so             rco2.24pe   scafun_uxq_probe
>>  24380    1.5137  libc-2.5.so           rco2.24pe   memcpy
>>  22615    1.4041  libmpi.so             rco2.24pe   .plt
>>  21172    1.3145  rco2.24pe             rco2.24pe   diag_
>>  20716    1.2862  libc-2.5.so           rco2.24pe   memset
>>  18565    1.1526  libmpi.so             rco2.24pe   openib_wrapper_poll_cq
>>  18192    1.1295  rco2.24pe             rco2.24pe   clinic_
>>  17135    1.0638  libmpi.so             rco2.24pe   PMPI_Iprobe
>>  16685    1.0359  rco2.24pe             rco2.24pe   k_epsi_
>>  16236    1.0080  libmpi.so             rco2.24pe   PMPI_Unpack
>>  15563    0.9662  libmpi.so             rco2.24pe   scafun_r_rq_append
>>  14829    0.9207  libmpi.so             rco2.24pe   scafun_rq_test_finished
>>  13349    0.8288  rco2.24pe             rco2.24pe   s_recv_
>>  12490    0.7755  libmpi.so             rco2.24pe   flop_matchandprobe
>>  12427    0.7715  libibverbs.so.1.0.0   rco2.24pe   (no symbols)
>>  12272    0.7619  libmpi.so             rco2.24pe   scafun_rq_handle
>>  12146    0.7541  rco2.24pe             rco2.24pe   growth_
>>  10175    0.6317  libmpi.so             rco2.24pe   wrp2p_test_finished
>>   9888    0.6139  libimf.so             rco2.24pe   exp.L
>>   9179    0.5699  rco2.24pe             rco2.24pe   subbasin_
>>   9082    0.5639  rco2.24pe             rco2.24pe   testnrecv_
>>   8901    0.5526  libmpi.so             rco2.24pe   openib_wrapper_purge_requests
>>   7425    0.4610  rco2.24pe             rco2.24pe   scobimain_
>>   7378    0.4581  rco2.24pe             rco2.24pe   scobi_interface_
>>   6530    0.4054  rco2.24pe             rco2.24pe   setvbc_
>>   6471    0.4018  libfmpi.so            rco2.24pe   pmpi_iprobe
>>   6341    0.3937  rco2.24pe             rco2.24pe   snap_
>
> --
> ---------------------------------------------------------
> Torgny Faxén
> National Supercomputer Center
> Linköping University
> S-581 83 Linköping
> Sweden
>
> Email: fa...@nsc.liu.se
> Telephone: +46 13 285798 (office) +46 13 282535 (fax)
> http://www.nsc.liu.se
> ---------------------------------------------------------
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users