Re: [OMPI users] cannot build 32-bit openmpi-1.7 on Linux
I believe with 99% probability this is not an Open MPI issue, but an issue in the Fortran compiler itself (the PPFC build step). You can verify this by going to the build subdir ('Entering directory...') and trying to find out _what command was called_. If your compiler crashes again, build a reproducer and send it to the compiler developer team :o)

Best

Paul Kapinos

On 04/05/13 17:56, Siegmar Gross wrote:
PPFC mpi-f08.lo
"../../../../../openmpi-1.7/ompi/mpi/fortran/use-mpi-f08/mpi-f08.F90", Line = 1, Column = 1: INTERNAL: Interrupt: Segmentation fault

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
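To see the exact compiler invocation behind the quiet "PPFC" line, one can usually re-run just that one target with verbose make output (this relies on the Automake silent-rules mechanism, so treat it as a sketch; the build directory path is a placeholder):

$ cd <builddir>/ompi/mpi/fortran/use-mpi-f08
$ make V=1 mpi-f08.lo
# prints the full Fortran compiler command with all -I/-D flags,
# which can then be re-run by hand to build a reproducer for the compiler vendor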
Re: [OMPI users] OMPI v1.7.1 fails to build on RHEL 5 and RHEL 6
On 04/17/13 23:37, Ralph Castain wrote: Try adding --disable-openib-connectx-xrc to your configure line That mean, the XRC issue is still not fixed, though this in the 1.7.1 announce? > - Fixed XRC compile issue in Open Fabrics support. On Apr 17, 2013, at 2:27 PM, Timothy Dwight Dunn <timothy.d...@colorado.edu> wrote: I have been trying to get the new v1.7.1 to build on a few different systems and I get the same error on every build attempted. While the builds are on 3 different clusters the are all using RHEL 5 or RHEL6 (6.3 not 6.4 as OFED is still not working for it yet) Get this, too: gmake[2]: Entering directory `/tmp/pk224850/linuxc2_10777/openmpi-1.7.1_linux64_intel/ompi/mca/common/ofacm' CC common_ofacm_xoob.lo common_ofacm_xoob.c(158): error: identifier "ompi_jobid_t" is undefined static int xoob_ib_address_init(ofacm_ib_address_t *ib_addr, uint16_t lid, uint64_t s_id, ompi_jobid_t ep_jobid) ^ common_ofacm_xoob.c(873): warning #188: enumerated type mixed with another type enum ibv_mtu mtu = (context->attr[0].path_mtu < context->remote_info.rem_mtu) ? ^ common_ofacm_xoob.c(953): warning #188: enumerated type mixed with another type enum ibv_mtu mtu = (context->attr[0].path_mtu < remote_info->rem_mtu) ? ^ compilation aborted for common_ofacm_xoob.c (code 2) gmake[2]: *** [common_ofacm_xoob.lo] Error 1 While I have complex configs, even when I go down to a simple config using either gnu or Intel compilers such as; export CC=icc export CXX=icpc export F77=ifort export FC=ifort ./configure --prefix=~/openmpi-1.7.1 --with-tm=~/torque-2.5.11/ --with-verbs (Note the ~ is just covering up my actual paths otherwise all is well) So this config's without problems but when I go to build with make all -j 8 I get the following error make[2]: Entering directory `~openmpi-1.7.1/ompi/mpi/fortran/mpiext' PPFC mpi-ext-module.lo PPFC mpi-f08-ext-module.lo FCLD libforce_usempi_module_to_be_built.la FCLD libforce_usempif08_module_to_be_built.la make[2]: Leaving directory `~openmpi-1.7.1/ompi/mpi/fortran/mpiext' Making all in mca/common/ofacm make[2]: Entering directory `~openmpi-1.7.1/ompi/mca/common/ofacm' CC libmca_common_ofacm_la-common_ofacm_oob.lo CC libmca_common_ofacm_la-common_ofacm_base.lo if test -z "libmca_common_ofacm.la"; then \ rm -f "libmca_common_ofacm.la"; \ ln -s "libmca_common_ofacm_noinst.la" "libmca_common_ofacm.la"; \ fi CC libmca_common_ofacm_la-common_ofacm_empty.lo CC libmca_common_ofacm_la-common_ofacm_xoob.lo common_ofacm_xoob.c(158): error: identifier "ompi_jobid_t" is undefined static int xoob_ib_address_init(ofacm_ib_address_t *ib_addr, uint16_t lid, uint64_t s_id, ompi_jobid_t ep_jobid) ^ compilation aborted for common_ofacm_xoob.c (code 2) make[2]: *** [libmca_common_ofacm_la-common_ofacm_xoob.lo] Error 1 Note I get this even if I try and build without IB verbs. Googeling for help on this has turned up nothing, literally nothing. Any suggestions? Thanks Tim Dunn ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] Building Open MPI with LSF
On 05/07/13 17:55, Ralph Castain wrote:
*1.* *OpenMPI support for 1.6 seems to be broken, and was fixed maybe in 1.7?* http://www.open-mpi.org/community/lists/users/2013/03/21640.php
It is indeed fixed in 1.7 - we will look at backporting a fix to 1.6

Well, we're using 1.6.4 with tight integration into LSF 8.0 now =) For the future: if you need a testbed, I can grant you user access...

best

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
Re: [OMPI users] basic questions about compiling OpenMPI
On 05/22/13 17:08, Blosch, Edwin L wrote:
Apologies for not exploring the FAQ first.

No comments =)

If I want to use Intel or PGI compilers but link against the OpenMPI that ships with RedHat Enterprise Linux 6 (compiled with g++ I presume), are there any issues to watch out for, during linking?

At least the Fortran-90 bindings ("use mpi") won't work at all (they're compiler-dependent). So our way is to compile a version of Open MPI with each compiler; I think this is the recommended approach. Note also that the version of Open MPI shipped with a Linux distribution is usually a bit dusty.

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
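For illustration, building a separate Open MPI with the Intel compilers boils down to something like the following minimal sketch (prefix path and any extra options such as --with-openib are site-specific assumptions):

$ ./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=$HOME/openmpi-intel
$ make -j 4 all && make install
$ $HOME/openmpi-intel/bin/mpif90 my_prog.f90   # now "use mpi" matches the compiler in use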
Re: [OMPI users] 1.7.1 Hang with MPI_THREAD_MULTIPLE set
Hello,

It is more or less well-known that MPI_THREAD_MULTIPLE disables the OpenFabrics / InfiniBand networking in Open MPI:
http://www.open-mpi.org/faq/?category=supported-systems#thread-support
http://www.open-mpi.org/community/lists/users/2010/03/12345.php

On our system not only the 'openib' BTL is off, but IPoIB also refuses to work, leading to an error. But I was able to run your program error-free when completely disabling the use of InfiniBand: either run both processes on the same node (using shared memory), or pass the "-mca btl ^openib -mca btl_tcp_if_exclude ib0,lo" parameters to 'mpiexec' in order to disable InfiniBand and IPoIB.

Well, this is disappointing due to the roughly 20x loss of performance when using Gigabit Ethernet compared to InfiniBand...

Note: Intel MPI supports MPI_THREAD_MULTIPLE when linked with -mt_mpi (Intel and GCC compilers) or -lmpi_mt instead of -lmpi (other compilers). However, Intel MPI is not free.

Best,

Paul Kapinos

Also, I recommend to _always_ check what kind of threading level you ordered and what you actually got:
print *, 'hello, world!', MPI_THREAD_MULTIPLE, provided

On 05/31/13 06:12, W Spector wrote:
Dear OpenMPI group,

The following trivial program hangs on the mpi_barrier call with 1.7.1. I am using gfortran/gcc 4.6.3 on Ubuntu linux. OpenMPI was built with --enable-mpi-thread-multiple support and no other options (other than --prefix). Are there additional options we should be telling configure about? Or have we done something very silly? Mpich2 works just fine...

Walter Spector

program hang
  use mpi
  implicit none
  integer :: me, npes
  integer :: mpierr, provided
  logical :: iampe0

  call mpi_init_thread ( &
    MPI_THREAD_MULTIPLE, &
    provided, &
    mpierr)
  print *, 'hello, world!'
! Hangs here with MPI_THREAD_MULTIPLE set...
  call mpi_barrier (MPI_COMM_WORLD, mpierr)
  call mpi_comm_rank (MPI_COMM_WORLD, me, mpierr)
  iampe0 = me == 0
  call mpi_comm_size (MPI_COMM_WORLD, npes, mpierr)
  print *, 'pe:', me, ', total comm size:', npes
  print *, 'I am ', trim (merge ('PE 0', 'not PE 0', iampe0))
  call mpi_finalize (mpierr)
end program

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
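For reference, the full workaround invocation used here looks like this (the binary name is just a placeholder):

$ mpiexec -np 2 -mca btl ^openib -mca btl_tcp_if_exclude ib0,lo ./hang
# runs over shared memory / plain TCP only; InfiniBand and IPoIB stay out of the game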
Re: [OMPI users] knem/openmpi performance?
On 07/12/13 12:55, Jeff Squyres (jsquyres) wrote:
FWIW: a long time ago (read: many Open MPI / knem versions ago), I did a few benchmarks with knem vs. no knem Open MPI installations. IIRC, I used the typical suspects like NetPIPE, the NPBs, etc. There was a modest performance improvement (I don't remember the numbers offhand); it was a smaller improvement than I had hoped for -- particularly in point-to-point message passing latency (e.g., via NetPIPE).

Jeff, I would turn the question the other way around:
- are there any penalties when using KNEM?

We have a couple of Really Big Nodes (128 cores) with not-so-huge memory bandwidth (because each is built by coupling 4 standalone 4-socket nodes). So halving the memory traffic of intra-node copies on these nodes sounds like a Very Good Thing. But otherwise we have 1500+ nodes with 2 sockets and only 24 GB of memory, and we do not want to disturb production on those nodes (and maintaining different MPI builds for different nodes is clumsy).

Best

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
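If it helps: as far as I remember, a knem-enabled build can still be switched off per run via an MCA parameter, so a single installation could serve both node types; something along these lines (parameter name from memory, please verify with ompi_info on your build):

$ ompi_info --param btl sm | grep -i knem         # check whether the build knows about knem
$ mpiexec -mca btl_sm_use_knem 0 -np 16 ./a.out   # force the plain copy-in/copy-out path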
[OMPI users] Big job, InfiniBand, MPI_Alltoallv and ibv_create_qp failed
Dear Open MPI experts,

A user on our cluster has a problem running a rather big job:
(- the job using 3024 processes (12 per node, 252 nodes) runs fine)
- the job using 4032 processes (12 per node, 336 nodes) produces the error attached below.

Well, http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages is a well-known one; both recommended tweakables (user limits and registered memory size) are at MAX now, nevertheless some queue pair could not be created. Our blind guess is that the number of completion queues is exhausted. What happens when raising that value from the default to the max?

What max size of Open MPI jobs has been seen at all?
What max size of Open MPI jobs *using MPI_Alltoallv* has been seen at all?
Is there a way to manage the size/the number of queue pairs? (XRC not available)
Is there a way to tell MPI_Alltoallv to use fewer queue pairs, even when this could lead to a slow-down?

There is a suspicious parameter in the mlx4_core module:
$ modinfo mlx4_core | grep log_num_cq
parm: log_num_cq:log maximum number of CQs per HCA (int)
Is this the tweakable parameter? What is the default, and what is the max value?

Any help would be welcome...

Best,

Paul Kapinos

P.S. There should be no connection problem between the nodes; a test job with 1x process on each node ran successfully just before starting the actual job, which also ran OK for a while - until calling MPI_Alltoallv.

--
A process failed to create a queue pair. This usually means either the device has run out of queue pairs (too many connections) or there are insufficient resources available to allocate a queue pair (out of memory). The latter can happen if either 1) insufficient memory is available, or 2) no more physical memory can be registered with the device.
For more information on memory registration see the Open MPI FAQs at:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
Local host: linuxbmc1156.rz.RWTH-Aachen.DE
Local device: mlx4_0
Queue pair type: Reliable connected (RC)
--
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4021][connect/btl_openib_connect_oob.c:867:rml_recv_cb] error in endpoint reply start connect
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** An error occurred in MPI_Alltoallv
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** on communicator MPI_COMM_WORLD
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERR_OTHER: known error not in list
[linuxbmc1156.rz.RWTH-Aachen.DE:9632] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4024][connect/btl_openib_connect_oob.c:867:rml_recv_cb] error in endpoint reply start connect
[linuxbmc1156.rz.RWTH-Aachen.DE][[3703,1],4027][connect/btl_openib_connect_oob.c:867:rml_recv_cb] error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],10][connect/btl_openib_connect_oob.c:867:rml_recv_cb] error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE][[3703,1],1][connect/btl_openib_connect_oob.c:867:rml_recv_cb] error in endpoint reply start connect
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],10] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],8] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],9] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] [[3703,0],0]-[[3703,1],1] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] 9 more processes have sent help message help-mpi-btl-openib-cpc-base.txt / ibv_create_qp failed
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[linuxbmc0840.rz.RWTH-Aachen.DE:17696] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
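In case it helps others: driver module parameters like log_num_cq are usually set via a modprobe options file and become effective after the driver is reloaded; the file name below is only an example, and the right value is exactly the open question here:

# /etc/modprobe.d/mlx4_core.conf   (example file name)
options mlx4_core log_num_cq=<desired value>

$ cat /sys/module/mlx4_core/parameters/log_num_cq   # check the currently active value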
Re: [OMPI users] Big job, InfiniBand, MPI_Alltoallv and ibv_create_qp failed
Vanilla Linux ofed from RPM's for Scientific Linux release 6.4 (Carbon) (= RHEL 6.4). No ofed_info available :-( On 07/31/13 16:59, Mike Dubman wrote: Hi, What OFED vendor and version do you use? Regards M On Tue, Jul 30, 2013 at 8:42 PM, Paul Kapinos <kapi...@rz.rwth-aachen.de <mailto:kapi...@rz.rwth-aachen.de>> wrote: Dear Open MPI experts, An user at our cluster has a problem running a kinda of big job: (- the job using 3024 processes (12 per node, 252 nodes) runs fine) - the job using 4032 processes (12 per node, 336 nodes) produce the error attached below. Well, the http://www.open-mpi.org/faq/?__category=openfabrics#ib-__locked-pages <http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages> is well-known one; both recommended tweakables (user limits and registered memory size) are at MAX now, nevertheless someone queue pair could not be created. Our blind guess is the number of completion queues is exhausted. What happen' when raising the value from standard to max? What max size of Open MPI jobs have been seen at all? What max size of Open MPI jobs *using MPI_Alltoallv* have been seen at all? Is there a way to manage the size/the number of queue pairs? (XRC not availabe) Is there a way to tell MPI_Alltoallv to use less queue pairs, even when this could lead to slow-down? There is a suspicious parameter in the mlx4_core module: $ modinfo mlx4_core | grep log_num_cq parm: log_num_cq:log maximum number of CQs per HCA (int) Is this the tweakable parameter? What is the default, and max value? Any help would be welcome... Best, Paul Kapinos P.S. There should be no connection problen somewhere between the nodes; a test job with 1x process on each node has been ran sucessfully just before starting the actual job, which also ran OK for a while - until calling MPI_Alltoallv. --__--__-- A process failed to create a queue pair. This usually means either the device has run out of queue pairs (too many connections) or there are insufficient resources available to allocate a queue pair (out of memory). The latter can happen if either 1) insufficient memory is available, or 2) no more physical memory can be registered with the device. 
For more information on memory registration see the Open MPI FAQs at: http://www.open-mpi.org/faq/?__category=openfabrics#ib-__locked-pages <http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages> Local host: linuxbmc1156.rz.RWTH-Aachen.DE <http://linuxbmc1156.rz.RWTH-Aachen.DE> Local device: mlx4_0 Queue pair type:Reliable connected (RC) --__--__-- [linuxbmc1156.rz.RWTH-Aachen.__DE <http://linuxbmc1156.rz.RWTH-Aachen.DE>][[3703,1],4021][connect/__btl_openib_connect_oob.c:867:__rml_recv_cb] error in endpoint reply start connect [linuxbmc1156.rz.RWTH-Aachen.__DE:9632 <http://linuxbmc1156.rz.RWTH-Aachen.DE:9632>] *** An error occurred in MPI_Alltoallv [linuxbmc1156.rz.RWTH-Aachen.__DE:9632 <http://linuxbmc1156.rz.RWTH-Aachen.DE:9632>] *** on communicator MPI_COMM_WORLD [linuxbmc1156.rz.RWTH-Aachen.__DE:9632 <http://linuxbmc1156.rz.RWTH-Aachen.DE:9632>] *** MPI_ERR_OTHER: known error not in list [linuxbmc1156.rz.RWTH-Aachen.__DE:9632 <http://linuxbmc1156.rz.RWTH-Aachen.DE:9632>] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort [linuxbmc1156.rz.RWTH-Aachen.__DE <http://linuxbmc1156.rz.RWTH-Aachen.DE>][[3703,1],4024][connect/__btl_openib_connect_oob.c:867:__rml_recv_cb] error in endpoint reply start connect [linuxbmc1156.rz.RWTH-Aachen.__DE <http://linuxbmc1156.rz.RWTH-Aachen.DE>][[3703,1],4027][connect/__btl_openib_connect_oob.c:867:__rml_recv_cb] error in endpoint reply start connect [linuxbmc0840.rz.RWTH-Aachen.__DE <http://linuxbmc0840.rz.RWTH-Aachen.DE>][[3703,1],10][connect/btl___openib_connect_oob.c:867:rml___recv_cb] error in endpoint reply start connect [linuxbmc0840.rz.RWTH-Aachen.__DE <http://linuxbmc0840.rz.RWTH-Aachen.DE>][[3703,1],1][connect/btl___openib_connect_oob.c:867:rml___recv_cb] error in endpoint reply start connect [linuxbmc0840.rz.RWTH-Aachen.__DE:17696 <http://linuxbmc0840.rz.RWTH-Aachen.DE:17696>] [[3703,0],0]-[[3703,1],10] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104) [linuxbmc0840.rz.RWTH-Aachen.__DE:17696 <http://linuxbmc0840.rz.RWTH-Aachen.DE:17696>] [[3703,0],0]-[[3703,1],8] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104) [linuxbmc0840.rz.RWTH-Aachen.__DE:17696 <http://linuxbmc0840.rz.RWTH-A
Re: [OMPI users] MPI_Init_thread hangs in OpenMPI 1.7.1 when using --enable-mpi-thread-multiple
et --enable-mpi-thread-multiple. So maybe it hangs in 1.7.1 on any computer as long as you use MPI_THREAD_MULTIPLE. At least I have not seen it work anywhere. Do you agree that this is a bug, or am I doing something wrong?

Best regards,

Elias

--
Dr. Hans Ekkehard Plesser, Associate Professor
Head, Basic Science Section
Dept. of Mathematical Sciences and Technology
Norwegian University of Life Sciences
PO Box 5003, 1432 Aas, Norway
Phone +47 6496 5467
Fax +47 6496 5401
Email hans.ekkehard.ples...@umb.no
Home http://arken.umb.no/~plesser

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] SIGSEGV in opal_hwlock152_hwlock_bitmap_or.A // Bug in 'hwlock" ?
Hello all, using 1.7.x (1.7.2 and 1.7.3 tested), we get SIGSEGV from somewhere in-deepth of 'hwlock' library - see the attached screenshot. Because the error is strongly aligned to just one single node, which in turn is kinda special one (see output of 'lstopo -'), it smells like an error in the 'hwlock' library. Is there a way to disable hwlock or to debug it in somehow way? (besides to build a debug version of hwlock and OpenMPI) Best Paul -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 System(252GB) Misc1 Misc0 Node#0(31GB) + Socket#0 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#0 P#64 L2(256KB) + L1(32KB) + Core#1 P#1 P#65 L2(256KB) + L1(32KB) + Core#2 P#2 P#66 L2(256KB) + L1(32KB) + Core#3 P#3 P#67 L2(256KB) + L1(32KB) + Core#8 P#4 P#68 L2(256KB) + L1(32KB) + Core#9 P#5 P#69 L2(256KB) + L1(32KB) + Core#10 P#6 P#70 L2(256KB) + L1(32KB) + Core#11 P#7 P#71 Node#1(32GB) + Socket#1 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#8 P#72 L2(256KB) + L1(32KB) + Core#1 P#9 P#73 L2(256KB) + L1(32KB) + Core#2 P#10 P#74 L2(256KB) + L1(32KB) + Core#3 P#11 P#75 L2(256KB) + L1(32KB) + Core#8 P#12 P#76 L2(256KB) + L1(32KB) + Core#9 P#13 P#77 L2(256KB) + L1(32KB) + Core#10 P#14 P#78 L2(256KB) + L1(32KB) + Core#11 P#15 P#79 Misc0 Node#2(32GB) + Socket#2 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#16 P#80 L2(256KB) + L1(32KB) + Core#1 P#17 P#81 L2(256KB) + L1(32KB) + Core#2 P#18 P#82 L2(256KB) + L1(32KB) + Core#3 P#19 P#83 L2(256KB) + L1(32KB) + Core#8 P#20 P#84 L2(256KB) + L1(32KB) + Core#9 P#21 P#85 L2(256KB) + L1(32KB) + Core#10 P#22 P#86 L2(256KB) + L1(32KB) + Core#11 P#23 P#87 Node#3(32GB) + Socket#3 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#24 P#88 L2(256KB) + L1(32KB) + Core#1 P#25 P#89 L2(256KB) + L1(32KB) + Core#2 P#26 P#90 L2(256KB) + L1(32KB) + Core#3 P#27 P#91 L2(256KB) + L1(32KB) + Core#8 P#28 P#92 L2(256KB) + L1(32KB) + Core#9 P#29 P#93 L2(256KB) + L1(32KB) + Core#10 P#30 P#94 L2(256KB) + L1(32KB) + Core#11 P#31 P#95 Misc1 Misc0 Node#4(32GB) + Socket#4 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#32 P#96 L2(256KB) + L1(32KB) + Core#1 P#33 P#97 L2(256KB) + L1(32KB) + Core#2 P#34 P#98 L2(256KB) + L1(32KB) + Core#3 P#35 P#99 L2(256KB) + L1(32KB) + Core#8 P#36 P#100 L2(256KB) + L1(32KB) + Core#9 P#37 P#101 L2(256KB) + L1(32KB) + Core#10 P#38 P#102 L2(256KB) + L1(32KB) + Core#11 P#39 P#103 Node#5(32GB) + Socket#5 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#40 P#104 L2(256KB) + L1(32KB) + Core#1 P#41 P#105 L2(256KB) + L1(32KB) + Core#2 P#42 P#106 L2(256KB) + L1(32KB) + Core#3 P#43 P#107 L2(256KB) + L1(32KB) + Core#8 P#44 P#108 L2(256KB) + L1(32KB) + Core#9 P#45 P#109 L2(256KB) + L1(32KB) + Core#10 P#46 P#110 L2(256KB) + L1(32KB) + Core#11 P#47 P#111 Misc0 Node#6(32GB) + Socket#6 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#48 P#112 L2(256KB) + L1(32KB) + Core#1 P#49 P#113 L2(256KB) + L1(32KB) + Core#2 P#50 P#114 L2(256KB) + L1(32KB) + Core#3 P#51 P#115 L2(256KB) + L1(32KB) + Core#8 P#52 P#116 L2(256KB) + L1(32KB) + Core#9 P#53 P#117 L2(256KB) + L1(32KB) + Core#10 P#54 P#118 L2(256KB) + L1(32KB) + Core#11 P#55 P#119 Node#7(32GB) + Socket#7 + L3(18MB) L2(256KB) + L1(32KB) + Core#0 P#56 P#120 L2(256KB) + L1(32KB) + Core#1
[OMPI users] is there a way to bring to light _all_ configure options in a ready installation?
Hello OpenMPI developers,

I am searching for a way to discover _all_ configure options of an OpenMPI installation.

Background: in an existing installation, the ompi_info program helps to find out a lot of information about the installation. So, "ompi_info -c" shows *some* configuration options like CFLAGS, FFLAGS et cetera. Compilation directories often do not survive for a long time (or are not shipped at all, e.g. with SunMPI).

But what about --enable-mpi-threads or --enable-contrib-no-build=vt, for example (and all other possible flags of "configure") - how can I see whether these flags were set or not?

In other words: is it possible to get _all_ flags of configure from a "ready" installation without having the compilation dirs (with the configure logs) any more?

Many thanks

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
Re: [OMPI users] is there a way to bring to light _all_ configure options in a ready installation?
You should be able to run "./configure --help" and see a lengthy help message that includes all the command line options to configure. Is that what you're looking for? No, he wants to know what configure options were used with some binaries. Yes Terry - I want to know what configure options were for a given installation! "./configure --help" helps but to guess which all of the options are used in a release, is a hard job.. --td On Aug 24, 2010, at 7:40 AM, Paul Kapinos wrote: Hello OpenMPI developers, I am searching for a way to discover _all_ configure options of an OpenMPI installation. Background: in a existing installation, the ompi_info program helps to find out a lot of informations about the installation. So, "ompi_info -c" shows *some* configuration options like CFLAGS, FFLAGS et cetera. Compilation directories often does not survive for long time (or are not shipped at all, e.g. with SunMPI) But what about --enable-mpi-threads or --enable-contrib-no-build=vt for example (and all other possible) flags of "configure", how can I see would these flags set or would not? In other words: is it possible to get _all_ flags of configure from an "ready" installation in without having the compilation dirs (with configure logs) any more? Many thanks Paul -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Oracle Terry D. Dontje | Principal Software Engineer Developer Tools Engineering | +1.781.442.2631 Oracle * - Performance Technologies* 95 Network Drive, Burlington, MA 01803 Email terry.don...@oracle.com <mailto:terry.don...@oracle.com> ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
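A small follow-up for the archives: a few things one can squeeze out of an existing installation without the build tree (the exact output fields vary by version, so treat this as a sketch):

$ ompi_info -c                   # compiler and flag information
$ ompi_info | grep -i thread     # shows whether MPI thread / progress-thread support was built in
$ ompi_info --param all all      # all MCA parameters with their current/default values

But to my knowledge none of this reproduces the full original ./configure command line - for that, the config.log from the build directory is still needed.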
[OMPI users] a question about [MPI]IO on systems without network filesystem
Dear OpenMPI developer,

We have a question about the possibility to use MPI IO (and possibly regular I/O) on clusters which do *not* have a common (network) filesystem on all nodes. A common filesystem is in general NOT a hard precondition for using OpenMPI:
http://www.open-mpi.org/faq/?category=running#do-i-need-a-common-filesystem

Say we have a (diskless? equipped with very small disks?) cluster, on which only one node has access to a filesystem. Is it possible to configure/run OpenMPI in such a way that only _one_ process (e.g. the master) performs real disk I/O, and the other processes send their data to the master, which acts as an agent (for example by gathering the data to rank 0, which then writes it out)?

Of course this would impact performance, because all data must be sent over the network, and the master may become a bottleneck. But is such a scenario - the I/O of all processes funnelled through one process - practicable at all?

Best wishes

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] v1.5.1 build failed with PGI compiler
soname -Wl,libopen-pal.so.1 -o .libs/libopen-pal.so.1.0.0 Best wishes, Paul -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
[OMPI users] v1.5.1: configuration fails when compiling on CentOS 5.5 with default GCC
Dear OpenMPI folks,

I tried to compile OpenMPI version 1.5.1 on a CentOS 5.5 computer with the default GCC shipped with the distribution, which is gcc version 4.1.2 20080704 (Red Hat 4.1.2-48). The configuration failed:

configure:156412: checking location of libltdl
configure:156425: result: internal copy
configure:156709: WARNING: Failed to build GNU libltdl. This usually means that something
configure:156711: WARNING: is incorrectly setup with your environment. There may be useful information in
configure:156713: WARNING: opal/libltdl/config.log. You can also disable GNU libltdl, which will disable
configure:156715: WARNING: dynamic shared object loading, by configuring with --disable-dlopen.
configure:156717: error: Cannot continue

The configuration line was as follows:

$ ./configure --with-openib --with-devel-headers --enable-contrib-no-build=vt --enable-mpi-threads CFLAGS="-O3 -ffast-math -mtune=opteron -m32" CXXFLAGS="-O3 -ffast-math -mtune=opteron -m32" FFLAGS="-O3 -ffast-math -mtune=opteron -m32" FCFLAGS="-O3 -ffast-math -mtune=opteron -m32" F77=gfortran LDFLAGS="-O3 -ffast-math -mtune=opteron -m32" --prefix=/../MPI/openmpi-1.5.1mt/linux32/gcc

With a newer version of GCC, version 4.2.4 (and also gcc version 4.5.1), the configuration completed fine.

Is there an error in my way of configuring, or is there a problem in configure itself? I think the inability to configure and build OpenMPI with the default compiler on CentOS 5.5 is still a problem, even if other versions of GCC do not show the same issue.

Best wishes,

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] Configure fail: OpenMPI/1.5.3 with Support for LSF using Sun Studio compilers
Dear OpenMPI developers, We tried to build OpenMPI 1.5.3 including Support for Platform LSF using the Sun Studio (=Oracle Solaris Studio now) /12.2 and the configure stage failed. 1. Used flags: ./configure --with-lsf --with-openib --with-devel-headers --enable-contrib-no-build=vt --enable-mpi-threads CFLAGS="-fast -xtarget=nehalem -m64" CXXFLAGS="-fast -xtarget=nehalem -m64" FFLAGS="-fast -xtarget=nehalem" -m64 FCFLAGS="-fast -xtarget=nehalem -m64" F77=f95 LDFLAGS="-fast -xtarget=nehalem -m64" --prefix=//openmpi-1.5.3mt/linux64/studio (note the Support for LSF enabled by --with-lsf). The compiler envvars are set as following: $ echo $CC $FC $CXX cc f95 CC The compiler info: (cc -V, CC -V) cc: Sun C 5.11 Linux_i386 2010/08/13 CC: Sun C++ 5.11 Linux_i386 2010/08/13 2. The configure error was: ## checking for lsb_launch in -lbat... no configure: WARNING: LSF support requested (via --with-lsf) but not found. configure: error: Aborting. ## 3. In the config.log (see the config.log.error) there is more info about the problem. crucial info is: ## /opt/lsf/8.0/linux2.6-glibc2.3-x86_64/lib/libbat.so: undefined reference to `ceil' ## 4. Googling vor `ceil' results e.g. in http://www.cplusplus.com/reference/clibrary/cmath/ceil/ so, the attached ceil.c example file *can* be compiled by "CC" (the Studio C++ compiler), but *cannot* be compiled using "cc" (the Studio C compiler). $ CC ceil.c $ cc ceil.c 5. Looking into configure.log and searching on `ceil' results: there was a check for the availability of `ceil' for the C compiler (see config.log.ceil). This check says `ceil' is *available* for the "cc" Compiler, which is *wrong*, cf. (4). So, is there an error in the configure stage? Or either the checks in config.log.ceil does not rely on the avilability of the `ceil' funcion in the C compiler? Best wishes, Paul Kapinos P.S. Note in in the past we build many older versions of OpenMPI with no support for LSF and no such problems -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 configure:84213: cc -o conftest -DNDEBUG -fast -xtarget=nehalem -m64 -mt -I/home/pk224850/OpenMPI/openmpi-1.5.3_linux64_studio/opal/mca/paffinity/hwloc/hwloc/include -I/opt/lsf/8.0/include -fast -xtarget=nehalem -m64 -L/opt/lsf/8.0/linux2.6-glibc2.3-x86_64/lib conftest.c -lbat -llsf -lnsl -lutil >&5 cc: Warning: -xchip=native detection failed, falling back to -xchip=generic "conftest.c", line 568: warning: statement not reached /opt/lsf/8.0/linux2.6-glibc2.3-x86_64/lib/libbat.so: undefined reference to `ceil' configure:84213: $? 
= 2 configure: failed program was: | /* confdefs.h */ | #define PACKAGE_NAME "Open MPI" | #define PACKAGE_TARNAME "openmpi" | #define PACKAGE_VERSION "1.5.3" | #define PACKAGE_STRING "Open MPI 1.5.3" | #define PACKAGE_BUGREPORT "http://www.open-mpi.org/community/help/; | #define PACKAGE_URL "" | #define OPAL_ARCH "x86_64-unknown-linux-gnu" | #define STDC_HEADERS 1 | #define HAVE_SYS_TYPES_H 1 | #define HAVE_SYS_STAT_H 1 | #define HAVE_STDLIB_H 1 | #define HAVE_STRING_H 1 | #define HAVE_MEMORY_H 1 | #define HAVE_STRINGS_H 1 | #define HAVE_INTTYPES_H 1 | #define HAVE_STDINT_H 1 | #define HAVE_UNISTD_H 1 | #define __EXTENSIONS__ 1 | #define _ALL_SOURCE 1 | #define _GNU_SOURCE 1 | #define _POSIX_PTHREAD_SEMANTICS 1 | #define _TANDEM_SOURCE 1 | #define OMPI_MAJOR_VERSION 1 | #define OMPI_MINOR_VERSION 5 | #define OMPI_RELEASE_VERSION 3 | #define OMPI_GREEK_VERSION "" | #define OMPI_VERSION "3" | #define OMPI_RELEASE_DATE "Mar 16, 2011" | #define ORTE_MAJOR_VERSION 1 | #define ORTE_MINOR_VERSION 5 | #define ORTE_RELEASE_VERSION 3 | #define ORTE_GREEK_VERSION "" | #define ORTE_VERSION "3" | #define ORTE_RELEASE_DATE "Mar 16, 2011" | #define OPAL_MAJOR_VERSION 1 | #define OPAL_MINOR_VERSION 5 | #define OPAL_RELEASE_VERSION 3 | #define OPAL_GREEK_VERSION "" | #define OPAL_VERSION "3" | #define OPAL_RELEASE_DATE "Mar 16, 2011" | #define OPAL_ENABLE_MEM_DEBUG 0 | #define OPAL_ENABLE_MEM_PROFILE 0 | #define OPAL_ENABLE_DEBUG 0 | #define OPAL_WANT_PRETTY_PRINT_STACKTRACE 1 | #define OPAL_ENABLE_PTY_SUPPORT 1 | #define OPAL_ENABLE_HETEROGENEOUS_SUPPORT 0 | #define OPAL_ENABLE_TRACE 0 | #define OPAL_ENABLE_FT 0 | #define OPAL_ENABLE_FT_CR 0 | #define OPAL_WANT_HOME_CONFIG_FILES 1 | #define OPAL_ENABLE_IPV6 0 | #define OPAL_PACKAGE_STRING "Open MPI pk224...@cluster.rz.rwth-aachen.d
Re: [OMPI users] Configure fail: OpenMPI/1.5.3 with Support for LSF using Sun Studio compilers
Hi Terry,

so, the attached ceil.c example file *can* be compiled by "CC" (the Studio C++ compiler), but *cannot* be compiled using "cc" (the Studio C compiler).
$ CC ceil.c
$ cc ceil.c

Did you try to link in the math library -lm? When I did this your test program worked for me, and that actually is the first test that the configure does.

5. Looking into configure.log and searching for `ceil' shows: there was a check for the availability of `ceil' for the C compiler (see config.log.ceil). This check says `ceil' is *available* for the "cc" compiler, which is *wrong*, cf. (4).

See above, it actually is right when you link in the math lib.

Thanks for the tip! Yes, when using -lm the Studio C compiler "cc" also compiles ceil.c fine:
$ cc ceil.c -lm

So, is there an error in the configure stage? Or do the checks in config.log.ceil not correctly detect the availability of the `ceil' function for the C compiler?

It looks to me like the lbat configure test is not linking in the math lib.

Yes, there is no -lm in the configure:84213 line. But note the checks for ceil again (config.log.ceil). As far as I understood these logs, the checks for ceil and for the need of -lm deliver wrong results:

configure:55000: checking if we need -lm for ceil
configure:55104: result: no
configure:55115: checking for ceil
configure:55115: result: yes

So configure assumes "ceil" is available to the "cc" compiler without the need for the -lm flag - and this is *wrong*, "cc" needs -lm. It seems to me to be a configure issue.

Greetings

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] --enable-progress-threads broken in 1.5.3?
Hi OpenMPI folks,

I've tried to install the /1.5.3 version with activated progress threads (just to try it out), in addition to --enable-mpi-threads. The installation was fine, I could also build binaries, but each mpiexec call hangs forever, silently. With the very same configuration options but without --enable-progress-threads, everything runs fine.

So I wonder whether --enable-progress-threads is broken, or maybe I did something wrong?

The configuration line was:
./configure --with-openib --with-lsf --with-devel-headers --enable-contrib-no-build=vt --enable-mpi-threads --enable-progress-threads --enable-heterogeneous --enable-cxx-exceptions --enable-orterun-prefix-by-default <>
where <> contains the prefix and some compiler-specific stuff.

All versions compiled (GCC, Intel, PGI, Sun Studio compilers, 32bit and 64bit) behave the very same way.

Best wishes,

Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] How to use a wrapper for ssh?
Hi OpenMPI folks,

Using version 1.4.3 of OpenMPI, I want to wrap the 'ssh' calls which are made from OpenMPI's 'mpiexec'. For this purpose, at least two ways seem possible to me:

1. let the wrapper have the name 'ssh' and put the path where it lives into the PATH envvar *before* the path to the real ssh
Q1: Would this work?

2. use the MCA parameters described in http://www.open-mpi.org/faq/?category=rsh#rsh-not-ssh to redirect the call to my wrapper, e.g.
export OMPI_MCA_plm_rsh_agent=WrapPer
export OMPI_MCA_orte_rsh_agent=WrapPer

The odd thing is that the OMPI_MCA_orte_rsh_agent envvar seems not to have any effect, whereas OMPI_MCA_plm_rsh_agent works. Why do I believe so? Because "strace -f mpiexec ..." still shows attempts to open 'ssh' if OMPI_MCA_orte_rsh_agent is set, and correctly tries to open the 'WrapPer' if OMPI_MCA_plm_rsh_agent is set.

Q2: Is the apparent non-functionality of OMPI_MCA_orte_rsh_agent a bug, or have I just misunderstood something?

Best wishes,

Paul

P.S. Reproducing: just set the envvars and do 'strace -f mpiexec ...'

Example:
export OMPI_MCA_plm_rsh_agent=WrapPer
---> looks for 'WrapPer';
stat64("/opt/lsf/8.0/linux2.6-glibc2.3-x86_64/bin/WrapPer", 0x8324) = -1 ENOENT (No such file or directory)

export OMPI_MCA_orte_rsh_agent=WrapPer (do not forget to unset OMPI_MCA_plm_rsh_agent :o)
---> still looking for 'ssh'
stat64("/opt/lsf/8.0/linux2.6-glibc2.3-x86_64/bin/ssh", 0x8324) = -1 ENOENT (No such file or directory)
===> OMPI_MCA_orte_rsh_agent does not work?!

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
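For Q1/Q2, a minimal wrapper along these lines is what I have in mind (file name, log path and ssh location are just placeholders):

#!/bin/sh
# WrapPer: log the arguments, then hand over to the real ssh
echo "$(date): ssh $*" >> /tmp/ssh-wrapper.log
exec /usr/bin/ssh "$@"

$ chmod +x $HOME/bin/WrapPer
$ export OMPI_MCA_plm_rsh_agent=$HOME/bin/WrapPer
$ mpiexec -np 4 --host host1,host2 ./a.out   # every remote launch now goes through the wrapper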
Re: [OMPI users] How to use a wrapper for ssh?
Hi Ralph, 2. use MCA parameters described in http://www.open-mpi.org/faq/?category=rsh#rsh-not-ssh to bend the call to my wrapper, e.g. export OMPI_MCA_plm_rsh_agent=WrapPer export OMPI_MCA_orte_rsh_agent=WrapPer the oddly thing is, that the OMPI_MCA_orte_rsh_agent envvar seem not to have any effect, whereas OMPI_MCA_plm_rsh_agent works. Why I believe so? orte_rsh_agent doesn't exist in the 1.4 series :-) Only plm_rsh_agent is available in 1.4. "ompi_info --param orte all" and "ompi_info --param plm rsh" will confirm that fact. If so, then the Wiki is not correct. Maybe someone can correct it? This would save some time for people like me... Best wishes Paul Kapinos -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
[OMPI users] Does Oracle Cluster Tools aka Sun's MPI work with LDAP?
Hi OpenMPI folks (and Oracle/Sun experts),

we have a problem with Sun's MPI (Cluster Tools 8.2.x) on a part of our cluster. In the part of the cluster where LDAP is activated, mpiexec does not try to spawn tasks on remote nodes at all, but exits with an error message like the one below. When running 'strace -f' on mpiexec, no exec of "ssh" can be found at all. Surprisingly, mpiexec tries to look into /etc/passwd (where the user is not listed, because LDAP is used!).

On the old part of the cluster, where NIS is used as the authentication method, Sun MPI runs just fine.

So, is Sun's MPI compatible with the LDAP authentication method at all?

Best wishes,

Paul

P.S. In both parts of the cluster, I (login marked as x here) can log in to any node by ssh without having to type a password.

--
The user (x) is unknown to the system (i.e. there is no corresponding entry in the password file). Please contact your system administrator for a fix.
--
[cluster-beta.rz.RWTH-Aachen.DE:31535] [[57885,0],0] ORTE_ERROR_LOG: Fatal in file plm_rsh_module.c at line 1058
--

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
Re: [OMPI users] Does Oracle Cluster Tools aka Sun's MPI work with LDAP?
Hi Terry, Reuti, good news: we've solved/workarounded the problem with CT/8.2.1c :o) the "fix" was easy: we used the 64bit version of the 'mpiexec' instead of [previously-used as default] 32bit version. The 64bit version version works now with both NIS and LDAP autentification modi. The32bit version works with the NIS-autentificated part of our cluster, only. Thanks for your help! Best wishes Paul Kapinos Reuti wrote: Hi, Am 15.07.2011 um 21:14 schrieb Terry Dontje: On 7/15/2011 1:46 PM, Paul Kapinos wrote: Hi OpenMPI volks (and Oracle/Sun experts), we have a problem with Sun's MPI (Cluster Tools 8.2.x) on a part of our cluster. In the part of the cluster where LDAP is activated, the mpiexec does not try to spawn tasks on remote nodes at all, but exits with an error message alike below. If 'strace -f' the mpiexec, no exec of "ssh" can be found at all. Wondering, mpiexec tries to look into /etc/passwd (where user is not in, because using LDAP!). Note this is an area that should be no different than from stock Open MPI. "should not" but it is :o) However, I compare CT/8.2.1c with self-compiled OpenMPI/1.4.3 which are far different releases. And they behave definitely in different way: in selv-compiled OpenMPI both 32bit and 64bit mpiexecs work with NIS and with LDAP, and the CT/8.2.1c mpiexec in 32bit does work with NIS only. I would suspect that the message might be coming from ssh. I wouldn't suspect mpiexec would be looking into /etc/passwd at all, why would it need to. the output you listed is titled "[unknown-user]". Maybe referring to the password file is a wrong simplification. The test is also on the master node of the parallel job by an usual `getpwuid`. The /etc/nsswitch.conf is fine an the `mpiexec` machine? On this node the user is known too? Can they login because they have no passphrase or because they have an agent running, or did you setup hostbased authentication? my user is known on each node and is allowed to log in (without password) from any to any node. In /etc/passwd there is no password for my user; all auth thins are done by NIS or LDAP. (sorry I cannot tell more because this is admin stuff, but as said: "ssh" works from any to any node without password). /etc/nsswitch.conf seem to be fine (it works now with the 64bit version of mpiexec :o) It should just be using ssh. Can you manually ssh to the same node? On the old part of the cluster, where NIS is used as the autentification method, Sun MPI runs very fine. So, is Suns MPI compatible with LDAP autotentification method at all? In as far as whatever launcher you use is compatible with LDAP. Best wishes, Paul P.S. in both parts if the cluster, me (login marked as x here) can login to any node by ssh without need to type the password. From the headnode of the cluster to a node or also between nodes? -- Reuti -- The user (x) is unknown to the system (i.e. there is no corresponding entry in the password file). Please contact your system administrator for a fix. -- [cluster-beta.rz.RWTH-Aachen.DE:31535] [[57885,0],0] ORTE_ERROR_LOG: Fatal in file plm_rsh_module.c at line 1058 ------ -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
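For anyone hitting the same symptom, two quick checks that should narrow it down (standard tools, nothing Open MPI specific):

$ getent passwd $USER        # does NSS/LDAP resolve the user on the node running mpiexec?
$ file $(which mpiexec)      # is the mpiexec binary 32-bit or 64-bit?

A 32-bit binary needs the 32-bit NSS/LDAP libraries installed in order to resolve users, which would explain why the 64-bit mpiexec works here while the 32-bit one does not.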
[OMPI users] Configure(?) problem building /1.5.3 on Scientific Linux 6.0
Dear OpenMPI folks,

currently I have a problem building version 1.5.3 of OpenMPI on Scientific Linux 6.0 systems, which seems to me to be a configuration problem. After the configure run (which seems to terminate without an error code), the "gmake all" stage produces errors and exits. Typical output is shown below.

Curiously, the 1.4.3 version can be built on the same computer with no special trouble. Both the 1.4.3 and 1.5.3 versions can be built on another computer running CentOS 5.6. In each case I build 16 versions in total (4 compilers * 32bit/64bit * support for multithreading ON/OFF). The same error arises in all 16 versions.

Can someone give a hint about how to avoid this issue? Thanks!

Best wishes,

Paul

Some logs and the configure script are downloadable here:
https://gigamove.rz.rwth-aachen.de/d/id/2jM6MEa2nveJJD
The configure line is in RUNME.sh, the logs of the configure and build stages are in the log_* files; I also attached the config.log file and configure itself (which is the standard one from the 1.5.3 release).

##
CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /tmp/pk224850/linuxc2_11254/openmpi-1.5.3mt_linux64_gcc/config/missing --run aclocal-1.11 -I config
sh: config/ompi_get_version.sh: No such file or directory
/usr/bin/m4: esyscmd subprocess failed
configure.ac:953: warning: OMPI_CONFIGURE_SETUP is m4_require'd but not m4_defun'd
config/ompi_mca.m4:37: OMPI_MCA is expanded from...
configure.ac:953: the top level
configure.ac:953: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
../../lib/autoconf/specific.m4:386: AC_USE_SYSTEM_EXTENSIONS is expanded from...
opal/mca/paffinity/hwloc/hwloc/config/hwloc.m4:152: HWLOC_SETUP_CORE_AFTER_C99 is expanded from...
../../lib/m4sugar/m4sh.m4:505: AS_IF is expanded from...
opal/mca/paffinity/hwloc/hwloc/config/hwloc.m4:22: HWLOC_SETUP_CORE is expanded from...
opal/mca/paffinity/hwloc/configure.m4:40: MCA_paffinity_hwloc_CONFIG is expanded from...
config/ompi_mca.m4:540: MCA_CONFIGURE_M4_CONFIG_COMPONENT is expanded from...
config/ompi_mca.m4:326: MCA_CONFIGURE_FRAMEWORK is expanded from...
config/ompi_mca.m4:247: MCA_CONFIGURE_PROJECT is expanded from...
configure.ac:953: warning: AC_RUN_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS
##

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] Usage of PGI compilers (Libtool or OpenMPI issue?)
Hi,

just found out: the --instantiation_dir, --one_instantiation_per_object, and --template_info_file flags are deprecated in the newer versions of the PGI compilers, cf.
http://www.pgroup.com/support/release_tprs_2010.htm

But, compiling OpenMPI /1.4.3 with the 11.7 PGI compilers, I see these warnings:
pgCC-Warning-prelink_objects switch is deprecated
pgCC-Warning-instantiation_dir switch is deprecated
coming from the call noted below. I do not know whether this is a Libtool issue or an issue in how libtool is used (i.e. an OpenMPI issue), but I did not want to keep it secret...

Best wishes

Paul Kapinos

libtool: link: pgCC --prelink_objects --instantiation_dir Template.dir .libs/mpicxx.o .libs/intercepts.o .libs/comm.o .libs/datatype.o .libs/win.o .libs/file.o -Wl,--rpath -Wl,/tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/ompi/.libs -Wl,--rpath -Wl,/tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/orte/.libs -Wl,--rpath -Wl,/tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/opal/.libs -Wl,--rpath -Wl,/opt/MPI/openmpi-1.4.3/linux/pgi/lib/lib32 -L/tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/orte/.libs -L/tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/opal/.libs -L/opt/lsf/8.0/linux2.6-glibc2.3-x86/lib ../../../ompi/.libs/libmpi.so /tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/orte/.libs/libopen-rte.so /tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux32_pgi/opal/.libs/libopen-pal.so -ldl -lnsl -lutil

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] and the next one (3rd today!) PGI+OpenMPI issue
... just doing almost the same thing: trying to install OpenMPI 1.4.3 using the 11.7 PGI compiler on Scientific Linux 6.0. The same place, but a different error message:

--
/usr/lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
gmake[2]: *** [libmpi_cxx.la] Error 2
gmake[2]: Leaving directory `/tmp/pk224850/linuxc2_11254/openmpi-1.4.3_linux64_pgi/ompi/mpi/cxx'
--

and then the compilation aborted. Configure string below. With the Intel, GCC and Studio compilers, the very same installations went through happily. Maybe someone can give me a hint about whether this is an issue with openmpi, pgi or something else...

Best wishes,

Paul

P.S. again, more logs downloadable:
https://gigamove.rz.rwth-aachen.de/d/id/WNk69nPr4w7svT

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
Re: [OMPI users] Configure(?) problem building /1.5.3 on Scientific Linux 6.0
Hi Ralph, Higher rev levels of the autotools are required for the 1.5 series - are you at the right ones? See http://www.open-mpi.org/svn/building.php Many thanks for the link. Short test, and it's out: autoconf version on our release is too old. We have 2.63 and needed ist 2.65. I will trigger our admins... Best wishes, Paul m4 (GNU M4) 1.4.13 (OK) autoconf (GNU Autoconf) 2.63 (Need: 2.65, NOK) automake (GNU automake) 1.11.1 (OK) ltmain.sh (GNU libtool) 2.2.6b (OK) On Jul 22, 2011, at 9:12 AM, Paul Kapinos wrote: Dear OpenMPI volks, currently I have a problem by building the version 1.5.3 of OpenMPI on Scientific Linux 6.0 systems, which seem vor me to be a configuration problem. After the configure run (which seem to terminate without error code), the "gmake all" stage produces errors and exits. Typical is the output below. Fancy: the 1.4.3 version on same computer can be build with no special trouble. Both the 1.4.3 and 1.5.3 versions can be build on other computer running CentOS 5.6. In each case I build 16 versions at all (4 compiler * 32bit/64bit * support for multithreading ON/OFF). The same error arise in all 16 versions. Can someone give a hint about how to avoid this issue? Thanks! Best wishes, Paul Some logs and configure are downloadable here: https://gigamove.rz.rwth-aachen.de/d/id/2jM6MEa2nveJJD The configure line is in RUNME.sh, the logs of configure and build stage in log_* files; I also attached the config.log file and the configure itself (which is the standard from the 1.5.3 release). ## CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /tmp/pk224850/linuxc2_11254/openmpi-1.5.3mt_linux64_gcc/config/missing --run aclocal-1.11 -I config sh: config/ompi_get_version.sh: No such file or directory /usr/bin/m4: esyscmd subprocess failed configure.ac:953: warning: OMPI_CONFIGURE_SETUP is m4_require'd but not m4_defun'd config/ompi_mca.m4:37: OMPI_MCA is expanded from... configure.ac:953: the top level configure.ac:953: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS ../../lib/autoconf/specific.m4:386: AC_USE_SYSTEM_EXTENSIONS is expanded from... opal/mca/paffinity/hwloc/hwloc/config/hwloc.m4:152: HWLOC_SETUP_CORE_AFTER_C99 is expanded from... ../../lib/m4sugar/m4sh.m4:505: AS_IF is expanded from... opal/mca/paffinity/hwloc/hwloc/config/hwloc.m4:22: HWLOC_SETUP_CORE is expanded from... opal/mca/paffinity/hwloc/configure.m4:40: MCA_paffinity_hwloc_CONFIG is expanded from... config/ompi_mca.m4:540: MCA_CONFIGURE_M4_CONFIG_COMPONENT is expanded from... config/ompi_mca.m4:326: MCA_CONFIGURE_FRAMEWORK is expanded from... config/ompi_mca.m4:247: MCA_CONFIGURE_PROJECT is expanded from... configure.ac:953: warning: AC_RUN_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
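For reference, the tool versions can be checked quickly on any build host (plain standard commands):

$ m4 --version | head -1
$ autoconf --version | head -1
$ automake --version | head -1
$ libtool --version | head -1

and compared against the minimum levels listed on http://www.open-mpi.org/svn/building.php (autoconf >= 2.65 for the 1.5 series, as found out above).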
[OMPI users] problems with Intel 12.x compilers and OpenMPI (1.4.3)
Hi Open MPI folks,

we see some quite strange effects with our installations of Open MPI 1.4.3 with the Intel 12.x compilers, which puzzle us: different programs reproducibly deadlock or die with errors like those listed below.

Some of the errors look like programming issues at first glance (well, a deadlock *is* usually a programming error), but we do not believe it is so: the errors arise in many well-tested codes including HPL (*), only with a specific compiler + Open MPI combination (Intel 12.x compilers + Open MPI 1.4.3), and only with particular (usually high) numbers of processes. For example, HPL reproducibly deadlocks with 72 processes and dies with error message #2 with 384 processes.

All these errors seem to be somehow related to MPI communicators; 1.4.4rc3, 1.5.3 and 1.5.4 seem not to have this problem. Also, 1.4.3 used together with the Intel 11.x compiler series seems to be unproblematic.

So probably this (from the 1.4.4 release notes):
- Fixed a segv in MPI_Comm_create when called with GROUP_EMPTY. Thanks to Dominik Goeddeke for finding this.
is also the fix for our issue? Or maybe not, because 1.5.3 is _older_ than this fix?

Since we worked around the problem by switching our production to 1.5.3, this issue is not a "burning" one; but I still decided to post it, because any issue in such fundamental things may be interesting for the developers.

Best wishes,

Paul Kapinos

(*) http://www.netlib.org/benchmark/hpl/

Fatal error in MPI_Comm_size: Invalid communicator, error stack:
MPI_Comm_size(111): MPI_Comm_size(comm=0x0, size=0x6f4a90) failed
MPI_Comm_size(69).: Invalid communicator

[linuxbdc05.rz.RWTH-Aachen.DE:23219] *** An error occurred in MPI_Comm_split
[linuxbdc05.rz.RWTH-Aachen.DE:23219] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[linuxbdc05.rz.RWTH-Aachen.DE:23219] *** MPI_ERR_IN_STATUS: error code in status
[linuxbdc05.rz.RWTH-Aachen.DE:23219] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

forrtl: severe (71): integer divide by zero
Image PC Routine Line Source
libmpi.so.0 2D9EDF52 Unknown Unknown Unknown
libmpi.so.0 2D9EE45D Unknown Unknown Unknown
libmpi.so.0 2D9C3375 Unknown Unknown Unknown
libmpi_f77.so.0 2D75C37A Unknown Unknown Unknown
vasp_mpi_gamma 0057E010 Unknown Unknown Unknown
vasp_mpi_gamma 0059F636 Unknown Unknown Unknown
vasp_mpi_gamma 00416C5A Unknown Unknown Unknown
vasp_mpi_gamma 00A62BEE Unknown Unknown Unknown
libc.so.6 003EEB61EC5D Unknown Unknown Unknown
vasp_mpi_gamma 00416A29 Unknown Unknown Unknown

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
[OMPI users] How are the Open MPI processes spawned?
Hello Open MPI volks, We use OpenMPI 1.5.3 on our pretty new 1800+ nodes InfiniBand cluster, and we have some strange hangups if starting OpenMPI processes. The nodes are named linuxbsc001,linuxbsc002,... (with some lacuna due of offline nodes). Each node is accessible from each other over SSH (without password), also MPI programs between any two nodes are checked to run. So long, I tried to start some bigger number of processes, one process per node: $ mpiexec -np NN --host linuxbsc001,linuxbsc002,... MPI_FastTest.exe Now the problem: there are some constellations of names in the host list on which mpiexec reproducible hangs forever; and more surprising: other *permutation* of the *same* node names may run without any errors! Example: the command in laueft.txt runs OK, the command in haengt.txt hangs. Note: the only difference is that the node linuxbsc025 is put on the end of the host list. Amazed, too? Looking on the particular nodes during the above mpiexec hangs, we found the orted daemons started on *each* node and the binary on all but one node (orted.txt, MPI_FastTest.txt). Again amazing that the node with no user process started (leading to hangup in MPI_Init of all processes and thus to hangup, I believe) was always the same, linuxbsc005, which is NOT the permuted item linuxbsc025... This behaviour is reproducible. The hang-on only occure if the started application is a MPI application ("hostname" does not hang). Any Idea what is gonna on? Best, Paul Kapinos P.S: no alias names used, all names are real ones -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 linuxbsc001: STDOUT: 24323 ?SLl0:00 MPI_FastTest.exe linuxbsc002: STDOUT: 2142 ?SLl0:00 MPI_FastTest.exe linuxbsc003: STDOUT: 69266 ?SLl0:00 MPI_FastTest.exe linuxbsc004: STDOUT: 58899 ?SLl0:00 MPI_FastTest.exe linuxbsc006: STDOUT: 68255 ?SLl0:00 MPI_FastTest.exe linuxbsc007: STDOUT: 62026 ?SLl0:00 MPI_FastTest.exe linuxbsc008: STDOUT: 54221 ?SLl0:00 MPI_FastTest.exe linuxbsc009: STDOUT: 55482 ?SLl0:00 MPI_FastTest.exe linuxbsc010: STDOUT: 59380 ?SLl0:00 MPI_FastTest.exe linuxbsc011: STDOUT: 58312 ?SLl0:00 MPI_FastTest.exe linuxbsc014: STDOUT: 56013 ?SLl0:00 MPI_FastTest.exe linuxbsc016: STDOUT: 58563 ?SLl0:00 MPI_FastTest.exe linuxbsc017: STDOUT: 54693 ?SLl0:00 MPI_FastTest.exe linuxbsc018: STDOUT: 54187 ?SLl0:00 MPI_FastTest.exe linuxbsc020: STDOUT: 55811 ?SLl0:00 MPI_FastTest.exe linuxbsc021: STDOUT: 54982 ?SLl0:00 MPI_FastTest.exe linuxbsc022: STDOUT: 50032 ?SLl0:00 MPI_FastTest.exe linuxbsc023: STDOUT: 54044 ?SLl0:00 MPI_FastTest.exe linuxbsc024: STDOUT: 51247 ?SLl0:00 MPI_FastTest.exe linuxbsc025: STDOUT: 18575 ?SLl0:00 MPI_FastTest.exe linuxbsc027: STDOUT: 48969 ?SLl0:00 MPI_FastTest.exe linuxbsc028: STDOUT: 52397 ?SLl0:00 MPI_FastTest.exe linuxbsc029: STDOUT: 52780 ?SLl0:00 MPI_FastTest.exe linuxbsc030: STDOUT: 47537 ?SLl0:00 MPI_FastTest.exe linuxbsc031: STDOUT: 54609 ?SLl0:00 MPI_FastTest.exe linuxbsc032: STDOUT: 52833 ?SLl0:00 MPI_FastTest.exe $ timex /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27 --host linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc025,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032 MPI_FastTest.exe $ timex 
/opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec -np 27 --host linuxbsc001,linuxbsc002,linuxbsc003,linuxbsc004,linuxbsc005,linuxbsc006,linuxbsc007,linuxbsc008,linuxbsc009,linuxbsc010,linuxbsc011,linuxbsc014,linuxbsc016,linuxbsc017,linuxbsc018,linuxbsc020,linuxbsc021,linuxbsc022,linuxbsc023,linuxbsc024,linuxbsc027,linuxbsc028,linuxbsc029,linuxbsc030,linuxbsc031,linuxbsc032,linuxbsc025 MPI_FastTest.exe linuxbsc001: STDOUT: 24322 ?Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh linuxbsc002: STDOUT: 2141 ?Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_vpid 2 -mca orte_ess_num_procs 28 --hnp-uri 751435776.0;tcp://134.61.194.2:33210 -mca plm rsh linuxbsc003: STDOUT: 69265 ?Ss 0:00 /opt/MPI/openmpi-1.5.3/linux/intel/bin/orted --daemonize -mca ess env -mca orte_ess_jobid 751435776 -mca orte_ess_v
Re: [OMPI users] How are the Open MPI processes spawned?
Hello Ralph, hello all.

> No real ideas, I'm afraid. We regularly launch much larger jobs than that using ssh without problem,

I was also able to run a 288-node job yesterday - the size alone is not the problem...

> so it is likely something about the local setup of that node that is causing the problem. Offhand, it sounds like either the mapper isn't getting things right, or for some reason the daemon on 005 isn't properly getting or processing the launch command. What you could try is adding --display-map to see if the map is being correctly generated. If that works, then (using a debug build) try adding --leave-session-attached and see if any daemons are outputting an error. You could add -mca odls_base_verbose 5 --leave-session-attached to your cmd line. You'll see debug output from each daemon as it receives and processes the launch command. See if the daemon on 005 is behaving differently than the others.

I've tried the options. The map seems to be built correctly, and the output of the daemons seems to be the same (see helloworld.txt).

> You should also try putting that long list of nodes in a hostfile - see if that makes a difference. It will process the nodes thru a different code path, so if there is some problem in --host, this will tell us.

No, with the hostfile instead of the host list on the command line the behaviour is the same. But I just found out that 1.4.3 does *not* hang on this constellation. The next thing I will try is the installation of 1.5.4 :o)

Best, Paul

P.S. started: $ /opt/MPI/openmpi-1.5.3/linux/intel/bin/mpiexec --hostfile hostfile-mini -mca odls_base_verbose 5 --leave-session-attached --display-map helloworld 2>&1 | tee helloworld.txt
Re: [OMPI users] How are the Open MPI processes spawned?
Hello Ralph, hello all,

Two news, as usual a good one and a bad one. The good: we believe we have found out *why* it hangs. The bad: it seems to me that this is a bug, or at least an undocumented feature, of Open MPI 1.5.x.

In detail: as said, we see mysterious hang-ups when starting on some nodes using some permutations of hostnames. Usually removing "some bad" nodes helps, sometimes a permutation of the node names in the hostfile is enough(!). The behaviour is reproducible.

The machines have at least 2 networks:
*eth0* is used for installation, monitoring, ... - this ethernet is very slim
*ib0* is the "IP over IB" interface and is used for everything else: the file systems, ssh and so on.

The hostnames are bound to the ib0 network; our idea was not to use eth0 for MPI at all. All machines are reachable from any other over ib0 (they are in one network). But on eth0 there are at least two different networks; in particular, the computer linuxbsc025 is in a different network than the others and is not reachable from the other nodes over eth0! (It is reachable over ib0; the name used in the hostfile resolves to the IP of ib0.)

So I believe that Open MPI 1.5.x tries to communicate over eth0, cannot do it, and hangs. 1.4.3 does not hang, so this issue is 1.5.x-specific (seen in 1.5.3 and 1.5.4). A bug?

I also tried to disable eth0 completely:
$ mpiexec -mca btl_tcp_if_exclude eth0,lo -mca btl_tcp_if_include ib0 ...
...but this does not help. Well, the above command should only disable the usage of eth0 for the MPI communication itself, but it hangs before MPI is even started, doesn't it? (Because one process is missing, MPI_Init cannot be passed.)

Now a question: is there a way to forbid mpiexec from using some interfaces at all?

Best wishes, Paul Kapinos

P.S. Of course we know about the good idea to bring all nodes into the same net on eth0, but at this point it is impossible due to technical reasons...
P.S.2 I'm not sure that the issue is really rooted in the above-mentioned misconfiguration of eth0, but I have no better idea at this point...

>> The map seems to be built correctly, and the output of the daemons seems to be the same (see helloworld.txt)
> Unfortunately, it appears that OMPI was not built with --enable-debug as there is no debug info in the output. Without a debug installation of OMPI, the ability to determine the problem is pretty limited.

Well, this will be the next option we will activate. We also have another issue here, on (not) using uDAPL...
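To make the above diagnosis concrete, here is a minimal sketch (not part of the original posting) of how one might verify from a "good" node that a host name resolves to the ib0 address and whether a peer is reachable over each interface. The node name is taken from the example above; the eth0 peer address is a placeholder.

#!/bin/sh
# Sketch only: check name resolution and per-interface reachability.
getent hosts linuxbsc025                          # which IP does the name resolve to? (the ib0 one here)
ping -c 2 -I ib0 linuxbsc025                      # path over IP-over-IB
ping -c 2 -I eth0 <eth0-address-of-linuxbsc025>   # path over the slim ethernet, if one exists at all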
Re: [OMPI users] How are the Open MPI processes spawned?
Hello Ralph, Terry, all!

Again two news: the good one and the second one.

Ralph Castain wrote:
> Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees. :-(

Yahhh!! This behaviour - catching a random interface and hanging forever if something is wrong with it - is somewhat less than perfect. From my perspective - the user's one - Open MPI should either try to use *all* available networks (as 1.4 does...), starting with the high-performance ones, or *only* those interfaces to which the hostnames from the hostfile are bound. Also, there should be timeouts (if you cannot connect to a node within a minute you probably will never ever be connected...). If some connection runs into a timeout, a warning would be great (and a hint to take the interface out via oob_tcp_if_exclude, btl_tcp_if_exclude). Should it not? Maybe you can file it as a "call for enhancement"...

> The solution is to specify -mca oob_tcp_if_include ib0. This will direct the run-time wireup across the IP over IB interface. You will also need the -mca btl_tcp_if_include ib0 as well so the MPI comm goes exclusively over that network.

YES! This works. Adding
-mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0
to the command line of mpiexec lets me run the 1.5.x programs, so I believe this is the workaround. Many thanks for this hint, Ralph! My fault for not finding it in the FAQ (I was so close :o)
http://www.open-mpi.org/faq/?category=tcp#tcp-selection

But then I ran into yet another issue. In http://www.open-mpi.org/faq/?category=tuning#setting-mca-params the way to define MCA parameters via environment variables is described. I tried it:
$ export OMPI_MCA_oob_tcp_if_include=ib0
$ export OMPI_MCA_btl_tcp_if_include=ib0
I checked it:
$ ompi_info --param all all | grep oob_tcp_if_include
MCA oob: parameter "oob_tcp_if_include" (current value: , data source: environment or cmdline)
$ ompi_info --param all all | grep btl_tcp_if_include
MCA btl: parameter "btl_tcp_if_include" (current value: , data source: environment or cmdline)

But then I got the hang-up issue again! ==> It seems mpiexec does not understand these environment variables and only honours the command-line options. This should not be so? (I also tried to advise it to provide the envvars by -x OMPI_MCA_oob_tcp_if_include -x OMPI_MCA_btl_tcp_if_include - nothing changed. Well, they are OMPI_ variables and should be provided in any case.)

Best wishes and many thanks for all, Paul Kapinos

> Specifying both include and exclude should generate an error as those are mutually exclusive options - I think this was also missed in early 1.5 releases and was recently patched. HTH Ralph

On Nov 23, 2011, at 12:14 PM, TERRY DONTJE wrote:
>> I also tried to disable the eth0 completely: $ mpiexec -mca btl_tcp_if_exclude eth0,lo -mca btl_tcp_if_include ib0 ...
> I believe if you give "-mca btl_tcp_if_include ib0" you do not need to specify the exclude parameter.
>> ...but this does not help. All right, the above command should disable the usage of eth0 for MPI communication itself, but it hangs just before the MPI is started, isn't it? (because one process lacks, the MPI_INIT cannot be passed)
> By "just before the MPI is started" do you mean while orte is launching the processes? I wonder if you need to specify "-mca oob_tcp_if_include ib0" also but I think that may depe
Re: [OMPI users] How are the Open MPI processes spawned?
Hello again,

Ralph Castain wrote:
>>> Yes, that would indeed break things. The 1.5 series isn't correctly checking connections across multiple interfaces until it finds one that works - it just uses the first one it sees. :-(
>> Yahhh!! This behaviour - catching a random interface and hanging forever if something is wrong with it - is somewhat less than perfect. From my perspective - the user's one - Open MPI should either try to use *all* available networks (as 1.4 does...), starting with the high-performance ones, or *only* those interfaces to which the hostnames from the hostfile are bound.
> It is indeed supposed to do the former - as I implied, this is a bug in the 1.5 series.

Thanks for the clarification. I was not sure whether this is a bug or a feature :-)

>> Also, there should be timeouts (if you cannot connect to a node within a minute you probably will never ever be connected...)
> We have debated about this for some time - there is a timeout mca param one can set, but we'll consider again making it default.

>> If some connection runs into a timeout a warning would be great (and a hint to take the interface out via oob_tcp_if_exclude, btl_tcp_if_exclude). Should it not? Maybe you can file it as a "call for enhancement"...
> Probably the right approach at this time.

Ahhh.. sorry, I did not understand what you mean. Did you file it, or someone else, or should I do it in some way? Or should it not be filed?

>> But then I ran into yet another issue. In http://www.open-mpi.org/faq/?category=tuning#setting-mca-params the way to define MCA parameters via environment variables is described. I tried it: $ export OMPI_MCA_oob_tcp_if_include=ib0 $ export OMPI_MCA_btl_tcp_if_include=ib0 I checked it: $ ompi_info --param all all | grep oob_tcp_if_include MCA oob: parameter "oob_tcp_if_include" (current value: , data source: environment or cmdline) $ ompi_info --param all all | grep btl_tcp_if_include MCA btl: parameter "btl_tcp_if_include" (current value: , data source: environment or cmdline) But then I got the hang-up issue again! ==> It seems mpiexec does not understand these environment variables and only honours the command-line options. This should not be so?
> No, that isn't what is happening. The problem lies in the behavior of rsh/ssh. This environment does not forward environmental variables. Because of limits on cmd line length, we don't automatically forward MCA params from the environment, but only from the cmd line. It is an annoying limitation, but one outside our control.

We know that "ssh does not forward environment variables". But in this case, are these parameters not also parameters of mpiexec itself? The crucial thing is that setting the parameters works over the command line but *does not work* the envvar way (as described in http://www.open-mpi.org/faq/?category=tuning#setting-mca-params). This looks like a bug to me!

> Put those envars in the default mca param file and the problem will be resolved.

You mean e.g. $prefix/etc/openmpi-mca-params.conf as described in 4. of http://www.open-mpi.org/faq/?category=tuning#setting-mca-params. Well, this is possible, but not flexible enough for us (because there are some machines which can only run if the parameters are *not* set - on those, ssh goes just over these eth0 devices). For now we use the command-line parameters and hope the envvar way will work someday.

>> (I also tried to advise to provide the envvars by -x OMPI_MCA_oob_tcp_if_include -x OMPI_MCA_btl_tcp_if_include - nothing changed.
> I'm surprised by that - they should be picked up and forwarded. Could be a bug.

Well, I also think this is a bug, but as said not in providing the values of the envvars, but in detecting these parameters at all. Or maybe both.

>> Well, they are OMPI_ variables and should be provided in any case.)
> No, they aren't - they are not treated differently than any other envar.

[after performing some RTFM...] At least the man page of mpiexec says that the OMPI_ environment variables are always provided and thus treated *differently* than other envvars:

$ man mpiexec
Exported Environment Variables
All environment variables that are named in the form OMPI_* will automatically be exported to new processes on the local and remote nodes.

So, does the man page lie, is this a removed feature, or something else?

Best wishes, Paul Kapinos
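For reference, a short sketch of the different ways to set the same MCA parameters that are being discussed in this thread; with the 1.5.x rsh/ssh launcher only the command-line form reliably reached the remote orteds, which is exactly the problem described above. Paths and the application name are placeholders.

#!/bin/sh
# 1) on the mpiexec command line (the form that works in this thread)
mpiexec -mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0 -np 4 ./a.out
# 2) as environment variables (documented, but not forwarded to the remote daemons here)
export OMPI_MCA_oob_tcp_if_include=ib0
export OMPI_MCA_btl_tcp_if_include=ib0
# 3) in a per-user (or system-wide) parameter file
cat >> $HOME/.openmpi/mca-params.conf <<EOF
oob_tcp_if_include = ib0
btl_tcp_if_include = ib0
EOF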
[OMPI users] Open MPI and DAPL 2.0.34 are incompatible?
Dear Open MPI developers,

OFED 1.5.4 will contain DAPL 2.0.34. I tried to compile the newest release of Open MPI (1.5.4) with this DAPL release and was not successful. Configuring with --with-udapl=/path/to/2.0.34/dapl got the error "/path/to/2.0.34/dapl/include/dat/udat.h not found". Looking into the include dir: there is no 'dat' subdir but a 'dat2'. Just for fun I also tried to rename 'dat2' back to 'dat' (dirty hack, I know :-) - the configure stage was then successful but the compilation failed. The headers seem to have really changed, not just moved.

The question: are the Open MPI developers aware of these changes, and when will a version of Open MPI be available with support for DAPL 2.0.34?

(Background: we have some trouble with Intel MPI and the current DAPL which we do not have with DAPL 2.0.34, so our dream is to update as soon as possible.)

Best wishes and a nice weekend, Paul

http://www.openfabrics.org/downloads/OFED/release_notes/OFED_1.5.4_release_notes
Re: [OMPI users] How are the Open MPI processes spawned?
Hello Jeff, Ralph, all!

> Meaning that per my output from above, what Paul was trying should have worked, no? I.e., setenv'ing OMPI_, and those env vars should magically show up in the launched process.
> In the -launched process- yes. However, his problem was that they do not show up for the -orteds-, and thus the orteds don't wireup correctly.

Sorry for the latency, too many issues in too many areas needing improvement :-/

Well, just to clarify the long story about what I have seen:
1. Got a strange start-up problem (based on a bogus configuration of eth0 plus a bug in 1.5.x known to you experts :o)
2. Got a workaround for (1.) by setting '-mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0' on the command line of mpiexec => WORKS! Many thanks, guys!
3. Remembered that any MCA parameter can also be defined via OMPI_MCA_... envvars, tried to set them => does NOT work, the hang-ups were there again. Checked with ompi_info how the MCA parameters are set - all clear, but it doesn't work. My blind guess was that mpiexec does not understand these envvars in this case. See also http://www.open-mpi.org/community/lists/users/2011/11/17823.php

Thus this issue is not about forwarding some or any OMPI_* envvars to the _processes_, but about a step _before_ that (the processes were not started correctly at all in my problem case), as Ralph wrote. The difference in behaviour between setting the parameters on the command line and via OMPI_* envvars matters!

Ralph Castain wrote:
>> Did you file it, or someone else, or should I do it in some way?
> I'll take care of it, and copy you on the ticket so you can see what happens. I'll also do the same for the connection bug - sorry for the problem :-(

Ralph, many thanks for this!

Best wishes and a nice evening/day/whatever time you have!
Paul Kapinos
Re: [OMPI users] wiki and "man mpirun" odds, and a question
Hello,

> I don't see what you're referring to. I see:
> • -x : The name of an environment variable to export to the parallel application. The -x option can be specified multiple times to export multiple environment variables to the parallel application.
> (ok, I might have just changed it :-) )

Nice joke! :o)

> Queuing systems can forward the submitter's environment if desired. For example, in SGE, the -V switch forwards all the environment variables to the job's environment, so if there's one you can use to launch your job, you might want to check its documentation.

This is known and not an option for us. There are too many variables in the interactive environment which should not be forwarded... What I asked for is something which could replace

mpiexec -x FOO -x BAR -x FOBA -x BAFO -x RR -x ZZ ..

(which is quite tedious to type and error-prone for the users) by setting some dreamlike value, e.g.

export OMPI_PROVIDE_THIS_VARIABLES="FOO BAR FOBA BAFO RR ZZ"

At least some envvar whose content would simply be added to the command line could help:

export OMPI_ADD_2_COMMLINE="-x FOO -x BAR -x FOBA -x BAFO -x RR -x ZZ"

Well, these are my users' dreams; but maybe this gives some inspiration to the Open MPI programmers. As said, the situation where a [long] list of envvars has to be provided is quite common, and typing everything on the command line is tedious and error-prone.

Best wishes [and sorry for the noise], Paul Kapinos
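Until something like the dreamed-of variable exists, a user-side workaround is a small wrapper that expands a list of variable names into repeated -x options. This is purely illustrative: the variable name MY_EXPORT_VARS and the wrapper itself are made up here, not an Open MPI feature.

#!/bin/sh
# hypothetical usage: MY_EXPORT_VARS="FOO BAR FOBA BAFO RR ZZ" mpiexec_x -np 4 ./a.out
XOPTS=""
for v in $MY_EXPORT_VARS; do
    XOPTS="$XOPTS -x $v"
done
exec mpiexec $XOPTS "$@"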
Re: [OMPI users] Open MPI and DAPL 2.0.34 are incompatible?
Good morning,

> We've never recommended the use of dapl on Linux. I think it might have worked at one time, but I don't think anyone bothered to maintain it. On Linux, you should probably use native verbs support, instead.

Well, we have been using 'Open MPI + openib' for some years now (we started with Sun's ClusterTools and Open MPI 1.2.x, now we have self-built 1.4.x and 1.5.x Open MPI). The problem is that on our new, big, sexy cluster (some 1700 nodes connected to a common QDR InfiniBand fabric), running MPI over DAPL seems to be quite a bit faster than running over native IB. Yes, it is puzzling. But reproducible:

Intel MPI (over DAPL) => 100%
Open MPI (over openib) => 90% on some 4/5 of the machines (Westmere dual-socket)
Open MPI (over openib) => 45% on some 1/5 of the machines (Nehalem quad-socket)
Intel MPI (over ofa) ==> the same values as Open MPI!

(Bandwidth in a PingPong test, e.g. the Intel MPI benchmark, and two other PingPongs.)

The question why native IB is slower than DAPL is a very good one (do you have any ideas?). As said, it is reproducible: switching from dapl to ofa in Intel MPI also switches the PingPong performance. (You may say "your test is wrong", but we tried three different PingPong tests, producing very similar values.)

The second question is how to get Open MPI to use DAPL. Meanwhile, I compiled lots of versions (1.4.3, 1.4.4, 1.5.3, 1.5.4) using at least two DAPL versions and the --with-udapl option. The versions build fine, but on every start the initialisation of DAPL fails (see the message below) and the communication runs as usual over openib. Although the error message says this "may be an invalid entry in the uDAPL Registry ... in the dat.conf file", that seems very unlikely: with the same dat.conf, Intel MPI can use DAPL. (And yes, Open MPI really uses the same dat.conf as Intel MPI, set via DAT_OVERRIDE - checked and double-checked.)

--
WARNING: Failed to open "ofa-v2-mlx4_0-1u" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file.
--

Because of the anticipated performance gain we would be very keen on using DAPL with Open MPI. Does somebody have any idea what could be wrong and what to check?
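A rough sketch of how such a test could be run, assuming an Open MPI build that was configured --with-udapl: point the DAT registry at the same dat.conf that Intel MPI uses and restrict the BTL list, so a uDAPL failure is not silently hidden by a fallback to openib. The paths, node names and the benchmark name are placeholders.

#!/bin/sh
export DAT_OVERRIDE=/path/to/dat.conf      # same registry file as used with Intel MPI
mpiexec -mca btl self,sm,udapl \
        -mca btl_base_verbose 30 \
        -np 2 -H node1,node2 ./pingpong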
Re: [OMPI users] Cofigure(?) problem building /1.5.3 on ScientificLinux6.0
Hello Gus, Ralph, Jeff a very late answer for this - just found it in my mailbox. Would "cp -rp" help? (To preserve time stamps, instead of "cp -r".) Yes, the root of the evil were the time stamps. 'cp -a' is the magic wand. Many thanks for your help, and I should wear sackcloth and ashes... :-/ Best, Paul Anyway, since 1.2.8 here I build 5, sometimes more versions, all from the same tarball, but on separate build directories, as Jeff suggests. [VPATH] Works for me. My two cents. Gus Correa Jeff Squyres wrote: Ah -- Ralph pointed out the relevant line to me in your first mail that I initially missed: In each case I build 16 versions at all (4 compiler * 32bit/64bit * support for multithreading ON/OFF). The same error arise in all 16 versions. Perhaps you should just expand the tarball once and then do VPATH builds...? Something like this: tar xf openmpi-1.5.3.tar.bz2 cd openmpi-1.5.3 mkdir build-gcc cd build-gcc ../configure blah.. make -j 4 make install cd .. mkdir build-icc ../configure CC=icc CXX=icpc FC=ifort F77=ifort ..blah. make -j 4 make install cd .. etc. This allows you to have one set of source and have N different builds from it. Open MPI uses the GNU Autotools correctly to support this kind of build pattern. On Jul 22, 2011, at 2:37 PM, Jeff Squyres wrote: Your RUNME script is a *very* strange way to build Open MPI. It starts with a massive copy: cp -r /home/pk224850/OpenMPI/openmpi-1.5.3/AUTHORS /home/pk224850/OpenMPI/openmpi-1.5.3/CMakeLists.txt <...much snipped...> . Why are you doing this kind of copy? I suspect that the GNU autotools' timestamps are getting all out of whack when you do this kind of copy, and therefore when you run "configure", it tries to re-autogen itself. To be clear: when you expand OMPI from a tarball, you shouldn't need the GNU Autotools installed at all -- the tarball is pre-bootstrapped exactly to avoid you needing to use the Autotools (much less any specific version of the Autotools). I suspect that if you do this: - tar xf openmpi-1.5.3.tar.bz2 cd openmpi-1.5.3 ./configure etc. - everything will work just fine. On Jul 22, 2011, at 11:12 AM, Paul Kapinos wrote: Dear OpenMPI volks, currently I have a problem by building the version 1.5.3 of OpenMPI on Scientific Linux 6.0 systems, which seem vor me to be a configuration problem. After the configure run (which seem to terminate without error code), the "gmake all" stage produces errors and exits. Typical is the output below. Fancy: the 1.4.3 version on same computer can be build with no special trouble. Both the 1.4.3 and 1.5.3 versions can be build on other computer running CentOS 5.6. In each case I build 16 versions at all (4 compiler * 32bit/64bit * support for multithreading ON/OFF). The same error arise in all 16 versions. Can someone give a hint about how to avoid this issue? Thanks! Best wishes, Paul Some logs and configure are downloadable here: https://gigamove.rz.rwth-aachen.de/d/id/2jM6MEa2nveJJD The configure line is in RUNME.sh, the logs of configure and build stage in log_* files; I also attached the config.log file and the configure itself (which is the standard from the 1.5.3 release). ## CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/sh /tmp/pk224850/linuxc2_11254/openmpi-1.5.3mt_linux64_gcc/config/missing --run aclocal-1.11 -I config sh: config/ompi_get_version.sh: No such file or directory /usr/bin/m4: esyscmd subprocess failed configure.ac:953: warning: OMPI_CONFIGURE_SETUP is m4_require'd but not m4_defun'd config/ompi_mca.m4:37: OMPI_MCA is expanded from... 
configure.ac:953: the top level configure.ac:953: warning: AC_COMPILE_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS ../../lib/autoconf/specific.m4:386: AC_USE_SYSTEM_EXTENSIONS is expanded from... opal/mca/paffinity/hwloc/hwloc/config/hwloc.m4:152: HWLOC_SETUP_CORE_AFTER_C99 is expanded from... ../../lib/m4sugar/m4sh.m4:505: AS_IF is expanded from... opal/mca/paffinity/hwloc/hwloc/config/hwloc.m4:22: HWLOC_SETUP_CORE is expanded from... opal/mca/paffinity/hwloc/configure.m4:40: MCA_paffinity_hwloc_CONFIG is expanded from... config/ompi_mca.m4:540: MCA_CONFIGURE_M4_CONFIG_COMPONENT is expanded from... config/ompi_mca.m4:326: MCA_CONFIGURE_FRAMEWORK is expanded from... config/ompi_mca.m4:247: MCA_CONFIGURE_PROJECT is expanded from... configure.ac:953: warning: AC_RUN_IFELSE was called before AC_USE_SYSTEM_EXTENSIONS -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres jsquy...@c
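The takeaway of this thread, as a two-line sketch: either preserve the timestamps when copying an unpacked tarball (so configure does not think it has to re-run the Autotools), or skip the copy entirely and do VPATH builds as Jeff describes. The directory names below are examples.

#!/bin/sh
cp -a openmpi-1.5.3 /scratch/build-area/   # -a preserves the time stamps, unlike plain cp -r
# or, without copying at all:
cd openmpi-1.5.3 && mkdir build-intel && cd build-intel && ../configure CC=icc CXX=icpc FC=ifort F77=ifort && make -j 4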
Re: [OMPI users] SIGV at MPI_Cart_sub
A blind guess: did you use the Intel compiler? If so, there is/was a bug leading to a SIGSEGV _in Open MPI itself_:
http://www.open-mpi.org/community/lists/users/2012/01/18091.php

If the SIGSEGV arises not in Open MPI but in the application itself, it may be a programming issue. In any case, a more precise answer is impossible without seeing a code snippet and/or logs.

Best, Paul

Anas Al-Trad wrote:
> Dear people, in my application I have a segmentation fault (integer divide-by-zero) when calling the MPI_Cart_sub routine. My program is as follows: I have 128 ranks, I make a new communicator of the first 96 ranks via MPI_Comm_create. Then I create a grid of 8x12 by calling MPI_Cart_create. After creating the grid, if I call MPI_Cart_sub then I get that error. This error also happens when I use a communicator of 24 ranks and create a grid of 4x6. Can you please help me in solving this? Regards, Anas
[OMPI users] rankfiles on really big nodes broken?
Hello, Open MPI developers!

Now we have a really nice toy: 2 TB RAM, 16 sockets, 128 cores (4x smaller Bull S6010 boards coupled by BCS chips into a single-image machine). On such a big box, process pinning is vital. So we tried to use the Open MPI capabilities to pin the processes.

But it seems that the rankfile infrastructure does not work properly: we always get an "Error: Invalid argument" message on the 128-core node, even if the rankfile is OK. On a smaller node (up to 32 cores / 64 threads) the very same rankfile (with the node name changed, of course) works well. I believe this computer dimension is a bit too big for the pinning infrastructure right now. A bug?

Best wishes, Paul Kapinos

P.S. see the attached rankfiles128.tgz for some logs

-- Rankfiles: Rankfiles provide a means for specifying detailed information about how process ranks should be mapped to nodes and how they should be bound. Consider the following: --

Open RTE: 1.5.3
Open RTE SVN revision: r24532
Open RTE release date: Mar 16, 2011
OPAL: 1.5.3
OPAL SVN revision: r24532
OPAL release date: Mar 16, 2011
Ident string: 1.5.3
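For readers who have not used rankfiles before, a small sketch of the mechanism being tested here, following the rankfile syntax of the mpirun man page quoted above; the host name and the socket:core pairs are examples only, not the contents of the attached rankfiles.

#!/bin/sh
cat > myrankfile <<EOF
rank 0=bigsmp001 slot=0:0
rank 1=bigsmp001 slot=1:0
rank 2=bigsmp001 slot=2:0
rank 3=bigsmp001 slot=3:0
EOF
mpiexec -np 4 -rf myrankfile ./a.out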
Re: [OMPI users] rankfiles on really big nodes broken?
Hello Ralph,

Yes, the rankfiles in rankfiles128.tgz are the rankfiles which were used, and the linuxbsc*.txt files contain the output produced. It would surprise me if rankfile3 were incorrect - the very same files (except for the node name, of course), rankfile1 and rankfile2, worked on smaller machines, cf. runme.sh, the rankfile* files and the output files. The behaviour "it works on a small box but does not work on the thick box" was the source of my assumption that there is an error somewhere...

For the complete error message on the thick node see the linuxbsc269.txt file.

Updating to a newer 1.5.x is a good idea, but it is always a bit tedious... Will 1.5.5 arrive soon?

Best wishes, Paul Kapinos

Ralph Castain wrote:
> I don't see anything in the code that limits the number of procs in a rankfile. Are the attached rankfiles the ones you are trying to use? I'm wondering if there is a syntax error that is causing the problem. It would help if you could provide the complete error message output. At one time, there was a limit on the number of procs on a node - nothing to do with rankfile. That was fixed, though, and there is no real limit any more. I don't recall the precise release number where it changed in the 1.5 series - you might try updating to 1.5.4 as I'm sure it doesn't exist there.
Re: [OMPI users] Mpirun: How to print STDOUT of just one process?
Try out the attached wrapper:
$ mpiexec -np 2 masterstdout ...

> mpirun -n 2 ...
> Is there a way to have mpirun just merge STDOUT of one process to its STDOUT stream?

#!/bin/sh
# masterstdout: run the given command; only the process with
# OMPI_COMM_WORLD_RANK 0 keeps its stdout/stderr, the output
# of all other ranks is discarded.
if [ "$OMPI_COMM_WORLD_RANK" = "0" ]; then
  exec "$@"
else
  exec "$@" 1>/dev/null 2>/dev/null
fi
[OMPI users] Environment variables [documentation]
Dear Open MPI developers,

here: http://www.open-mpi.org/faq/?category=running#mpi-environmental-variables four envvars are listed which Open MPI sets for every process. We use them for some scripting and thank you for providing them. But a simple "mpiexec -np 1 env | grep OMPI" shows lots more envvars. These are interesting for us:

1) OMPI_COMM_WORLD_LOCAL_SIZE - seems to contain the number of processes of this job running on the specific node, see also http://www.open-mpi.org/community/lists/users/2008/07/6054.php
Is this envvar as "stable" as OMPI_COMM_WORLD_LOCAL_RANK is? (This would make sense, as it looks like the counterpart of the OMPI_COMM_WORLD_SIZE / OMPI_COMM_WORLD_RANK pair.) If yes, maybe it should also be documented in that FAQ page.

2) OMPI_COMM_WORLD_NODE_RANK - is that just a duplicate of OMPI_COMM_WORLD_LOCAL_RANK?

Best wishes, Paul Kapinos
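As an illustration of the kind of scripting meant above, a minimal wrapper using these variables; picking a per-node device by the local rank is just an example use and the variable MY_DEVICE_ID is made up, not something from the FAQ.

#!/bin/sh
echo "rank $OMPI_COMM_WORLD_RANK/$OMPI_COMM_WORLD_SIZE, local rank $OMPI_COMM_WORLD_LOCAL_RANK/$OMPI_COMM_WORLD_LOCAL_SIZE on $(hostname)"
export MY_DEVICE_ID=$OMPI_COMM_WORLD_LOCAL_RANK   # hypothetical per-node resource selection
exec "$@"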
[OMPI users] Problem running over IB with huge data set
Hello Jeff, Ralph, all Open MPI folks,

We had an off-list discussion about an error in the Serpent program. Ralph said:
> We already have several tickets for that problem, each relating to a different scenario:
> https://svn.open-mpi.org/trac/ompi/ticket/2155
> https://svn.open-mpi.org/trac/ompi/ticket/2157
> https://svn.open-mpi.org/trac/ompi/ticket/2295
I've built a quite small reproducer for the original issue (with a huge memory footprint) and have sent it to you.

The other week, another user got problems when using huge data sets. A program which runs without any problem with smaller data sets (on the order of 24 GB of data in total and smaller) gets into trouble with huge data sets (on the order of 100 GB of data in total and more), _if running over InfiniBand or IPoIB_. The program essentially hangs, mostly blocking the transport used; in some scenarios it crashes. The same program and data set run fine over ethernet or shared memory (yes, we have computers with hundreds of GB of memory). The behaviour is reproducible.

Diverse errors are produced, some of them listed below. Another thing is that in most cases, if the program hangs, it also blocks the transport, i.e. other programs cannot run over the same interface (just as reported earlier). More fun: we also found some '#procs x #nodes' combinations where the program runs fine. E.g., 30 and 60 processes over 6 nodes run through fine; 6 procs over 6 nodes get killed with an error message (see below); 12, 18, 36, 61, 62, 64, 66 procs over 6 nodes hang and block the interface.

Well, we cannot guarantee that this isn't a bug in the program itself, because it is just in development now. However, since the program works well for smaller-sized data sets and over TCP and over shared memory, it smells like an MPI library error, thus this mail. Or could the puzzling behaviour be a consequence of some bug in the program itself? If yes, what could it be and how could we try to find it?

I did not attach a reproducer to this mail because the user does not want to spread the code all over the world, but I can send it to you if you are interested in reproducing it. [The code is about the transposition of huge matrices and essentially calls MPI_Alltoallv; it is written in 'nice, well-structured' C++ (nothing stays unwrapped) but is pretty small and readable.]

Ralph, Jeff, anybody - any interest in reproducing this issue?

Best wishes, Paul Kapinos

P.S.
Open MPI 1.5.3 used - still waiting for 1.5.5 ;-) Some error messages: with 6 procs over 6 Nodes: -- mlx4: local QP operation err (QPN 7c0063, WQE index 0, vendor syndrome 6f, opcode = 5e) [[8771,1],5][btl_openib_component.c:3316:handle_wc] from linuxbdc07.rz.RWTH-Aachen.DE to: linuxbdc04 error polling LP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 6afb70 opcode 0 vendor error 111 qp_idx 3 mlx4: local QP operation err (QPN 18005f, WQE index 0, vendor syndrome 6f, opcode = 5e) [[8771,1],2][btl_openib_component.c:3316:handle_wc] from linuxbdc03.rz.RWTH-Aachen.DE to: linuxbdc02 error polling LP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 6afb70 opcode 0 vendor error 111 qp_idx 3 [[8771,1],1][btl_openib_component.c:3316:handle_wc] from linuxbdc02.rz.RWTH-Aachen.DE to: linuxbdc01 error polling LP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 6afb70 opcode 0 vendor error 111 qp_idx 3 mlx4: local QP operation err (QPN 340057, WQE index 0, vendor syndrome 6f, opcode = 5e) -- with 61 processes using IPoIB: mpiexec -mca btl ^openib -np 61 -host 1,2,3,4,5,6 a.out < dim100G.in -- [linuxbdc02.rz.RWTH-Aachen.DE][[21403,1],1][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 134.61.208.202 failed: Connection timed out (110) [linuxbdc01.rz.RWTH-Aachen.DE][[21403,1],18][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 134.61.208.203 failed: Connection timed out (110) [linuxbdc01.rz.RWTH-Aachen.DE][[21403,1],18][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 134.61.208.203 failed: Connection timed out (110) -- -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] Hybrid OpenMPI / OpenMP programming
port KMP_BLOCKTIME=0 ... The latter finally leads to an interesting reduction of the computing time but worsens the second problem we have to face (see below).

b) We managed to get a "correct" (?) placement of our MPI processes on our sockets by using: mpirun -bind-to-socket -bysocket -np 4n
However, while the OpenMP threads initially seem to be scattered over each socket (one thread per core), they slowly migrate to the same core as their 'master MPI process', or gather on one or two cores per socket. We played around with the environment variable KMP_AFFINITY, but the best we could obtain was a pinning of the OpenMP threads to their own core... disorganizing at the same time the arrangement of the 4n level-2 MPI processes. In addition, neither the specification of a rankfile nor the mpirun option -x IPATH_NO_CPUAFFINITY=1 seems to change the situation significantly.

This behaviour looks rather inefficient, but so far we did not manage to prevent the migration of the 4 threads to at most a couple of cores! Is there something wrong in our "hybrid" implementation? Do you have any advice?

Thanks for your help, Francis
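For context, a sketch of a hybrid launch along the lines discussed in this thread (Open MPI 1.4/1.5-era options); the process and thread counts and the binary name are examples, and whether the Intel OpenMP runtime's KMP_AFFINITY cooperates with the MPI-side binding is exactly the open question here.

#!/bin/sh
export OMP_NUM_THREADS=4
mpiexec -np 8 -bysocket -bind-to-socket \
        -x OMP_NUM_THREADS -x KMP_AFFINITY=verbose,compact \
        ./hybrid_app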
[OMPI users] Still bothered / cannot run an application
(cross-post to the 'users' and 'devel' mailing lists)

Dear Open MPI developers,

a long time ago, I reported an error in Open MPI:
http://www.open-mpi.org/community/lists/users/2012/02/18565.php

Well, in 1.6 the behaviour has changed: the test case does not hang forever and block an InfiniBand interface, but seems to run through, and now this error message is printed:

--
The OpenFabrics (openib) BTL failed to register memory in the driver. Please check /var/log/messages or dmesg for driver specific failure reason. The failure occured here:
Local host:
Device: mlx4_0
Function: openib_reg_mr()
Errno says: Cannot allocate memory
You may need to consult with your system administrator to get this problem fixed.
--

Looking into the FAQ http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages gives us no hint about what is bad. The locked memory is unlimited:

--
pk224850@linuxbdc02:~[502]$ cat /etc/security/limits.conf | grep memlock
#- memlock - max locked-in-memory address space (KB)
* hard memlock unlimited
* soft memlock unlimited
--

Could it still be an Open MPI issue? Are you interested in reproducing this?

Best, Paul Kapinos

P.S.: The same test with Intel MPI cannot run using DAPL, but runs very fine over 'ofa' (= native verbs, as Open MPI uses it). So I believe the problem is rooted in the communication pattern of the program; it sends very LARGE messages to a lot of / all other processes. (The program performs a matrix transposition of a distributed matrix.)
Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue
Jack,

note that support for THREAD_MULTIPLE is available in [newer] versions of Open MPI, but disabled by default. You have to enable it at configure time; in 1.6:
--enable-mpi-thread-multiple    Enable MPI_THREAD_MULTIPLE support (default: disabled)
You may check the available threading support level by using the attached program.

On 07/20/12 19:33, Jack Galloway wrote:
> This is an old thread, and I'm curious if there is support now for this? I have a large code that I'm running, a hybrid MPI/OpenMP code, that is having trouble over our infiniband network. I'm running a fairly large problem (uses about 18GB), and part way in, I get the following errors:

You say "big footprint"? I hear a bell ringing...
http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem

PROGRAM tthr
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER REQUIRED, PROVIDED, IERROR
REQUIRED = MPI_THREAD_MULTIPLE
PROVIDED = -1
CALL MPI_INIT_THREAD(REQUIRED, PROVIDED, IERROR)
WRITE (*,*) MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, &
            MPI_THREAD_SERIALIZED, MPI_THREAD_MULTIPLE
WRITE (*,*) REQUIRED, PROVIDED, IERROR
CALL MPI_FINALIZE(IERROR)
END
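Regarding the registered-memory FAQ entry linked above, a sketch of the quick check for ConnectX (mlx4) HCAs: the registerable memory is roughly (2^log_num_mtt) * (2^log_mtts_per_seg) * page_size and should cover the physical RAM. The example option values in the comment are illustrative, not a recommendation for a specific machine.

#!/bin/sh
cat /sys/module/mlx4_core/parameters/log_num_mtt
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
getconf PAGE_SIZE
# raising the limit needs root and a driver reload, e.g. via
#   options mlx4_core log_num_mtt=24 log_mtts_per_seg=3
# in /etc/modprobe.d/mlx4_core.conf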
Re: [OMPI users] Infiniband performance Problem and stalling
Randolph,

after reading this:

On 08/28/12 04:26, Randolph Pullen wrote:
> - On occasions it seems to stall indefinitely, waiting on a single receive. ...

I would make a blind guess: are you aware of the IB card parameters for registered memory?
http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem
"Waiting forever" for a single operation is one of the symptoms of this problem, especially in 1.5.3.

Best, Paul

P.S. The lower performance with 'big' chunks is a known phenomenon, cf. http://www.scl.ameslab.gov/netpipe/ (image at the bottom of the page). But a chunk size of 64k is fairly small.
Re: [OMPI users] OMPI 1.6.x Hang on khugepaged 100% CPU time
Yevgeny,

we at RZ Aachen also see problems very similar to those described in the initial posting of Yong Qin, with VASP and Open MPI 1.5.3. We're currently looking for a data set able to reproduce this. I'll write an email if we catch one.

Best, Paul

On 09/05/12 13:52, Yevgeny Kliteynik wrote:
> I'm checking it with OFED folks, but I doubt that there are some dedicated tests for THP. So do you see it only with a specific application and only on a specific data set? Wonder if I can somehow reproduce it in-house...
[OMPI users] too much stack size: _silently_ failback to IPoIB
Dear Open MPI developers,

there are often problems with the user limit for the stack size (ulimit -s) on Linux when running Fortran and/or OpenMP (= hybrid) programs. In one case we have seen, the user had by accident set the stack size in his environment far too high - to about one terabyte (on nodes with less than 100 GB of RAM).

It turned out that Open MPI (1.6.1) cannot use InfiniBand in this environment (cannot activate the IB card / register memory / something else, because of a lack of virtual memory - all memory reserved for the virtual stack?). The job seems to fall back and run over IPoIB, judging by the achieved bandwidth.

The problem was that not a single word of caution was printed out, whereas Open MPI usually warns the user if a seemingly available high-performance network cannot be used, AFAIK. Thus the user's problem - a 15x bandwidth and performance loss - was hidden for many weeks and found only by chance.

So, what is going wrong [if anything]?

Reproducing: try to set 'ulimit -s' in your environment to an astronomic value, or use the attached wrapper.

$MPI_ROOT/bin/mpiexec -mca oob_tcp_if_include ib0 -mca btl_tcp_if_include ib0 -np 2 -H linuxbdc01,linuxbdc02 /home/pk224850/bin/ulimit_high.sh MPI_FastTest.exe
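The attached ulimit_high.sh is not reproduced here; an illustrative stand-in for the idea would be a two-line wrapper that raises the stack limit absurdly high for the wrapped program only and then runs it, which should be enough to trigger the silent IPoIB fallback described above.

#!/bin/sh
ulimit -s 1000000000      # ~1 TB expressed in kB, far more than the node's RAM
exec "$@"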
Re: [OMPI users] Performance/stability impact of thread support
At least be aware that the usage of InfiniBand is silently disabled if the 'multiple' threading level is activated:
http://www.open-mpi.org/community/lists/devel/2012/10/11584.php

On 10/29/12 19:14, Daniel Mitchell wrote:
> Hi everyone, I've asked my Linux distribution to repackage Open MPI with thread support (meaning configure with --enable-thread-multiple). They are willing to do this if it won't have any performance/stability hit for Open MPI users who don't need thread support (meaning everyone but me, apparently). Does enabling thread support impact performance/stability? Daniel
[OMPI users] Multirail + Open MPI 1.6.1 = very big latency for the first communication
Hello all,

Open MPI is clever and by default uses multiple IB adapters, if available:
http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup
Open MPI is lazy and establishes connections only if needed. Both is good.

We have kinda special nodes: up to 16 sockets, 128 cores, 4 boards, 4 IB cards. Multirail works!

The crucial thing is that, starting with v1.6.1, the latency of the very first PingPong sample between two nodes takes really a lot of time - some 100x-200x of the usual latency. You cannot see this using the usual latency benchmarks(*) because they tend to omit the first samples as a "warm-up phase", but we use a kinda self-written parallel test which clearly shows this (and which left me musing for some days).

If multirail is forbidden (-mca btl_openib_max_btls 1), or if v1.5.3 is used, or if the MPI processes are preconnected (http://www.open-mpi.org/faq/?category=running#mpi-preconnect), there are no such huge latency outliers for the first sample.

Well, we know about warm-up and lazy connections. But 200x?! Any comments on whether that is OK?

Best, Paul Kapinos

(*) E.g. HPCC explicitly says in http://icl.cs.utk.edu/hpcc/faq/index.html#132
> Additional startup latencies are masked out by starting the measurement after one non-measured ping-pong.

P.S. Sorry for cross-posting to both the Users and Developers lists, but my last questions to Users have had no reply yet, so I am trying to broadcast...
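For completeness, a sketch of the two knobs mentioned above that make the first-sample outlier disappear: restricting Open MPI to a single IB port, or pre-establishing all connections during MPI_Init. The node names and benchmark name are placeholders; in older releases the preconnect parameter was named mpi_preconnect_all rather than mpi_preconnect_mpi.

#!/bin/sh
mpiexec -np 2 -H node1,node2 -mca btl_openib_max_btls 1 ./first_sample_pingpong
mpiexec -np 2 -H node1,node2 -mca mpi_preconnect_mpi 1 ./first_sample_pingpong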
Re: [OMPI users] MPI_Alltoallv performance regression 1.6.0 to 1.6.1
Did you *really* wanna to dig into code just in order to switch a default communication algorithm? Note there are several ways to set the parameters; --mca on command line is just one of them (suitable for quick online tests). http://www.open-mpi.org/faq/?category=tuning#setting-mca-params We 'tune' our Open MPI by setting environment variables Best Paul Kapinos On 12/19/12 11:44, Number Cruncher wrote: Having run some more benchmarks, the new default is *really* bad for our application (2-10x slower), so I've been looking at the source to try and figure out why. It seems that the biggest difference will occur when the all_to_all is actually sparse (e.g. our application); if most N-M process exchanges are zero in size the 1.6 ompi_coll_tuned_alltoallv_intra_basic_linear algorithm will actually only post irecv/isend for non-zero exchanges; any zero-size exchanges are skipped. It then waits once for all requests to complete. In contrast, the new ompi_coll_tuned_alltoallv_intra_pairwise will post the zero-size exchanges for *every* N-M pair, and wait for each pairwise exchange. This is O(comm_size) waits, may of which are zero. I'm not clear what optimizations there are for zero-size isend/irecv, but surely there's a great deal more latency if each pairwise exchange has to be confirmed complete before executing the next? Relatedly, how would I direct OpenMPI to use the older algorithm programmatically? I don't want the user to have to use "--mca" in their "mpiexec". Is there a C API? Thanks, Simon On 16/11/12 10:15, Iliev, Hristo wrote: Hi, Simon, The pairwise algorithm passes messages in a synchronised ring-like fashion with increasing stride, so it works best when independent communication paths could be established between several ports of the network switch/router. Some 1 Gbps Ethernet equipment is not capable of doing so, some is - it depends (usually on the price). This said, not all algorithms perform the same given a specific type of network interconnect. For example, on our fat-tree InfiniBand network the pairwise algorithm performs better. You can switch back to the basic linear algorithm by providing the following MCA parameters: mpiexec --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_alltoallv_algorithm 1 ... Algorithm 1 is the basic linear, which used to be the default. Algorithm 2 is the pairwise one. You can also set these values as exported environment variables: export OMPI_MCA_coll_tuned_use_dynamic_rules=1 export OMPI_MCA_coll_tuned_alltoallv_algorithm=1 mpiexec ... You can also put this in $HOME/.openmpi/mcaparams.conf or (to make it have global effect) in $OPAL_PREFIX/etc/openmpi-mca-params.conf: coll_tuned_use_dynamic_rules=1 coll_tuned_alltoallv_algorithm=1 A gratuitous hint: dual-Opteron systems are NUMAs so it makes sense to activate process binding with --bind-to-core if you haven't already did so. It prevents MPI processes from being migrated to other NUMA nodes while running. Kind regards, Hristo -- Hristo Iliev, Ph.D. -- High Performance Computing RWTH Aachen University, Center for Computing and Communication Rechen- und Kommunikationszentrum der RWTH Aachen Seffenter Weg 23, D 52074 Aachen (Germany) -Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Number Cruncher Sent: Thursday, November 15, 2012 5:37 PM To: Open MPI Users Subject: [OMPI users] MPI_Alltoallv performance regression 1.6.0 to 1.6.1 I've noticed a very significant (100%) slow down for MPI_Alltoallv calls as of version 1.6.1. 
* This is most noticeable for high-frequency exchanges over 1Gb ethernet where process-to-process message sizes are fairly small (e.g. 100kbyte) and much of the exchange matrix is sparse. * 1.6.1 release notes mention "Switch the MPI_ALLTOALLV default algorithm to a pairwise exchange", but I'm not clear what this means or how to switch back to the old "non-default algorithm". I attach a test program which illustrates the sort of usage in our MPI application. I have run as this as 32 processes on four nodes, over 1Gb ethernet, each node with 2x Opteron 4180 (dual hex-core); rank 0,4,8,.. on node 1, rank 1,5,9, ... on node 2, etc. It constructs an array of integers and a nProcess x nProcess exchange typical of part of our application. This is then exchanged several thousand times. Output from "mpicc -O3" runs shown below. My guess is that 1.6.1 is hitting additional latency not present in 1.6.0. I also attach a plot showing network throughput on our actual mesh generation application. Nodes cfsc01-04 are running 1.6.0 and finish within 35 minutes. Nodes cfsc05-08 are running 1.6.2 (started 10 minutes later) and take over a hour to run. There seems to be a much greater network demand in the 1.6.1 version, despite the user-code and input data being identical. Thanks for any help you can give, Simon ___
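For what it is worth, what the old basic linear algorithm effectively does for such a sparse pattern can also be written by hand with non-blocking point-to-point calls: post only the non-zero exchanges and wait once for all of them. A rough C sketch (the function name and the MPI_INT payload are made up for illustration; this is not code from the attached test program):

#include <mpi.h>
#include <stdlib.h>

/* Sparse "alltoallv": only non-zero exchanges are posted, one single wait.
 * sendcnt/recvcnt and sdispl/rdispl follow the MPI_Alltoallv conventions. */
static void sparse_alltoallv(int *sbuf, int *sendcnt, int *sdispl,
                             int *rbuf, int *recvcnt, int *rdispl, MPI_Comm comm)
{
    int i, size, nreq = 0;
    MPI_Comm_size(comm, &size);
    MPI_Request *req = malloc(2 * (size_t)size * sizeof(MPI_Request));

    for (i = 0; i < size; i++)                  /* receives first, non-zero only */
        if (recvcnt[i] > 0)
            MPI_Irecv(rbuf + rdispl[i], recvcnt[i], MPI_INT, i, 0, comm, &req[nreq++]);
    for (i = 0; i < size; i++)                  /* then the non-zero sends */
        if (sendcnt[i] > 0)
            MPI_Isend(sbuf + sdispl[i], sendcnt[i], MPI_INT, i, 0, comm, &req[nreq++]);

    MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
    free(req);
}

This avoids both the per-pair waits and the zero-size messages, independent of which algorithm the coll/tuned component selects.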
Re: [OMPI users] Initializing OMPI with invoking the array constructor on Fortran derived types causes the executable to crash
This is hardly an Open MPI issue: replace the calls to MPI_Init and MPI_Finalize by WRITE(*,*) "f", comment out 'USE mpi', and see your error (SIGSEGV) again, now without any MPI part in the program. So my suspicion is that this is a bug in your GCC version, especially because there is no SIGSEGV using GCC 4.7.2 (whereas it crashes using 4.4.6). ==> Update your compilers! On 01/11/13 14:01, Stefan Mauerberger wrote: Hi There! First of all, this is my first post here. In case I am doing something inappropriate, please be gentle with me. On top of that I am not quite sure whether this issue is related to Open MPI or GCC. Regarding my problem: well, it is a little bulky, see below. I could figure out that the actual crash is caused by invoking Fortran's array constructor [ xx, yy ] on derived data types xx and yy. The one key factor is that those types have allocatable member variables. That fact points to gfortran being to blame. However, the crash does not occur if MPI_Init is not called beforehand. Compiled as a serial program, everything works perfectly fine. I am pretty sure the lines I wrote are valid F2003 code. Here is a minimal working example:
PROGRAM main
   USE mpi
   IMPLICIT NONE
   INTEGER :: ierr
   TYPE :: test_typ
      REAL, ALLOCATABLE :: a(:)
   END TYPE
   TYPE(test_typ) :: xx, yy
   TYPE(test_typ), ALLOCATABLE :: conc(:)
   CALL mpi_init( ierr )
   conc = [ xx, yy ]
   CALL mpi_finalize( ierr )
END PROGRAM main
Compiling with mpif90 ... and executing leads to:
*** glibc detected *** ./a.out: free(): invalid pointer: 0x7fefd2a147f8 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7fefd26dab96]
./a.out[0x400fdb]
./a.out(main+0x34)[0x401132]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7fefd267d76d]
./a.out[0x400ad9]
With 'CALL MPI_Init' and 'MPI_Finalize' commented out, everything seems to be fine. What do you think: is this an OMPI or a GCC related bug? Cheers, Stefan ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] openmpi, 1.6.3, mlx4_core, log_num_mtt and Debian/vanilla kernel
The MTT parameter mess is well known, and the good solution is to set the MTT parameters high enough. Otherwise you never know what you will get - your application may hang, block the IB interface, run a bit slower, or run very slowly... http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem http://www.open-mpi.org/community/lists/devel/2012/08/11417.php http://montecarlo.vtt.fi/mtg/2012_Madrid/Hans_Hammer2.pdf On 02/21/13 11:53, Stefan Friedel wrote: Is there a way to tell openmpi-1.6.3 to use the ofed-module from the vanilla kernel and not to rely on log_num_mtt for the "do-we-have-enough-registered-mem" computation for Mellanox HCAs? Any other idea/hint? -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
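In practice this means raising log_num_mtt (and, if needed, log_mtts_per_seg) in the mlx4_core module options so that at least about twice the physical RAM can be registered; the formula is given in the first FAQ link above. An illustrative, not universally correct, entry for /etc/modprobe.d/mlx4_core.conf (the right value depends on the installed RAM and the page size):

options mlx4_core log_num_mtt=24

After changing it, the module has to be reloaded or the node rebooted before the new limit takes effect.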
Re: [OMPI users] bug in mpif90? OMPI_FC envvar does not work with 'use mpi'
AFAIK the GNU people change the Fortran module format every time they get a chance to do it :-( So Open MPI compiled with 4.4.6 (the system default for RHEL 6.x) definitely does not work with the 4.5, 4.6 or 4.7 versions of gfortran. The Intel 'ifort' compiler builds modules which are compatible from the 11.x through the 13.x versions. So, the recommended solution is to build a separate Open MPI with each compiler you use. Greetings, Paul P.S. As Hristo said, changing the Fortran compiler vendor and reusing the precompiled Fortran module would never work: the format of these .mod files is not standardised at all. On 03/13/13 11:05, Iliev, Hristo wrote: However, it works if for example you configure Open MPI with the system supplied version of gfortran and then specify a later gfortran version, e.g. OMPI_FC=gfortran-4.7 (unless the module format has changed in the meantime). -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
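'Build a separate Open MPI' boils down to pointing configure at the compiler the users will actually load; the compiler names, versions and prefix below are only placeholders for whatever is installed on site:

./configure CC=gcc-4.7 CXX=g++-4.7 FC=gfortran-4.7 \
            --prefix=/opt/MPI/openmpi-1.6.x/linux/gcc-4.7
make all install

One such installation per compiler (and, for gfortran, per compiler version) avoids the .mod incompatibility entirely.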
[OMPI users] OpenMPI 1.6.4, MPI I/O on Lustre, 32bit: bug?
Hello, we observe the following divide-by-zero error: [linuxscc005:31416] *** Process received signal *** [linuxscc005:31416] Signal: Floating point exception (8) [linuxscc005:31416] Signal code: Integer divide-by-zero (1) [linuxscc005:31416] Failing at address: 0x2282db [linuxscc005:31416] [ 0] [0x3a9410] [linuxscc005:31416] [ 1] /lib/libgcc_s.so.1(__divdi3+0x8b) [0x2282db] [linuxscc005:31416] [ 2] /opt/MPI/openmpi-1.6.4/linux/intel/lib/lib32/libmpi.so.1(ADIOI_LUSTRE_WriteStrided+0x1c36) [0x8c8206] [linuxscc005:31416] [ 3] /opt/MPI/openmpi-1.6.4/linux/intel/lib/lib32/libmpi.so.1(MPIOI_File_write+0x1f2) [0x8ed752] [linuxscc005:31416] [ 4] /opt/MPI/openmpi-1.6.4/linux/intel/lib/lib32/libmpi.so.1(mca_io_romio_dist_MPI_File_write+0x33) [0x8ed553] [linuxscc005:31416] [ 5] /opt/MPI/openmpi-1.6.4/linux/intel/lib/lib32/libmpi.so.1(mca_io_romio_file_write+0x2e) [0x8a46fe] [linuxscc005:31416] [ 6] /opt/MPI/openmpi-1.6.4/linux/intel/lib/lib32/libmpi.so.1(MPI_File_write+0x45) [0x846c25] [linuxscc005:31416] [ 7] /rwthfs/rz/cluster/home/pk224850/SVN/rz_cluster_utils/test_suite/trunk/tests/mpi/mpiIO/mpiIOC32.exe() [0x804a1ac] [linuxscc005:31416] [ 8] /lib/libc.so.6(__libc_start_main+0xe6) [0x6fccce6] [linuxscc005:31416] [ 9] /rwthfs/rz/cluster/home/pk224850/SVN/rz_cluster_utils/test_suite/trunk/tests/mpi/mpiIO/mpiIOC32.exe() [0x8049d91] [linuxscc005:31416] *** End of error message *** ... if we're using Open MPI 1.6.4 for compiling a 'C' test program(*) (attached), which perform some MPI I/O on Lustre. 0.) The error only came if the binary is compiled in 32bit 1.) the error did not corellate with a compiler used to build the MPI library (all 4 we have - GCC, Su/Oralce Studio; Intel, PGI - result in the same behaviour) 2.) The error did not came in our version Open MPI / 1.6.1 (however I'm not really sure the configure options used are the same) 3.) The error did only came if the file to be written is located on the Lustre file system (no error on local disc or on NFS share). 4.) The Fortran version (also attached) did not have the issue. 5.) The error only occur when using 2 or more processes On the basis of the error message I believe the error could be located somewhere indeepth of the OpenMPI/ROMIO implementation... Well, is somebody interested in further investigation of this issue? If yes we can feed you with informations. Otherwise we will ignore it, probably... Best Paul Kapinos (*) we've kinda internal test suite in order to check our MPIs... P.S. $ mpicc -O0 -m32 -o ./mpiIOC32.exe ctest.c -lm P.S.2 an example cofnigure line: ./configure --with-openib --with-lsf --with-devel-headers --enable-contrib-no-build=vt --enable-heterogeneous --enable-cxx-exceptions --enable-orterun-prefix-by-default --disable-dlopen --disable-mca-dso --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre' --enable-mpi-ext CFLAGS="$FLAGS_FAST $FLAGS_ARCH32 " CXXFLAGS="$FLAGS_FAST $FLAGS_ARCH32 " FFLAGS="$FLAGS_FAST $FLAGS_ARCH32 " FCFLAGS="$FLAGS_FAST $FLAGS_ARCH32 " LDFLAGS="$FLAGS_FAST $FLAGS_ARCH32 -L/opt/lsf/8.0/linux2.6-glibc2.3-x86/lib" --prefix=/opt/MPI/openmpi-1.6.4/linux/gcc --mandir=/opt/MPI/openmpi-1.6.4/linux/gcc/man --bindir=/opt/MPI/openmpi-1.6.4/linux/gcc/bin/32 --libdir=/opt/MPI/openmpi-1.6.4/linux/gcc/lib/lib32 --includedir=/opt/MPI/openmpi-1.6.4/linux/gcc/include/32 --datarootdir=/opt/MPI/openmpi-1.6.4/linux/gcc/share/32 2>&1 | tee log_01_conf.txt I _believe_ the part --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre' is new in our 1.6.4 installation compared with 1.6.1. 
Could this be the root of evil? -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 program test ! Zum Betrieb wird die f77 Methode benutzt, MPI einzubindetn, Zum Entwickeln versuche f90 Methode zu nutzen. ! Erhoffe dadurchweniger Fehler; beispileweise die unterschide MPI_INT --> MPI_INTEGER können sehr wohl abgefangen werden.. USE MPI IMPLICIT NONE !include "mpif.h" ! ! integer :: wrank,wsize,ierr,status(MPI_STATUS_SIZE),fh,i8_type,mpi_info,my_data_wsize(1),my_disp(1) integer(8), allocatable :: id_list(:), quelle(:) character(len=1024) :: filename CHARACTER(len=1024) :: pfad integer :: anzahl = 12000 integer(KIND=MPI_OFFSET_KIND) :: offset = 0 !5776 INTEGER(8) :: my_current_offset, my_offset INTEGER :: i, laenge, intsize call MPI_INIT(ierr) call MPI_COMM_SIZE(MPI_COMM_WORLD,wsize,ierr) call MPI_COMM_RANK(MPI_COMM_WORLD,wrank,ierr) call getarg(1, pfad) ! das ist der Pfad+Dateiname !filename = TRIM(pfad) // "blupp" !Anlegen der Binärdatei (seriell, nur Master) ! CALL MPI_File_seek(fh, my_offset, MPI_SEEK_SET); ! CALL MPI_File_get_position(fh, _current_offset); IF (wrank .EQ. 0) THEN my_offset =
[OMPI users] an MPI process using about 12 file descriptors per neighbour processes - isn't it a bit too much?
Hi OpenMPI folks, We use Sun MPI (Cluster Tools 8.2) and also native OpenMPI 1.3.3, and we are wondering about the way OpenMPI devours file descriptors: on our computers, ulimit -n is currently set to 1024, and we found out that we can run at most 84 MPI processes per box; if we try to run 85 (or more) processes, we get this error message:
-- Error: system limit exceeded on number of network connections that can be open. --
Simple arithmetic tells us that 1024/85 is about 12. This leads us to believe that each OpenMPI process needs about 12 file descriptors per other MPI process. By now, we have only one box with more than 100 CPUs on which it may be meaningful to run more than 85 processes. But in the quite near future many-core boxes are arriving (we have also ordered 128-way Nehalems), so it may be disadvantageous to consume a lot of file descriptors per MPI process. We see a possibility to avoid this problem by setting the ulimit for file descriptors to a higher value. This is not easy under Linux: you need either to recompile the kernel (which is not a choice for us), or to set up a root process somewhere which sets the ulimit to a higher value (which is a security risk and not easy to implement). We also tried to set opal_set_max_sys_limits to 1, as the help says (by adding "-mca opal_set_max_sys_limits 1" to the command line), but we do not see any change in behaviour. What is your opinion? Best regards, Paul Kapinos RZ RWTH Aachen # /opt/SUNWhpc/HPC8.2/intel/bin/mpiexec -mca opal_set_max_sys_limits 1 -np 86 a.out smime.p7s Description: S/MIME Cryptographic Signature
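One more option, not mentioned above: a process may raise its own *soft* file descriptor limit up to the hard limit without any root privileges, so as long as the hard limit ('ulimit -Hn') is large enough, a small wrapper around the application or a 'ulimit -n <hard value>' in the job script already helps. A C sketch of the idea (the function name is ours):

#include <stdio.h>
#include <sys/resource.h>

/* Raise the soft RLIMIT_NOFILE up to the hard limit; no privileges needed. */
int raise_fd_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;                 /* soft := hard */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    fprintf(stderr, "descriptor limit is now %lu\n", (unsigned long)rl.rlim_cur);
    return 0;
}

Raising the hard limit itself (e.g. in /etc/security/limits.conf) does remain an administrator's task.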
[OMPI users] an environment variable with same meaning than the -x option of mpiexec
Dear OpenMPI developers, with the -x option of mpiexec there is a way to distribute environment variables:
-x Export the specified environment variables to the remote nodes before executing the program.
Is there an environment variable (OMPI_*) with the same meaning? Writing the environment variables on the command line is ugly and tedious... I've searched for this info on the OpenMPI web pages for about an hour and didn't find the answer :-/ Thanking you in anticipation, Paul -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] an environment variable with same meaning than the -x option of mpiexec
Hi Ralph, Not at the moment - though I imagine we could create one. It is a tad tricky in that we allow multiple -x options on the cmd line, but we obviously can't do that with an envar. why not? export OMPI_Magic_Variavle="-x LD_LIBRARY_PATH -x PATH" cold be possible, or not? I can add it to the "to-do" list for a rainy day :-) That would be great :-) Thanks for your help! Paul Kapinos with the -x option of mpiexec there is a way to distribute environmnet variables: -x Export the specified environment variables to the remote nodes before executing the program. Is there an environment variable ( OMPI_) with the same meaning? The writing of environmnet variables on the command line is ugly and tedious... I've searched for this info on OpenMPI web pages for about an hour and didn't find the ansver :-/ Thanking you in anticipation, Paul -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] an environment variable with same meaning than the-x option of mpiexec
Hi Jeff, FWIW, environment variables prefixed with "OMPI_" will automatically be distributed out to processes. Of course, but saddingly the variable(s) we want to ditribute aren't "OMPI_" variable. Depending on your environment and launcher, your entire environment may be copied out to all the processes, anyway (rsh does not, but environments like SLURM do), making the OMPI_* and -x mechanisms somewhat redundant. Does this help? By now I specified the $MPIEXEC variable to "mpiexec -x BLABLABLA" and advice the users to use this. This is a bit ugly, but working workaround. What i wanted to achieve with my mail, was a less ugly solution :o) Thanks for your help, Paul Kapinos Not at the moment - though I imagine we could create one. It is a tad tricky in that we allow multiple -x options on the cmd line, but we obviously can't do that with an envar. The most likely solution would be to specify multiple "-x" equivalents by separating them with a comma in the envar. It would take some parsing to make it all work, but not impossible. I can add it to the "to-do" list for a rainy day :-) On Nov 6, 2009, at 7:59 AM, Paul Kapinos wrote: > Dear OpenMPI developer, > > with the -x option of mpiexec there is a way to distribute > environmnet variables: > > -x Export the specified environment variables to the > remote > nodes before executing the program. > > > Is there an environment variable ( OMPI_) with the same meaning? > The writing of environmnet variables on the command line is ugly and > tedious... > > I've searched for this info on OpenMPI web pages for about an hour > and didn't find the ansver :-/ > > > Thanking you in anticipation, > > Paul > > > > > -- > Dipl.-Inform. Paul Kapinos - High Performance Computing, > RWTH Aachen University, Center for Computing and Communication > Seffenter Weg 23, D 52074 Aachen (Germany) > Tel: +49 241/80-24915 > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users _______ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
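The workaround described above, spelled out (the list of variables to export is of course only an example; MY_APP_SETTING is a made-up name):

export MPIEXEC="mpiexec -x LD_LIBRARY_PATH -x PATH -x MY_APP_SETTING"
$MPIEXEC -np 64 ./a.out

Every user who calls $MPIEXEC instead of mpiexec then gets the -x options without having to type them.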
[OMPI users] exceedingly virtual memory consumption of MPI environment if higher-setting "ulimit -s"
Hi folks, we see an excessive *virtual* memory consumption by MPI processes if "ulimit -s" (stack size) is set to a higher value in the profile configuration. Furthermore we believe that every MPI process started wastes about twice the `ulimit -s` value that would be set in a fresh console (that is, the value configured in e.g. .zshenv, *not* the value actually set in the console from which mpiexec runs). Sun MPI 8.2.1, an empty MPI hello-world program - the same whether or not both processes run on the same host:
.zshenv: ulimit -s 10240 --> VmPeak: 180072 kB
.zshenv: ulimit -s 102400 --> VmPeak: 364392 kB
.zshenv: ulimit -s 1024000 --> VmPeak: 2207592 kB
.zshenv: ulimit -s 2024000 --> VmPeak: 4207592 kB
.zshenv: ulimit -s 2024 --> VmPeak: 39.7 GB
(see the attached files; the a.out binary is an MPI hello-world program running a never-ending loop). Normally the stack size ulimit is set to some 10 MB by us, but we see a lot of codes which need *a lot* of stack space, e.g. Fortran codes, OpenMP codes (and especially Fortran OpenMP codes). Users tend to hard-code a higher value for the stack size ulimit. Normally, using a lot of virtual memory is no problem, because there is a lot of this thing :-) But... if more than one person is allowed to work on a computer, you have to divide the resources in such a way that nobody can crash the box. We do not know how to limit the real RAM used, so we have to divide the RAM by setting a virtual memory ulimit (in our batch system, for example). That is, for us "virtual memory consumption" = "real memory consumption". And real memory is not as cheap as virtual memory. So, why consume *twice* the stack size for each process? And why consume this virtual memory at all? We guess this virtual memory is allocated for the stack (why else would it be related to the stack size ulimit). But is such an allocation really needed? Is there a way to avoid the waste of virtual memory? best regards, Paul Kapinos -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915
! Paul Kapinos 22.09.2009 -
! RZ RWTH Aachen, www.rz.rwth-aachen.de
!
! MPI-Hello-World
!
PROGRAM PK_MPI_Test
  USE MPI
  IMPLICIT NONE
  !
  INTEGER :: my_MPI_Rank, laenge, ierr
  CHARACTER*(MPI_MAX_PROCESSOR_NAME) my_Host
  !
  !WRITE (*,*) "Now sleeping for 30"
  !CALL Sleep(30)
  CALL MPI_INIT (ierr)
  !
  !WRITE (*,*) "After MPI_INIT"
  !CALL Sleep(30)
  CALL MPI_COMM_RANK( MPI_COMM_WORLD, my_MPI_Rank, ierr )
  !WRITE (*,*) "After MPI_COMM_RANK"
  CALL MPI_GET_PROCESSOR_NAME(my_Host, laenge, ierr)
  WRITE (*,*) "Processor ", my_MPI_Rank, "on Host: ", my_Host(1:laenge)
  ! sleeping or spinning - the same behaviour
  !CALL Sleep(3)
  DO WHILE (.TRUE.)
  ENDDO
  !CALL Sleep(3)
  CALL MPI_FINALIZE(ierr)
  !
  WRITE (*,*) "That was it"
  !
END PROGRAM PK_MPI_Test
smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] exceedingly virtual memory consumption of MPI, environment if higher-setting "ulimit -s"
Hi Jeff, hi all, I can't think of what OMPI would be doing related to the predefined stack size -- I am not aware of anywhere in the code where we look up the predefine stack size and then do something with it. I do not know OMPI code at all - but what I see is the consumption of virtual memory according to the twice stack size defaults by new login.. That being said, I don't know what the OS and resource consumption effects are of setting 1GB+ stack size on *any* application... we defenitely have applications which *need* stack size of 500+MB. Users who use such codes, may trend to hard-code a *huge* stack size in their profile (you do not wanna to lose a day ot two of computing time just by forgitting to set a ulimit, right?). (Currently, I see *one* such user, but who knows how many there are...) nevertheless, also if the users do not use a huge stack size, the default stack size is some 20 MB. That's not much, but does this allocation-and-never-use of twice of the stack size really needed? Best wishes, PK Have you tried non-MPI examples, potentially with applications as large as MPI applications but without the complexity of MPI? On Nov 19, 2009, at 3:13 PM, David Singleton wrote: Depending on the setup, threads often get allocated a thread local stack with size equal to the stacksize rlimit. Two threads maybe? David Terry Dontje wrote: > A couple things to note. First Sun MPI 8.2.1 is effectively OMPI > 1.3.4. I also reproduced the below issue using a C code so I think this > is a general issue with OMPI and not Fortran based. > > I did a pmap of a process and there were two anon spaces equal to the > stack space set by ulimit. > > In one case (setting 102400) the anon spaces were next to each other > prior to all the loadable libraries. In another case (setting 1024000) > one anon space was locate in the same area as the first case but the > second space was deep into some memory used by ompi. > > Is any of this possibly related to the predefined handles? Though I am > not sure why it would expand based on stack size?. > > --td >> Date: Thu, 19 Nov 2009 19:21:46 +0100 >> From: Paul Kapinos <kapi...@rz.rwth-aachen.de> >> Subject: [OMPI users] exceedingly virtual memory consumption of MPI >> environment if higher-setting "ulimit -s" >> To: Open MPI Users <us...@open-mpi.org> >> Message-ID: <4b058cba.3000...@rz.rwth-aachen.de> >> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed" >> >> Hi volks, >> >> we see an exeedingly *virtual* memory consumtion through MPI processes >> if "ulimit -s" (stack size)in profile configuration was setted higher. >> >> Furthermore we believe, every mpi process started, wastes about the >> double size of `ulimit -s` value which will be set in a fresh console >> (that is, the value is configurated in e.g. .zshenv, *not* the value >> actually setted in the console from which the mpiexec runs). >> >> Sun MPI 8.2.1, an empty mpi-HelloWorld program >> ! either if running both processes on the same host.. >> >> .zshenv: ulimit -s 10240 --> VmPeak:180072 kB >> .zshenv: ulimit -s 102400 --> VmPeak:364392 kB >> .zshenv: ulimit -s 1024000 --> VmPeak:2207592 kB >> .zshenv: ulimit -s 2024000 --> VmPeak:4207592 kB >> .zshenv: ulimit -s 2024 --> VmPeak: 39.7 GB >> (see the attached files; the a.out binary is a mpi helloworld program >> running an never ending loop). >> >> >> >> Normally, the stack size ulimit is set to some 10 MB by us, but we see >> a lot of codes which needs *a lot* of stack space, e.g. 
Fortran codes, >> OpenMP codes (and especially fortran OpenMP codes). Users tends to >> hard-code the setting-up the higher value for stack size ulimit. >> >> Normally, the using of a lot of virtual memory is no problem, because >> there is a lot of this thing :-) But... If more than one person is >> allowed to work on a computer, you have to divide the ressources in >> such a way that nobody can crash the box. We do not know how to limit >> the real RAM used so we need to divide the RAM by means of setting >> virtual memory ulimit (in our batch system e.g.. That is, for us >> "virtual memory consumption" = "real memory consumption". >> And real memory is not that way cheap than virtual memory. >> >> >> So, why consuming the *twice* amount of stack size for each process? >> >> And, why consuming the virtual memory at all? We guess this virtual >> memory is allocated for the stack (why else it will be related to the >> stack size ulimit). But, is such
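David's remark can be checked directly: with default attributes a POSIX thread gets a private stack sized after the current stack rlimit (unless that is unlimited), while an explicit pthread_attr_setstacksize keeps the mapping small. A self-contained C sketch (the dummy thread body and the 8 MB figure are arbitrary; compile with -lpthread and compare the two cases via pmap or VmPeak in /proc/<pid>/status):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) { return arg; }          /* dummy thread body */

int main(void)
{
    pthread_t tid;
    pthread_attr_t attr;

    /* Default attributes: glibc sizes the thread stack after RLIMIT_STACK,
       so a huge 'ulimit -s' shows up 1:1 in the virtual memory footprint. */
    pthread_create(&tid, NULL, worker, NULL);
    pthread_join(tid, NULL);

    /* Explicit, modest stack: the anonymous mapping stays at 8 MB. */
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 8UL * 1024 * 1024);
    pthread_create(&tid, &attr, worker, NULL);
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);

    getchar();   /* keep the process alive for inspection with pmap */
    return 0;
}

Two such default-sized thread stacks would explain exactly the "twice the stack size" VmPeak observed above.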
[OMPI users] MPI_Comm_set_errhandler: error in Fortran90 Interface mpi.mod
Hello OpenMPI / Sun/Oracle MPI folks, we believe that the OpenMPI and SunMPI (Cluster Tools) has an error in the Fortran-90 (f90) bindings of the MPI_Comm_set_errhandler routine. Tested MPI versions: OpenMPI/1.3.3 and Cluster Tools 8.2.1 Consider the attached example. This file uses the "USE MPI" to bind the MPI routines f90-style. The f77-style "include 'mpif.h'" is commented out. If using Intel MPI the attached example is running error-free (with both bindings). If trying to compiler with OpenMPI and using f90 bindings, any compilers tested (Intel/11.1, Sun Studio/12.1, gcc/4.1) says the code cannot be build because of trying to use a constant (MPI_COMM_WORLD) as input. For example, the output of the Intel compiler: - MPI_Comm_set_errhandler.f90(12): error #6638: An actual argument is an expression or constant; this is not valid since the associated dummy argument has the explicit INTENT(OUT) or INTENT(INOUT) attribute. [0] call MPI_Comm_set_errhandler (MPI_COMM_WORLD, errhandler, ierr) ! MPI_COMM_WORLD in MPI_Comm_set_errhandler is the problem... --^ compilation aborted for MPI_Comm_set_errhandler.f90 (code 1) - With the f77 bindings, the attached program compiles and runs fine. The older (deprecated) routine MPI_Errhandler_set which is defined to have the same functionality works fine with both bindings and all MPI's. So, we believe the OpenMPI implementation of MPI standard erroneously sets the INTENT(OUT) or INTENT(INOUT) attribute for the communicator attribute. The definition of an error handle for MPI_COMM_WORLD should be possible which it is currently not. Best wishes, Paul Kapinos -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 PROGRAM sunerr USE MPI ! f90: Error on MPI_Comm_set_errhandler if using this with OpenMPI / Sun MPI !include 'mpif.h' ! f77: Works fine with all MPI's tested IMPLICIT NONE ! integer :: data = 1, errhandler, ierr external AbortWithMessage ! call MPI_Init(ierr) call MPI_Comm_create_errhandler (AbortWithMessage, errhandler, ierr) ! Creating a handle: no problem call MPI_Comm_set_errhandler (MPI_COMM_WORLD, errhandler, ierr) ! MPI_COMM_WORLD in MPI_Comm_set_errhandler is the problem... in f90 !call MPI_Errhandler_set (MPI_COMM_WORLD, errhandler, ierr)! and this one deprecated function works fine both for f77 and f90 ! ... a errornous MPI routine ... call MPI_Send (data, 1, MPI_INTEGER, 1, -12, MPI_COMM_WORLD, ierr) call MPI_Finalize( ierr ) END PROGRAM sunerr subroutine AbortWithMessage (comm, errorcode) use mpi implicit none integer :: comm, errorcode character(LEN=MPI_MAX_ERROR_STRING) :: errstr integer :: stringlength, ierr call MPI_Error_string (errorcode, errstr, stringlength, ierr) write (*,*) 'Error: =+=> ', errstr, ' =+=> Aborting' call MPI_Abort (comm, errorcode, ierr) end subroutine AbortWithMessage smime.p7s Description: S/MIME Cryptographic Signature
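For comparison, the C binding does not show the problem, since there the communicator is passed by value and no INTENT attribute is involved. Roughly the following (a trimmed-down C counterpart of the attached Fortran program, written here only for illustration) compiles and runs with the same Open MPI versions:

#include <mpi.h>
#include <stdio.h>

/* Error handler: print the error string and abort. */
static void abort_with_message(MPI_Comm *comm, int *errcode, ...)
{
    char msg[MPI_MAX_ERROR_STRING];
    int len;
    MPI_Error_string(*errcode, msg, &len);
    fprintf(stderr, "Error: =+=> %s =+=> Aborting\n", msg);
    MPI_Abort(*comm, *errcode);
}

int main(int argc, char **argv)
{
    MPI_Errhandler eh;
    MPI_Init(&argc, &argv);
    MPI_Comm_create_errhandler(abort_with_message, &eh);
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, eh);    /* accepted, unlike in the f90 binding */
    MPI_Finalize();
    return 0;
}

This supports the conclusion that only the INTENT declaration in the Fortran 90 interface is wrong, not the underlying implementation.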
Re: [OMPI users] Fortran derived types
Hi, In general, even in your serial fortran code, you're already taking a performance hit using a derived type. That is not generally true. The right statement is: "it depends". Yes, sometimes derived data types and object orientation and so on can lead to some performance hit; but current compiler usually can oprimise alot. E.g. consider http://www.terboven.com/download/OAbstractionsLA.pdf (especially p.19). So, I would not recommend to disturb the ready program in order to let it be the old good f77 style. And let us not start a flame about "assembler is faster but OO is easier"! :-) Best wishes Paul -Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Prentice Bisbal Sent: Wednesday, May 05, 2010 11:51 AM To: Open MPI Users Subject: Re: [OMPI users] Fortran derived types Vedran Coralic wrote: Hello, In my Fortran 90 code I use several custom defined derived types. Amongst them is a vector of arrays, i.e. v(:)%f(:,:,:). I am wondering what the proper way of sending this data structure from one processor to another is. Is the best way to just restructure the data by copying it into a vector and sending that or is there a simpler way possible by defining an MPI derived type that can handle it? I looked into the latter myself but so far, I have only found the solution for a scalar fortran derived type and the methodology that was suggested in that case did not seem naturally extensible to the vector case. It depends on how your data is distributed in memory. If the arrays are evenly distributed, like what would happen in a multidimensional-array, the derived datatypes will work fine. If you can't guarantee the spacing between the arrays that make up the vector, then using MPI_Pack/MPI_Unpack (or whatever the Fortran equivalents are) is the best way to go. I'm not an expert MPI programmer, but I wrote a small program earlier this year that created a dynamically created array of dynamically created arrays. After doing some research into this same problem, it looked like packing/unpacking was the only way to go. Using Pack/Unpack is easy, but there is a performance hit since the data needs to be copied into the packed buffer before sending, and then copied out of the buffer after the receive. -- Prentice ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, Center for Computing and Communication Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
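To make the Pack/Unpack route concrete: the sender copies the separately allocated pieces into one contiguous buffer and ships it as MPI_PACKED; the receiver unpacks in the same order. A sketch in C for brevity (the two-array layout and the element counts are invented for illustration; the same calls exist as MPI_PACK/MPI_UNPACK in Fortran):

#include <mpi.h>
#include <stdlib.h>

/* Pack two independently allocated double arrays into one message. */
void send_two_arrays(double *a, int na, double *b, int nb, int dest, MPI_Comm comm)
{
    int pos = 0, bytes_a, bytes_b;
    MPI_Pack_size(na, MPI_DOUBLE, comm, &bytes_a);
    MPI_Pack_size(nb, MPI_DOUBLE, comm, &bytes_b);

    char *buf = malloc((size_t)bytes_a + bytes_b);
    MPI_Pack(a, na, MPI_DOUBLE, buf, bytes_a + bytes_b, &pos, comm);
    MPI_Pack(b, nb, MPI_DOUBLE, buf, bytes_a + bytes_b, &pos, comm);

    MPI_Send(buf, pos, MPI_PACKED, dest, 0, comm);    /* 'pos' bytes are in use */
    free(buf);
}

The extra copy is the performance price Prentice mentions; an MPI derived datatype avoids it, but only if the spacing between the pieces is fixed.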
[OMPI users] Why compilig in global paths (only) for configuretion files?
Hi all! We are using OpenMPI on a variety of machines (running Linux, Solaris/Sparc and /Opteron) with a couple of compilers (GCC, Sun Studio, Intel, PGI, 32 and 64 bit...), so we have at least 15 versions of each release of OpenMPI (SUN Cluster Tools not included). That is, we have to support a complete petting zoo of OpenMPIs. Sometimes we may need to move things around. When OpenMPI is configured, the install path may be provided using the --prefix keyword, say: ./configure --prefix=/my/love/path/for/openmpi/tmp1 After "gmake all install", an installation of OpenMPI can be found in ...tmp1. Then, say, we need to *move* this version to another path, say /my/love/path/for/openmpi/blupp Of course we set $PATH and $LD_LIBRARY_PATH accordingly (we can do that ;-) But if we try to use OpenMPI from the new location, we get an error message like
$ ./mpicc
Cannot open configuration file /my/love/path/for/openmpi/tmp1/share/openmpi/mpicc-wrapper-data.txt
Error parsing data file mpicc: Not found
(note the old installation path used) It looks to me as if the install path provided with --prefix at configure time is compiled into the opal_wrapper executable, and opal_wrapper only works if the set of configuration files is in this path. But after moving the OpenMPI installation directory, the configuration files aren't there... A side effect of this behaviour is that binary distributions of OpenMPI (RPMs) are not relocatable. That's uncomfortable. (Actually, this mail was triggered by the fact that the Sun ClusterTools RPMs are not relocatable.) So, does this behaviour have a deeper sense I cannot recognise, or is the configuring of global paths perhaps not needed? What I mean is that the paths to the configuration files which opal_wrapper needs could be set locally, like ../share/openmpi/***, without affecting the integrity of OpenMPI. Maybe there are more places where local paths would be needed to allow a movable (relocatable) OpenMPI. What do you think? Best regards Paul Kapinos smime.p7s Description: S/MIME Cryptographic Signature
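(For the record, what eventually worked for a moved installation - see the follow-ups further down this thread - is setting OPAL_PREFIX in addition to the usual paths. A sketch using the example paths from above:)

export OPAL_PREFIX=/my/love/path/for/openmpi/blupp
export PATH=$OPAL_PREFIX/bin:$PATH
export LD_LIBRARY_PATH=$OPAL_PREFIX/lib:$LD_LIBRARY_PATH
mpicc --showme   # should now find its wrapper data under $OPAL_PREFIX/share/openmpi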
Re: [OMPI users] Need help resolving No route to host error with OpenMPI 1.1.2
Hi, First, consider to update to newer OpenMPI. Second, look on your environment on the box you startts OpenMPI (runs mpirun ...). Type ulimit -n to explore how many file descriptors your envirinment have. (ulimit -a for all limits). Note, every process on older versions of OpenMPI (prior 1.2.6 inclusively) needs an own file descriptor for each process started, IMHO. Maybe its your problem? Does your HelloWorld run OK with some 500 processes? best regards PK Prasanna Ranganathan wrote: Hi, I am trying to run a test mpiHelloWorld program that simply initializes the MPI environment on all the nodes, prints the hostname and rank of each node in the MPI process group and exits. I am using MPI 1.1.2 and am running 997 processes on 499 nodes (Nodes have 2 dual core CPUs). I get the following error messages when I run my program as follows: mpirun -np 997 -bynode -hostfile nodelist /main/mpiHelloWorld . . . [0,1,380][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] [0,1,142][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] [0,1,140][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] [0,1,390][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 connect() failed with errno=113connect() failed with errno=113connect() failed with errno=113[0,1,138][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113[0,1,384][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] [0,1,144][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 [0,1,388][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113[0,1,386][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 [0,1,139][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 connect() failed with errno=113 . . *The main thing is that I get these error messages around 3-4 times out of 10 attempts with the rest all completing successfully. I have looked into the FAQs in detail and also checked the tcp btl settings but am not able to figure it out. * All the 499 nodes have only eth0 active and I get the error even when I run the following: mpirun -np 997 -bynode –hostfile nodelist --mca btl_tcp_if_include eth0 /main/mpiHelloWorld I have attached the output of ompi_info —all. The following is the output of /sbin/ifconfig on the node where I start the mpi process (it is one of the 499 nodes) eth0 Link encap:Ethernet HWaddr 00:03:25:44:8F:D6 inet addr:10.12.1.11 Bcast:10.12.255.255 Mask:255.255.0.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1978724556 errors:17 dropped:0 overruns:0 frame:17 TX packets:1767028063 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:580938897359 (554026.5 Mb) TX bytes:689318600552 (657385.4 Mb) Interrupt:22 Base address:0xc000 loLink encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:70560 errors:0 dropped:0 overruns:0 frame:0 TX packets:70560 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:339687635 (323.9 Mb) TX bytes:339687635 (323.9 Mb) Kindly help. Regards, Prasanna. ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users <> smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] Why compilig in global paths (only) for configuretion files?
Hi Jeff again! But the setting of the environtemt variable OPAL_PREFIX to an appropriate value (assuming PATH and LD_LIBRARY_PATH are setted too) is not enough to let the OpenMPI rock from the new lokation. Hmm. It should be. (update) it works with "truly" OpenMPI, but it works *not* with SUN Cluster Tools 8.0 (which is also an OpenMPI). So, it seems be an SUN problem and not general problem of openMPI. Sorry for false relating the problem. The only trouble we have now are the error messages like -- Sorry! You were supposed to get help about: no hca params found from the file: help-mpi-btl-openib.txt But I couldn't find any file matching that name. Sorry! -- (the job still runs without problems! :o) if running openmpi from new location, and the old location being removed. (if the old location being also persistense there is no error, so it seems to be an attempt to access to an file on old path). Maybe we have to explicitly pass the OPAL_PREFIX environment variable to all processes? Because of the fact, that all the files containing settings for opal_wrapper, which are located in share/openmpi/ and called e.g. mpif77-wrapper-data.txt, contain (defined by installation with --prefix) hard-coded paths, too. Hmm; they should not. In my 1.2.7 install, I see the following: - [11:14] svbu-mpi:/home/jsquyres/bogus/share/openmpi % cat mpif77-wrapper-data.txt # There can be multiple blocks of configuration data, chosen by # compiler flags (using the compiler_args key to chose which block # should be activated. This can be useful for multilib builds. See the # multilib page at: #https://svn.open-mpi.org/trac/ompi/wiki/compilerwrapper3264 # for more information. project=Open MPI project_short=OMPI version=1.2.7rc6r19546 language=Fortran 77 compiler_env=F77 compiler_flags_env=FFLAGS compiler=gfortran extra_includes= preprocessor_flags= compiler_flags= linker_flags= libs=-lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl required_file=not supported includedir=${includedir} libdir=${libdir} [11:14] svbu-mpi:/home/jsquyres/bogus/share/openmpi % - Note the "includedir" and "libdir" lines -- they're expressed in terms of ${foo}, which we can replace when OPAL_PREFIX (or related) is used. What version of OMPI are you using? Note one of configure files contained in Sun ClusterMPI 8.0 (see attached file). The paths are really hard-coded in instead of usage of variables; this makes the package really not relocable without parsing the configure files. Did you (or anyone reading this message) have any contact to SUN developers to point to this circumstance? *Why* do them use hard-coded paths? 
:o) best regards, Paul Kapinos # # Default word-size (used when -m flag is supplied to wrapper compiler) # compiler_args= project=Open MPI project_short=OMPI version=r19400-ct8.0-b31c-r29 language=Fortran 90 compiler_env=FC compiler_flags_env=FCFLAGS compiler=f90 module_option=-M extra_includes= preprocessor_flags= compiler_flags= libs=-lmpi -lopen-rte -lopen-pal -lnsl -lrt -lm -ldl -lutil -lpthread -lmpi_f77 -lmpi_f90 linker_flags=-R/opt/mx/lib/lib64 -R/opt/SUNWhpc/HPC8.0/lib/lib64 required_file= includedir=/opt/SUNWhpc/HPC8.0/include/64 libdir=/opt/SUNWhpc/HPC8.0/lib/lib64 # # Alternative word-size (used when -m flag is not supplied to wrapper compiler) # compiler_args=-m32 project=Open MPI project_short=OMPI version=r19400-ct8.0-b31c-r29 language=Fortran 90 compiler_env=FC compiler_flags_env=FCFLAGS compiler=f90 module_option=-M extra_includes= preprocessor_flags= compiler_flags=-m32 libs=-lmpi -lopen-rte -lopen-pal -lnsl -lrt -lm -ldl -lutil -lpthread -lmpi_f77 -lmpi_f90 linker_flags=-R/opt/mx/lib -R/opt/SUNWhpc/HPC8.0/lib required_file= includedir=/opt/SUNWhpc/HPC8.0/include libdir=/opt/SUNWhpc/HPC8.0/lib <> smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] Why compilig in global paths (only) for configuretion files?
Hi Rolf, Rolf vandeVaart wrote: I don't know -- this sounds like an issue with the Sun CT 8 build process. It could also be a by-product of using the combined 32/64 feature...? I haven't used that in forever and I don't remember the restrictions. Terry/Rolf -- can you comment? I will write a separate e-mail to ct-feedb...@sun.com Hi Paul: Yes, there are Sun people on this list! We originally put those hardcoded paths in to make everything work correctly out of the box and our install process ensured that everything would be at /opt/SUNWhpc/HPC8.0. However, let us take a look at everything that was just discussed here and see what we can do. We will get back to you shortly. I've just sent an e-mail to ct-feedb...@sun.com with some explanation of our troubles... The main trouble: we want to have *both* versions of CT 8.0 (for the Studio and for the GNU compiler) installed on the same systems. The RPMs are not relocatable, have the same name and install everything into the same directories... yes, it works out of the box, but only if just *one* version is installed. So, I started to move installations around, asking on this mailing list, setting envvars, and parsing configuration files. I think installing everything to hard-coded paths is somewhat inflexible. Maybe you can provide relocatable RPMs somewhere in the future? But as mentioned above, our main goal is to have both versions of CT working on the same system. Best regards, Paul Kapinos smime.p7s Description: S/MIME Cryptographic Signature
[OMPI users] Errors compiling OpenMPI 1.2.8 with SUN Studio express (2008/07/10) in 32bit modus
Hi all, We tried to install OpenMPI 1.2.8 on Linux in a couple of versions here (compiler from intel, pgi, studio, gcc - all 64bit and 32bit). If we used SUN Studio Express (2008/07/10) and configured to produce 32bit library, we got following errors (full log see in file my_makelog_sun32.txt) .. gmake[2]: Entering directory `/rwthfs/rz/cluster/home/pk224850/OpenMPI/openmpi-1.2.8_studio32/ompi/mca/btl/openib' source='btl_openib_component.c' object='btl_openib_component.lo' libtool=yes \ DEPDIR=.deps depmode=none /bin/sh ../../../../config/depcomp \ /bin/sh ../../../../libtool --tag=CC --mode=compile cc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -DPKGDATADIR=\"/rwthfs/rz/SW/MPI/openmpi-1.2.8/linux32/studio/share/openmpi\" -I../../../..-DNDEBUG -O2 -m32 -c -o btl_openib_component.lo btl_openib_component.c libtool: compile: cc -DHAVE_CONFIG_H -I. -I../../../../opal/include -I../../../../orte/include -I../../../../ompi/include -DPKGDATADIR=\"/rwthfs/rz/SW/MPI/openmpi-1.2.8/linux32/studio/share/openmpi\" -I../../../.. -DNDEBUG -O2 -m32 -c btl_openib_component.c -KPIC -DPIC -o .libs/btl_openib_component.o "../../../../opal/include/opal/sys/ia32/atomic.h", line 167: warning: impossible constraint for "%1" asm operand "../../../../opal/include/opal/sys/ia32/atomic.h", line 167: warning: parameter in inline asm statement unused: %2 "../../../../opal/include/opal/sys/ia32/atomic.h", line 184: warning: impossible constraint for "%1" asm operand "../../../../opal/include/opal/sys/ia32/atomic.h", line 184: warning: parameter in inline asm statement unused: %2 "/usr/include/infiniband/kern-abi.h", line 103: syntax error before or at: __u64 "/usr/include/infiniband/kern-abi.h", line 109: syntax error before or at: __u64 "/usr/include/infiniband/kern-abi.h", line 124: syntax error before or at: __u64 "/usr/include/infiniband/kern-abi.h", line 135: syntax error before or at: __u64 ... This seems for us to be an error on linux headers in file kern-abi.h which includes linux/types.h which contains this: #if defined(__GNUC__) && !defined(__STRICT_ANSI__) typedef __u64 uint64_t; typedef __u64 u_int64_t; typedef __s64 int64_t; #endif So, it looks for us so, that by byilding of openmpi 1.2.8 the SUN Studio compiler cannot compile some Linux headers because of these are programmed in "GNU C" instead of ANSI C. If so then this is an Linux issue and not OpenMPI's - but, if so, *why* did you not seen this problems during of release preparation? That is, maybe we have done some mistakes? Maybe the devel headers and/or static libs are the problem? (I will try to disable them, but we want to report this problem anyway). We use Scientific Linux 5.1 which is an Red Hat Enterprice 5 Linux. $ uname -a Linux linuxhtc01.rz.RWTH-Aachen.DE 2.6.18-53.1.14.el5_lustre.1.6.5custom #1 SMP Wed Jun 25 12:17:09 CEST 2008 x86_64 x86_64 x86_64 GNU/Linux configured with: ./configure --enable-static --with-devel-headers CFLAGS="-O2 -m32" CXXFLAGS="-O2 -m32" FFLAGS="-O2 -m32" FCFLAGS="-O2 -m32" LDFLAGS="-m32" --prefix=/rwthfs/rz/SW/MPI/openmpi-1.2.8/linux32/studio Best regards, Paul Kapinos HPC Group RZ RWTH Aachen This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by Open MPI configure 1.2.8, which was generated by GNU Autoconf 2.61. 
Invocation command line was $ ./configure --enable-static --with-devel-headers CFLAGS=-O2 -m32 CXXFLAGS=-O2 -m32 FFLAGS=-O2 -m32 FCFLAGS=-O2 -m32 LDFLAGS=-m32 --prefix=/rwthfs/rz/SW/MPI/openmpi-1.2.8/linux32/studio CC=cc CXX=CC FC=f95 --enable-ltdl-convenience --no-create --no-recursion ## - ## ## Platform. ## ## - ## hostname = linuxhtc01.rz.RWTH-Aachen.DE uname -m = x86_64 uname -r = 2.6.18-53.1.14.el5_lustre.1.6.5custom uname -s = Linux uname -v = #1 SMP Wed Jun 25 12:17:09 CEST 2008 /usr/bin/uname -p = x86_64 /bin/uname -X = unknown /bin/arch = x86_64 /usr/bin/arch -k = x86_64 /usr/convex/getsysinfo = unknown /usr/bin/hostinfo = unknown /bin/machine = unknown /usr/bin/oslevel = unknown /bin/universe = unknown PATH: /rwthfs/rz/SW/UTIL/StudioExpress20080724/SUNWspro/bin PATH: /home/pk224850/bin PATH: /usr/local_host/sbin PATH: /usr/local_host/bin PATH: /usr/local_rwth/sbin PATH: /usr/local_rwth/bin PATH: /usr/bin PATH: /usr/sbin PATH: /sbin PATH: /usr/dt/bin PATH: /usr/bin/X11 PATH: /usr/java/bin PATH: /usr/local/bin PATH: /usr/local/sbin PATH: /opt/csw/bin PATH: . ## --- ## ## Core tests. ## ## --- ## configure:2986: checking for a BS
[OMPI users] OMPIO correctnes issues
Dear Open MPI developers, did OMPIO (1) reached 'usable-stable' state? As we reported in (2) we had some trouble in building Open MPI with ROMIO, which fact was hidden by OMPIO implementation stepping into the MPI_IO breach. The fact 'ROMIO isn't AVBL' was detected after users complained 'MPI_IO don't work as expected with version XYZ of OpenMPI' and further investigations. Take a look at the attached example. It deliver different result in case of using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3). We've seen more examples of divergent behaviour but this one is quite handy. Is that a bug in OMPIO or did we miss something? Best Paul Kapinos 1) http://www.open-mpi.org/faq/?category=ompio 2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php 3) (ROMIO is default; on local hard drive at node 'cluster') $ ompi_info | grep romio MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1) $ ompi_info | grep ompio MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1) $ mpif90 main.f90 $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out; fileOffset, fileSize1010 fileOffset, fileSize2626 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 $ export OMPI_MCA_io=ompio $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out; fileOffset, fileSize 010 fileOffset, fileSize 016 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 program example use mpi integer:: ierr integer(MPI_OFFSET_KIND) :: fileOffset integer(KIND=MPI_OFFSET_KIND):: fileSize real :: outData(10) integer :: resUnit=565 call MPI_INIT(ierr) call MPI_file_open(MPI_COMM_WORLD, 'out.txt', MPI_MODE_WRONLY + MPI_MODE_APPEND, MPI_INFO_NULL, resUnit, ierr) call MPI_FILE_GET_SIZE (resUnit, fileSize, ierr) call MPI_file_get_position(resUnit,fileOffset,ierr) print *, 'fileOffset, fileSize', fileOffset, fileSize call MPI_file_seek (resUnit,fileOffset,MPI_SEEK_SET,ierr) call MPI_file_write(resUnit, outData, 2, & MPI_DOUBLE, MPI_STATUS_IGNORE, ierr) call MPI_file_get_position(resUnit,fileOffset,ierr) call MPI_FILE_GET_SIZE (resUnit, fileSize, ierr) print *, 'fileOffset, fileSize', fileOffset, fileSize print *, 'ierr ', ierr print *, 'MPI_MODE_WRONLY, MPI_MODE_APPEND ', MPI_MODE_WRONLY, MPI_MODE_APPEND call MPI_file_close(resUnit,ierr) call MPI_FINALIZE(ierr) end smime.p7s Description: S/MIME Cryptographic Signature
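Until this is settled, explicitly forcing the ROMIO component (rather than relying on it being the default) is an obvious workaround, analogous to the OMPIO selection shown above; the component name is the one listed by ompi_info:

export OMPI_MCA_io=romio
# or per run: mpiexec --mca io romio -np 1 ./a.out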
Re: [OMPI users] OMPIO correctnes issues
Sorry, forgot to mention: 1.10.1 Open MPI: 1.10.1 Open MPI repo revision: v1.10.0-178-gb80f802 Open MPI release date: Nov 03, 2015 Open RTE: 1.10.1 Open RTE repo revision: v1.10.0-178-gb80f802 Open RTE release date: Nov 03, 2015 OPAL: 1.10.1 OPAL repo revision: v1.10.0-178-gb80f802 OPAL release date: Nov 03, 2015 MPI API: 3.0.0 Ident string: 1.10.1 On 12/09/15 11:26, Gilles Gouaillardet wrote: Paul, which OpenMPI version are you using ? thanks for providing a simple reproducer, that will make things much easier from now. (and at first glance, that might not be a very tricky bug) Cheers, Gilles On Wednesday, December 9, 2015, Paul Kapinos <kapi...@itc.rwth-aachen.de <mailto:kapi...@itc.rwth-aachen.de>> wrote: Dear Open MPI developers, did OMPIO (1) reached 'usable-stable' state? As we reported in (2) we had some trouble in building Open MPI with ROMIO, which fact was hidden by OMPIO implementation stepping into the MPI_IO breach. The fact 'ROMIO isn't AVBL' was detected after users complained 'MPI_IO don't work as expected with version XYZ of OpenMPI' and further investigations. Take a look at the attached example. It deliver different result in case of using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3). We've seen more examples of divergent behaviour but this one is quite handy. Is that a bug in OMPIO or did we miss something? Best Paul Kapinos 1) http://www.open-mpi.org/faq/?category=ompio 2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php 3) (ROMIO is default; on local hard drive at node 'cluster') $ ompi_info | grep romio MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1) $ ompi_info | grep ompio MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1) $ mpif90 main.f90 $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out; fileOffset, fileSize1010 fileOffset, fileSize2626 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 $ export OMPI_MCA_io=ompio $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out; fileOffset, fileSize 010 fileOffset, fileSize 016 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 ___ users mailing list us...@open-mpi.org Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users Link to this post: http://www.open-mpi.org/community/lists/users/2015/12/28145.php -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature
Re: [OMPI users] OMPIO correctnes issues
Dear Edgar, On 12/09/15 16:16, Edgar Gabriel wrote: I tested your code in master and v1.10 ( on my local machine), and I get for both version of ompio exactly the same (correct) output that you had with romio. I've tested it at local hard disk.. pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[529]$ df -h . Filesystem Size Used Avail Use% Mounted on /dev/sda3 1.1T 16G 1.1T 2% /w0 pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[530]$ echo hell-o > out.txt; ./a.out fileOffset, fileSize 7 7 fileOffset, fileSize2323 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[531]$ export OMPI_MCA_io=ompio pk224850@cluster:/tmp/pk224850/cluster_15384/TMP[532]$ echo hell-o > out.txt; ./a.out fileOffset, fileSize 0 7 fileOffset, fileSize 016 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 However, I also noticed that in the ompio version that is in the v1.10 branch, the MPI_File_get_size function is not implemented on lustre. Yes we have Lustre in the cluster. I believe that was one of 'another' issues mentioned, yes some users tend to use Lustre as HPC file system =) Thanks Edgar On 12/9/2015 8:06 AM, Edgar Gabriel wrote: I will look at your test case and see what is going on in ompio. That being said, the vast number of fixes and improvements that went into ompio over the last two years were not back ported to the 1.8 (and thus 1.10) series, since it would have required changes to the interfaces of the frameworks involved (and thus would have violated one of rules of Open MPI release series) . Anyway, if there is a simple fix for your test case for the 1.10 series, I am happy to provide a patch. It might take me a day or two however. Edgar On 12/9/2015 6:24 AM, Paul Kapinos wrote: Sorry, forgot to mention: 1.10.1 Open MPI: 1.10.1 Open MPI repo revision: v1.10.0-178-gb80f802 Open MPI release date: Nov 03, 2015 Open RTE: 1.10.1 Open RTE repo revision: v1.10.0-178-gb80f802 Open RTE release date: Nov 03, 2015 OPAL: 1.10.1 OPAL repo revision: v1.10.0-178-gb80f802 OPAL release date: Nov 03, 2015 MPI API: 3.0.0 Ident string: 1.10.1 On 12/09/15 11:26, Gilles Gouaillardet wrote: Paul, which OpenMPI version are you using ? thanks for providing a simple reproducer, that will make things much easier from now. (and at first glance, that might not be a very tricky bug) Cheers, Gilles On Wednesday, December 9, 2015, Paul Kapinos <kapi...@itc.rwth-aachen.de <mailto:kapi...@itc.rwth-aachen.de>> wrote: Dear Open MPI developers, did OMPIO (1) reached 'usable-stable' state? As we reported in (2) we had some trouble in building Open MPI with ROMIO, which fact was hidden by OMPIO implementation stepping into the MPI_IO breach. The fact 'ROMIO isn't AVBL' was detected after users complained 'MPI_IO don't work as expected with version XYZ of OpenMPI' and further investigations. Take a look at the attached example. It deliver different result in case of using ROMIO and OMPIO even with 1 MPI rank on local hard disk, cf. (3). We've seen more examples of divergent behaviour but this one is quite handy. Is that a bug in OMPIO or did we miss something? 
Best Paul Kapinos 1) http://www.open-mpi.org/faq/?category=ompio 2) http://www.open-mpi.org/community/lists/devel/2015/12/18405.php 3) (ROMIO is default; on local hard drive at node 'cluster') $ ompi_info | grep romio MCA io: romio (MCA v2.0.0, API v2.0.0, Component v1.10.1) $ ompi_info | grep ompio MCA io: ompio (MCA v2.0.0, API v2.0.0, Component v1.10.1) $ mpif90 main.f90 $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out; fileOffset, fileSize1010 fileOffset, fileSize2626 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 $ export OMPI_MCA_io=ompio $ echo hello1234 > out.txt; $MPIEXEC -np 1 -H cluster ./a.out; fileOffset, fileSize 010 fileOffset, fileSize 016 ierr0 MPI_MODE_WRONLY, MPI_MODE_APPEND4 128 -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 _
[OMPI users] funny SIGSEGV in 'ompi_info'
Dear developers, although the following issue is definitely caused by a misconfiguration of Open MPI, SIGSEGVs in 'ompi_info' aren't a good thing, hence this mail. Just call: $ export OMPI_MCA_mtl="^tcp,^ib" $ ompi_info --param all all --level 9 ... and take a look at the core dump of 'ompi_info' below. (Yes, we know that "^tcp,^ib" is a bad idea.) Have a nice day, Paul Kapinos P.S. Open MPI 1.10.4 and 2.0.1 show the same behaviour -- [lnm001:39957] *** Process received signal *** [lnm001:39957] Signal: Segmentation fault (11) [lnm001:39957] Signal code: Address not mapped (1) [lnm001:39957] Failing at address: (nil) [lnm001:39957] [ 0] [0x3a9410] [lnm001:39957] [ 1] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(+0x2f11f)[0x2b30f084911f] [lnm001:39957] [ 2] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(+0x2f265)[0x2b30f0849265] [lnm001:39957] [ 3] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(opal_info_show_mca_params+0x91)[0x2b30f0849031] [lnm001:39957] [ 4] /opt/MPI/openmpi-1.10.4/linux/intel_16.0.2.181/lib/libopen-pal.so.13(opal_info_do_params+0x1f4)[0x2b30f0848e84] [lnm001:39957] [ 5] ompi_info[0x402643] [lnm001:39957] [ 6] /lib64/libc.so.6(__libc_start_main+0xf5)[0x2b30f1ca7b15] [lnm001:39957] [ 7] ompi_info[0x4022a9] [lnm001:39957] *** End of error message *** zsh: segmentation fault (core dumped) ompi_info --param all all --level 9 -- -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users
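For reference, independent of the crash itself: in the MCA selection syntax the '^' (exclude) prefix is given only once and applies to the whole comma-separated list, and exclusive and inclusive entries cannot be mixed. Also, tcp belongs to the BTL framework (and 'ib' is not a component name at all), so a correct exclusion would look more like the following (component names here only as examples):

export OMPI_MCA_btl="^tcp,openib"     # use every BTL except tcp and openib
export OMPI_MCA_mtl="^psm"            # exclude a single MTL component

Nevertheless, ompi_info should report a malformed value rather than segfault.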
Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v
Hello all, we seem to run into the same issue: 'mpif90' sigsegvs immediately for Open MPI 1.10.4 compiled using Intel compilers 16.0.4.258 and 16.0.3.210, while it works fine when compiled with 16.0.2.181. It seems to be a compiler issue (more exactly: library issue on libs delivered with 16.0.4.258 and 16.0.3.210 versions). Changing the version of compiler loaded back to 16.0.2.181 (=> change of dynamically loaded libs) let the prevously-failing binary (compiled with newer compilers) to work propperly. Compiling with -O0 does not help. As the issue is likely in the Intel libs (as said changing out these solves/raises the issue) we will do a failback to 16.0.2.181 compiler version. We will try to open a case by Intel - let's see... Have a nice day, Paul Kapinos On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote: Ok, good. I asked that question because typically when we see errors like this, it is usually either a busted compiler installation or inadvertently mixing the run-times of multiple different compilers in some kind of incompatible way. Specifically, the mpifort (aka mpif90) application is a fairly simple program -- there's no reason it should segv, especially with a stack trace that you sent that implies that it's dying early in startup, potentially even before it has hit any Open MPI code (i.e., it could even be pre-main). BTW, you might be able to get a more complete stack trace from the debugger that comes with the Intel compiler (idb? I don't remember offhand). Since you are able to run simple programs compiled by this compiler, it sounds like the compiler is working fine. Good! The next thing to check is to see if somehow the compiler and/or run-time environments are getting mixed up. E.g., the apps were compiled for one compiler/run-time but are being used with another. Also ensure that any compiler/linker flags that you are passing to Open MPI's configure script are native and correct for the platform for which you're compiling (e.g., don't pass in flags that optimize for a different platform; that may result in generating machine code instructions that are invalid for your platform). Try recompiling/re-installing Open MPI from scratch, and if it still doesn't work, then send all the information listed here: https://www.open-mpi.org/community/help/ On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote: Yes, I've tried three simple "Hello world" programs in fortan, C and C++ and the compile and run with intel 16.0.3. The problem is with the openmpi compiled from source. Giacomo Rossi Ph.D., Space Engineer Research Fellow at Dept. of Mechanical and Aerospace Engineering, "Sapienza" University of Rome p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com Member of Fortran-FOSS-programmers 2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>: gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 GNU gdb (GDB) 7.11 Copyright (C) 2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". 
Type "apropos word" to search for commands related to "word"... Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done. (gdb) r -v Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v Program received signal SIGSEGV, Segmentation fault. 0x76858f38 in ?? () (gdb) bt #0 0x76858f38 in ?? () #1 0x77de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2 #2 0x77ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2 #3 0x77df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2 #4 0x774a in _dl_start () from /lib64/ld-linux-x86-64.so.2 #5 0x77dd9d98 in _start () from /lib64/ld-linux-x86-64.so.2 #6 0x0002 in ?? () #7 0x7fffaa8a in ?? () #8 0x7fffaab6 in ?? () #9 0x in ?? () Giacomo Rossi Ph.D., Space Engineer Research Fellow at Dept. of Mechanical and Aerospace Engineering, "Sapienza" University of Rome p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com Member of Fortran-FOSS-programmers 2016-05-05 10:44 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>: Here the result of ldd command: 'ldd /opt/openmpi/1.10.2/intel/16.0.3/b
Re: [OMPI users] Segmentation Fault (Core Dumped) on mpif90 -v
Hi all,

we discussed this issue with Intel compiler support, and it looks like they now know what the issue is and how to protect against it. It is a known issue resulting from a backwards incompatibility in an OS/glibc update, cf. https://sourceware.org/bugzilla/show_bug.cgi?id=20019

Affected versions of the Intel compilers: 16.0.3, 16.0.4
Not affected versions: 16.0.2, 17.0

So, simply do not use the affected versions (and hope for a bugfix update in the 16.x series if you cannot immediately upgrade to 17.x, like us, even though upgrading is the option Intel favours).

Have a nice Christmas time!
Paul Kapinos

On 12/14/16 13:29, Paul Kapinos wrote: Hello all, we seem to run into the same issue: 'mpif90' sigsegvs immediately for Open MPI 1.10.4 compiled using Intel compilers 16.0.4.258 and 16.0.3.210, while it works fine when compiled with 16.0.2.181. It seems to be a compiler issue (more exactly: a library issue in the libs delivered with the 16.0.4.258 and 16.0.3.210 versions). Changing the loaded compiler version back to 16.0.2.181 (=> change of dynamically loaded libs) lets the previously-failing binary (compiled with the newer compilers) work properly. Compiling with -O0 does not help. As the issue is likely in the Intel libs (as said, swapping these out solves/raises the issue) we will fall back to the 16.0.2.181 compiler version. We will try to open a case with Intel - let's see... Have a nice day, Paul Kapinos

On 05/06/16 14:10, Jeff Squyres (jsquyres) wrote: Ok, good. I asked that question because typically when we see errors like this, it is usually either a busted compiler installation or inadvertently mixing the run-times of multiple different compilers in some kind of incompatible way. Specifically, the mpifort (aka mpif90) application is a fairly simple program -- there's no reason it should segv, especially with a stack trace that you sent that implies that it's dying early in startup, potentially even before it has hit any Open MPI code (i.e., it could even be pre-main). BTW, you might be able to get a more complete stack trace from the debugger that comes with the Intel compiler (idb? I don't remember offhand). Since you are able to run simple programs compiled by this compiler, it sounds like the compiler is working fine. Good! The next thing to check is to see if somehow the compiler and/or run-time environments are getting mixed up. E.g., the apps were compiled for one compiler/run-time but are being used with another. Also ensure that any compiler/linker flags that you are passing to Open MPI's configure script are native and correct for the platform for which you're compiling (e.g., don't pass in flags that optimize for a different platform; that may result in generating machine code instructions that are invalid for your platform). Try recompiling/re-installing Open MPI from scratch, and if it still doesn't work, then send all the information listed here: https://www.open-mpi.org/community/help/

On May 6, 2016, at 3:45 AM, Giacomo Rossi <giacom...@gmail.com> wrote: Yes, I've tried three simple "Hello world" programs in Fortran, C and C++, and they compile and run with intel 16.0.3. The problem is with the openmpi compiled from source. Giacomo Rossi Ph.D., Space Engineer Research Fellow at Dept.
of Mechanical and Aerospace Engineering, "Sapienza" University of Rome p: (+39) 0692927207 | m: (+39) 3408816643 | e: giacom...@gmail.com Member of Fortran-FOSS-programmers 2016-05-05 11:15 GMT+02:00 Giacomo Rossi <giacom...@gmail.com>: gdb /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 GNU gdb (GDB) 7.11 Copyright (C) 2016 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90...(no debugging symbols found)...done. (gdb) r -v Starting program: /opt/openmpi/1.10.2/intel/16.0.3/bin/mpif90 -v Program received signal SIGSEGV, Segmentation fault. 0x76858f38 in ?? () (gdb) bt #0 0x76858f38 in ?? () #1 0x77de5828 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2 #2 0x77ddcfa3 in dl_main () from /lib64/ld-linux-x86-64.so.2 #3 0x77df029c in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2 #4 0x774a in _dl_start () from /lib64/l
Re: [OMPI users] openib/mpi_alloc_mem pathology
Jeff, I confirm: your patch did it. (Tried on 1.10.6 - we do not even need to rebuild the cp2k.popt binary, just load another Open MPI version compiled with Jeff's patch.) (On Intel OmniPath it gives the same speed as with --mca btl ^tcp,openib.)

On 03/16/17 01:03, Jeff Squyres (jsquyres) wrote: It looks like there were 3 separate threads on this CP2K issue, but I think we developers got sidetracked because there was a bunch of talk in the other threads about PSM, non-IB(verbs) networks, etc.

So: the real issue is an app is experiencing a lot of slowdown when calling MPI_ALLOC_MEM/MPI_FREE_MEM when the openib BTL is involved. The MPI_*_MEM calls are "slow" when used with the openib BTL because we're registering the memory every time you call MPI_ALLOC_MEM and deregistering the memory every time you call MPI_FREE_MEM. This was intended as an optimization such that the memory is already registered when you invoke an MPI communications function with that buffer. I guess we didn't really anticipate the case where *every* allocation goes through ALLOC_MEM... Meaning: if the app is aggressive in using MPI_*_MEM *everywhere* -- even for buffers that aren't used for MPI communication -- I guess you could end up with lots of useless registration/deregistration. If the app does it a lot, that could be the source of quite a lot of needless overhead.

We don't have a run-time bypass of this behavior (i.e., we assumed that if you're calling MPI_*_MEM, you mean to do so). But let's try an experiment -- can you try applying this patch and see if it removes the slowness? This patch basically removes the registration / deregistration with ALLOC/FREE_MEM (and instead handles it lazily / upon demand when buffers are passed to MPI functions):

```patch
diff --git a/ompi/mpi/c/alloc_mem.c b/ompi/mpi/c/alloc_mem.c
index 8c8fb8cd54..c62c8ff706 100644
--- a/ompi/mpi/c/alloc_mem.c
+++ b/ompi/mpi/c/alloc_mem.c
@@ -74,6 +74,7 @@ int MPI_Alloc_mem(MPI_Aint size, MPI_Info info, void *baseptr)
     OPAL_CR_ENTER_LIBRARY();

+#if 0
     if (MPI_INFO_NULL != info) {
         int flag;
         (void) ompi_info_get (info, "mpool_hints", MPI_MAX_INFO_VAL, info_value,
@@ -84,6 +85,9 @@ int MPI_Alloc_mem(MPI_Aint size, MPI_Info info, void *baseptr)
     *((void **) baseptr) = mca_mpool_base_alloc ((size_t) size, (struct opal_info_t *) info,
                                                  mpool_hints);
+#else
+    *((void **) baseptr) = malloc(size);
+#endif

     OPAL_CR_EXIT_LIBRARY();
     if (NULL == *((void **) baseptr)) {
         return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_NO_MEM,
diff --git a/ompi/mpi/c/free_mem.c b/ompi/mpi/c/free_mem.c
index 4498fc8bb1..4c65ea2339 100644
--- a/ompi/mpi/c/free_mem.c
+++ b/ompi/mpi/c/free_mem.c
@@ -50,10 +50,16 @@ int MPI_Free_mem(void *baseptr)
        If you call MPI_ALLOC_MEM with a size of 0, you get NULL
        back.  So don't consider a NULL == baseptr an error. */
+#if 0
     if (NULL != baseptr && OMPI_SUCCESS != mca_mpool_base_free(baseptr)) {
         OPAL_CR_EXIT_LIBRARY();
         return OMPI_ERRHANDLER_INVOKE(MPI_COMM_WORLD, MPI_ERR_NO_MEM, FUNC_NAME);
     }
+#else
+    if (NULL != baseptr) {
+        free(baseptr);
+    }
+#endif

     OPAL_CR_EXIT_LIBRARY();

     return MPI_SUCCESS;
```

This will at least tell us if the innards of our ALLOC_MEM/FREE_MEM (i.e., likely the registration/deregistration) are causing the issue.

On Mar 15, 2017, at 1:27 PM, Dave Love <dave.l...@manchester.ac.uk> wrote: Paul Kapinos <kapi...@itc.rwth-aachen.de> writes: Nathan, unfortunately '--mca memory_linux_disable 1' does not help on this issue - it does not change the behaviour at all.
Note that the pathological behaviour is present in Open MPI 2.0.2 as well as in /1.10.x, and Intel OmniPath (OPA) network-capable nodes are affected only. [I guess that should have been "too" rather than "only". It's loading the openib btl that is the problem.] The known workaround is to disable InfiniBand failback by '--mca btl ^tcp,openib' on nodes with OPA network. (On IB nodes, the same tweak lead to 5% performance improvement on single-node jobs; It was a lot more than that in my cp2k test. but obviously disabling IB on nodes connected via IB is not a solution for multi-node jobs, huh). But it works OK with libfabric (ofi mtl). Is there a problem with libfabric? Has anyone reported this issue to the cp2k people? I know it's not their problem, but I assume they'd like to know for users' sake, particularly if it's not going to be addressed. I wonder what else might be affected. ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24
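The effect described in this thread is straightforward to observe in isolation. The following is a hypothetical micro-benchmark sketch (not part of CP2K or the original report): it times repeated MPI_Alloc_mem / MPI_Free_mem cycles. With the openib BTL loaded, each cycle also pays for memory registration and deregistration; running the same binary with '--mca btl ^tcp,openib', or with the patch above, should reduce each cycle to essentially a malloc/free.

```c
/* Hypothetical micro-benchmark: time repeated MPI_Alloc_mem / MPI_Free_mem
 * cycles.  Compare a default run against one started with
 * "--mca btl ^tcp,openib" to see the registration overhead discussed above. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    const int      iterations = 10000;
    const MPI_Aint size       = 1 << 20;   /* 1 MiB per allocation */
    void          *buf;
    double         t0, t1;

    MPI_Init(&argc, &argv);

    t0 = MPI_Wtime();
    for (int i = 0; i < iterations; i++) {
        MPI_Alloc_mem(size, MPI_INFO_NULL, &buf);
        MPI_Free_mem(buf);
    }
    t1 = MPI_Wtime();

    printf("%d alloc/free cycles of %ld bytes: %.3f s (%.2f us per cycle)\n",
           iterations, (long)size, t1 - t0, 1e6 * (t1 - t0) / iterations);

    MPI_Finalize();
    return 0;
}
```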
Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]
Hi, On 03/16/17 10:35, Alfio Lazzaro wrote: We would like to ask you which version of CP2K you are using in your tests Release 4.1 and if you can share with us your input file and output log. The question goes to Mr Mathias Schumacher, on CC: Best Paul Kapinos (Our internal ticketing system also on CC:) -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users
Re: [OMPI users] openib/mpi_alloc_mem pathology
Nathan, unfortunately '--mca memory_linux_disable 1' does not help on this issue - it does not change the behaviour at all. Note that the pathological behaviour is present in Open MPI 2.0.2 as well as in /1.10.x, and Intel OmniPath (OPA) network-capable nodes are affected only. The known workaround is to disable InfiniBand failback by '--mca btl ^tcp,openib' on nodes with OPA network. (On IB nodes, the same tweak lead to 5% performance improvement on single-node jobs; but obviously disabling IB on nodes connected via IB is not a solution for multi-node jobs, huh). On 03/07/17 20:22, Nathan Hjelm wrote: If this is with 1.10.x or older run with --mca memory_linux_disable 1. There is a bad interaction between ptmalloc2 and psm2 support. This problem is not present in v2.0.x and newer. -Nathan On Mar 7, 2017, at 10:30 AM, Paul Kapinos <kapi...@itc.rwth-aachen.de> wrote: Hi Dave, On 03/06/17 18:09, Dave Love wrote: I've been looking at a new version of an application (cp2k, for for what it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't Welcome to the club! :o) In our measures we see some 70% of time in 'mpi_free_mem'... and 15x performance loss if using Open MPI vs. Intel MPI. So it goes. https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html think it did so the previous version I looked at. I found on an IB-based system it's spending about half its time in those allocation routines (according to its own profiling) -- a tad surprising. It turns out that's due to some pathological interaction with openib, and just having openib loaded. It shows up on a single-node run iff I don't suppress the openib btl, and doesn't for multi-node PSM runs iff I suppress openib (on a mixed Mellanox/Infinipath system). we're lucky - our issue is on Intel OmniPath (OPA) network (and we will junk IB hardware in near future, I think) - so we disabled the IB transport failback, --mca btl ^tcp,openib For single-node jobs this will also help on plain IB nodes, likely. (you can disable IB if you do not use it) Can anyone say why, and whether there's a workaround? (I can't easily diagnose what it's up to as ptrace is turned off on the system concerned, and I can't find anything relevant in archives.) I had the idea to try libfabric instead for multi-node jobs, and that doesn't show the pathological behaviour iff openib is suppressed. However, it requires ompi 1.10, not 1.8, which I was trying to use. ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users
Re: [OMPI users] openib/mpi_alloc_mem pathology
Hi Dave, On 03/06/17 18:09, Dave Love wrote: I've been looking at a new version of an application (cp2k, for for what it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't Welcome to the club! :o) In our measures we see some 70% of time in 'mpi_free_mem'... and 15x performance loss if using Open MPI vs. Intel MPI. So it goes. https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html think it did so the previous version I looked at. I found on an IB-based system it's spending about half its time in those allocation routines (according to its own profiling) -- a tad surprising. It turns out that's due to some pathological interaction with openib, and just having openib loaded. It shows up on a single-node run iff I don't suppress the openib btl, and doesn't for multi-node PSM runs iff I suppress openib (on a mixed Mellanox/Infinipath system). we're lucky - our issue is on Intel OmniPath (OPA) network (and we will junk IB hardware in near future, I think) - so we disabled the IB transport failback, --mca btl ^tcp,openib For single-node jobs this will also help on plain IB nodes, likely. (you can disable IB if you do not use it) Can anyone say why, and whether there's a workaround? (I can't easily diagnose what it's up to as ptrace is turned off on the system concerned, and I can't find anything relevant in archives.) I had the idea to try libfabric instead for multi-node jobs, and that doesn't show the pathological behaviour iff openib is suppressed. However, it requires ompi 1.10, not 1.8, which I was trying to use. ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users -- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center Seffenter Weg 23, D 52074 Aachen (Germany) Tel: +49 241/80-24915 smime.p7s Description: S/MIME Cryptographic Signature ___ users mailing list users@lists.open-mpi.org https://rfd.newmexicoconsortium.org/mailman/listinfo/users
Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?
Hi Mark,

On 02/18/17 09:14, Mark Dixon wrote:
> On Fri, 17 Feb 2017, r...@open-mpi.org wrote: Depends on the version, but if you are using something in the v2.x range, you should be okay with just one installed version
>
> How good is MPI_THREAD_MULTIPLE support these days and how far up the wishlist is it, please?

Note that in the 1.10.x series (even in 1.10.6), enabling MPI_THREAD_MULTIPLE leads to a (silent) shutdown of the InfiniBand fabric for that application => SLOW!

The 2.x versions (tested: 2.0.1) handle MPI_THREAD_MULTIPLE on InfiniBand the right way; however, due to the absence of memory hooks (= no aligned memory allocation) we get 20% less bandwidth on IB with the 2.x versions compared to the 1.10.x versions of Open MPI (regardless of whether MPI_THREAD_MULTIPLE support is enabled).

On the Intel OmniPath network both of the above issues seem to be absent, but due to a performance bug in MPI_Free_mem your application can be horribly slow (seen: CP2K) if the InfiniBand failback on OPA is not disabled manually, see https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html

Best,
Paul Kapinos

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center, Seffenter Weg 23, D 52074 Aachen (Germany), Tel: +49 241/80-24915
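For completeness: whether a build actually grants MPI_THREAD_MULTIPLE (rather than silently downgrading, as described above for the 1.10.x/InfiniBand case) can be checked at run time via the 'provided' level returned by MPI_Init_thread. A small illustrative sketch, assuming nothing beyond the standard MPI API:

```c
/* Hypothetical sketch: request MPI_THREAD_MULTIPLE and check what the
 * library actually provides.  A build without thread-multiple support
 * (or a transport that cannot support it) may return a lower level. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        printf("requested MPI_THREAD_MULTIPLE (%d), got only level %d\n",
               MPI_THREAD_MULTIPLE, provided);
    } else {
        printf("MPI_THREAD_MULTIPLE is available\n");
    }

    MPI_Finalize();
    return 0;
}
```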
Re: [OMPI users] Is building with "--enable-mpi-thread-multiple" recommended?
Hi,

On 03/03/17 12:41, Mark Dixon wrote: Your 20% memory bandwidth performance hit on 2.x and the OPA problem are concerning - will look at that. Are there tickets open for them?

OPA performance issue on CP2K (15x slowdown): https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html (cf. the thread). The workaround is to disable the IB failback on OPA:

> --mca btl ^tcp,openib

With this tweak on OPA, Open MPI's CP2K is less than 10% slower than Intel MPI's (the same result as on InfiniBand) - which is much, much better than 1500%, huh. However, Open MPI's CP2K still stays slower than Intel MPI's due to a worse MPI_Alltoallv, as far as I understood the profiles. I will mail the CP2K developers soon...

20% bandwidth loss with Open MPI 2.x: cf. https://www.mail-archive.com/devel@lists.open-mpi.org/msg00043.html - Nathan Hjelm says the hooks were removed intentionally. We have a (nasty) workaround, cf. https://www.mail-archive.com/devel@lists.open-mpi.org/msg00052.html As far as I can see, this issue affects InfiniBand only.

Best
Paul

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center, Seffenter Weg 23, D 52074 Aachen (Germany), Tel: +49 241/80-24915
Re: [OMPI users] Performance issues: 1.10.x vs 2.x
On 05/05/17 12:10, marcin.krotkiewski wrote:
> in my case it was enough to allocate my own arrays using posix_memalign.

Be happy. This did not work for Fortran codes..

> But since that worked, it means that 1.10.6 deals somehow better with unaligned data. Anyone know the reason for this?

In the 1.10.x series there were 'memory hooks' - Open MPI did take some care about the alignment. This was removed in the 2.x series, cf. the whole thread at my link.

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center, Seffenter Weg 23, D 52074 Aachen (Germany), Tel: +49 241/80-24915
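For C codes, the user-side workaround mentioned above is simple: allocate communication buffers with posix_memalign instead of plain malloc, so they are page-aligned regardless of whether the MPI library installs memory hooks. A hypothetical sketch (the buffer size and alignment are illustrative assumptions; as noted above, this does not directly help Fortran allocations):

```c
/* Hypothetical sketch: page-align a communication buffer yourself instead
 * of relying on the (removed) memory hooks for alignment. */
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    void  *buf = NULL;
    size_t nbytes    = 8 * 1024 * 1024;   /* e.g. an 8 MiB communication buffer */
    size_t alignment = 4096;               /* typical page size */
    int    rc;

    /* posix_memalign returns an error code instead of setting errno */
    rc = posix_memalign(&buf, alignment, nbytes);
    if (rc != 0) {
        fprintf(stderr, "posix_memalign failed with code %d\n", rc);
        return 1;
    }
    memset(buf, 0, nbytes);

    /* ... hand buf to MPI_Send / MPI_Recv / MPI_Alltoallv etc. ... */

    free(buf);
    return 0;
}
```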
Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]
Hi all,
sorry for the long, long latency - this message was buried in my mailbox for months.

On 03/16/2017 10:35 AM, Alfio Lazzaro wrote:
> Hello Dave and others,
> we jump in the discussion as CP2K developers.
> We would like to ask you which version of CP2K you are using in your tests

version 4.1 (release)

> and if you can share with us your input file and output log.

The input file is the property of Mathias Schumacher (CC:) and we need his permission to provide it.

> Some clarifications on the way we use MPI allocate/free:
> 1) only buffers used for MPI communications are allocated with MPI allocate/free
> 2) in general we use memory pools, therefore we reach a limit in the buffer sizes after some iterations, i.e. they are not reallocated anymore
> 3) there are some cases where we don't use memory pools, but their overall contribution should be very small. You can run with the CALLGRAPH option (https://www.cp2k.org/dev:profiling#the_cp2k_callgraph) to get more insight into where those allocations/deallocations are.

We ran the data set again with the CALLGRAPH option. Please have a look at the attached files. You see a callgraph file (from rank 0 of the 24 used) and some exported call tree views.

We can see that the *allocate* routines (mp_[de|]allocate_[i|d]) are called 33k vs. 28k times (multiply this by the 24 processes per node). In the 'good case' (Intel MPI, and Open MPI with the workaround) these calls take only a fraction of 1% of the time; in the 'bad case' (Open MPI w/o the workaround, attached) the two mp_deallocate_[i|d] calls use 81% of the time in 'Self', huh.

That's mainly the observation we made a long time ago: if, on a node with Intel OmniPath fabric, the failback to InfiniBand is not prohibited, the MPI_Free_mem() calls take ages. (I'm not familiar with CCachegrind, so forgive me if I got something wrong.)

Have a nice day,
Paul Kapinos

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center, Seffenter Weg 23, D 52074 Aachen (Germany), Tel: +49 241/80-24915

20171019-callgraph.tar.gz Description: application/gzip
Re: [OMPI users] openib/mpi_alloc_mem pathology [#20160912-1315]
On 10/20/2017 12:24 PM, Dave Love wrote:
> Paul Kapinos <kapi...@itc.rwth-aachen.de> writes:
>
>> Hi all,
>> sorry for the long, long latency - this message was buried in my mailbox for months
>>
>> On 03/16/2017 10:35 AM, Alfio Lazzaro wrote:
>>> Hello Dave and others,
>>> we jump in the discussion as CP2K developers.
>>> We would like to ask you which version of CP2K you are using in your tests
>>
>> version 4.1 (release)
>>
>>> and if you can share with us your input file and output log.
>>
>> The input file is the property of Mathias Schumacher (CC:) and we need his permission to provide it.
>
> I lost track of this, but the problem went away using libfabric instead of openib, so I left it at that, though libfabric hurt IMB pingpong latency compared with openib.
>
> I seem to remember there's a workaround in the cp2k development source, but that obviously doesn't solve the general problem.

The issue has two sides:

- CP2K used a lot of MPI_Alloc_mem() / MPI_Free_mem() calls. This was addressed by the CP2K developers (private mail by Alfio Lazzaro):

> in the new CP2K release (next week will have version 5.1), I have reduced the amount of MPI allocations. I have also added a flag to avoid any MPI allocations, that you can add in the CP2K input file:
>
> use_mpi_allocator F
> GLOBAL

We will test the new release after it is available. (Note: the user still has to think about disabling use_mpi_allocator, if in doubt.)

- In Open MPI compiled for both(1) IB and OPA, on a node with OPA, using the *default* configuration (failback to 'openib' *not prohibited*), the MPI_Free_mem() calls suddenly last much longer, starting to dominate the application run time. Known workaround: prohibit the failback to the 'openib' BTL by '-mca btl ^tcp,openib' - that's what we implemented.

It's up to the Open MPI developers whether they would like to follow up on this 'small' performance issue.

Best,
Paul Kapinos

P.S. It was hard work even to locate this issue, as only 2 tools (of the 7 or 8 tried) were able to point to the evil call...

(1) yes, we use the same Open MPI installation on islands with InfiniBand, with OmniPath, and even on Ethernet-only nodes.

-- Dipl.-Inform. Paul Kapinos - High Performance Computing, RWTH Aachen University, IT Center, Seffenter Weg 23, D 52074 Aachen (Germany), Tel: +49 241/80-24915
Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2
Dear Jeff,

I should like to point out that the NAG Fortran compiler is [and likely their developers are] the most picky and overly didactic Fortran compiler [developers] I know. (I have worked closely with more than 5 vendors and with dozens of compiler versions, and I reported some 200 bugs to the early development stage of the Mercurium Fortran compiler https://github.com/bsc-pm/mcxx and dozens to Intel's 'ifort' - sorry for praising myself :-)

In about 5 cases I firmly believed 'that is a bug in the NAG compiler!' because it did not compile code accepted (and often working!) by all other compilers - Intel, gfortran, Sun/Oracle Studio, PGI... Then I tried to open a case with NAG (once or twice, IIRC), and to read the Fortran language standard, and in *all* cases - without exception! - NAG's interpretation of the standard was the *right* one. (I cannot state that about gfortran and Intel, by the way.) So these guys may be snarky, but they can Fortran, definitely. And if the Open MPI bindings can be compiled by this compiler, they are likely very standard-conforming.

Have a nice day and a nice year 2022,
Paul Kapinos

On 12/30/21 16:27, Jeff Squyres (jsquyres) via users wrote: Snarky comments from the NAG tech support people aside, if they could be a little more specific about what non-conformant Fortran code they're referring to, we'd be happy to work with them to get it fixed. I'm one of the few people in the Open MPI dev community who has a clue about Fortran, and I'm *very far* from being a Fortran expert. Modern Fortran is a legitimately complicated language. So it doesn't surprise me that we might have some code in our configure tests that isn't quite right. Let's also keep in mind that the state of F2008 support varies widely across compilers and versions. The current Open MPI configure tests straddle the line of trying to find *enough* F2008 support in a given compiler to be sufficient for the mpi_f08 module without being so overly proscriptive as to disqualify compilers that aren't fully F2008-compliant. Frankly, the state of F2008 support across the various Fortran compilers was a mess when we wrote those configure tests; we had to cobble together a variety of complicated tests to figure out if any given compiler supported enough F2008 support for some / all of the mpi_f08 module. That's why the configure tests are... complicated. -- Jeff Squyres jsquy...@cisco.com

From: users on behalf of Matt Thompson via users
Sent: Thursday, December 23, 2021 11:41 AM
To: Wadud Miah
Cc: Matt Thompson; Open MPI Users
Subject: Re: [OMPI users] NAG Fortran 2018 bindings with Open MPI 4.1.2

I heard back from NAG: Regarding OpenMPI, we have attempted the build ourselves but cannot make sense of the configure script. Only the OpenMPI maintainers can do something about that and it looks like they assume that all compilers will just swallow non-conforming Fortran code. The error downgrading options for the NAG compiler remain "-dusty", "-mismatch" and "-mismatch_all" and none of them seem to help with the mpi_f08 module of OpenMPI. If there is a bug in the NAG Fortran Compiler that is responsible for this, we would love to hear about it, but at the moment we are not aware of such. So it might mean the configure script itself might need to be altered to use F2008-conforming code?

On Thu, Dec 23, 2021 at 8:31 AM Wadud Miah <wmiah...@gmail.com> wrote: You can contact NAG support at supp...@nag.co.uk but they will look into this in the new year.
Regards, On Thu, 23 Dec 2021, 13:18 Matt Thompson via users, mailto:users@lists.open-mpi.org>> wrote: Oh. Yes, I am on macOS. The Linux cluster I work on doesn't have NAG 7.1 on it...mainly because I haven't asked for it. Until NAG fix the bug we are seeing, I figured why bother the admins. Still, it does *seem* like it should work. I might ask NAG support about it. On Wed, Dec 22, 2021 at 6:28 PM Tom Kacvinsky mailto:tkacv...@gmail.com>> wrote: On Wed, Dec 22, 2021 at 5:45 PM Tom Kacvinsky mailto:tkacv...@gmail.com>> wrote: On Wed, Dec 22, 2021 at 4:11 PM Matt Thompson mailto:fort...@gmail.com>> wrote: All, When I build Open MPI with NAG, I have to pass in: FCFLAGS"=-mismatch_all -fpp" this flag tells nagfor to downgrade some errors with interfaces to warnings: -mismatch_all Further downgrade consistency checking of procedure argument lists so that calls to routines in the same file which are incorrect will produce warnings instead of error messages. This option disables -C=calls. The fpp flag is how you tell NAG to do preprocessing (it doesn't automatically do it with .F90 files). I also have to pass in a lot of other flags as seen here: https://github.com/mathomp4/parcelmodulefiles/blob/main/Compiler/