FYI, just noticed this post from the HDF Group: https://forum.hdfgroup.org/t/hdf5-and-openmpi/5437

/Peter K
On Wed, 20 Feb 2019 10:46:10 -0500 Adam LeBlanc wrote:
> Hello,
>
> When I do a run with OpenMPI v4.0.0 on Infiniband with this command:
> mpirun --mca btl_openib_warn_no_device_params_found 0 --map-by node
> --mca orte_base_help_aggregate 0 --mca btl openib,vader,self --mca
> pml ob1 --mca btl_openib_allow_ib 1 -np 6
> -hostfile /home/aleblanc/ib-mpi-hosts IMB-MPI1
>
> I get this error:
...
> # Benchmarking Reduce_scatter
...
> 2097152 20 8738.08 9340.50 9147.89
> [pandora:04500] *** Process received signal ***
> [pandora:04500] Signal: Segmentation fault (11)

This is very likely a bug in IMB, not in OpenMPI. It has been discussed
on the list before, in the thread "MPI_Reduce_Scatter Segmentation Fault
with Intel 2019 Update 1 Compilers on OPA-1...".

You can work around it by using an older IMB version (the bug is in the
newest version).

/Peter K
On Thu, 10 Jan 2019 21:51:03 +0900 Gilles Gouaillardet wrote:
> Eduardo,
>
> You have two options to use OmniPath
>
> - "directly" via the psm2 mtl
> mpirun --mca pml cm --mca mtl psm2 ...
>
> - "indirectly" via libfabric
> mpirun --mca pml cm --mca mtl ofi ...
>
> I do invite you to try both. By explicitly requesting the mtl you will
> avoid potential conflicts.
>
> libfabric is used in production by Cisco and AWS (both major
> contributors to both Open MPI and libfabric) so this is clearly not
> something to stay away from.

Both I and a second person investigated 4.0.0rc on Omni-Path (see the
devel list thread "Re: [OMPI devel] Announcing Open MPI v4.0.0rc1").

At first both psm2 and ofi seemed broken, but it turned out psm2 only
had problems because ofi got in the way. And ofi was not that easily
excluded since it also had a btl component. Essentially I got it
working by deleting all mca files matching *ofi*.

YMMV,
Peter K
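PS. If deleting the *ofi* files is not an option, a gentler workaround
(a sketch only, untested here) should be to exclude the ofi components
on the command line while forcing psm2, something like:

mpirun --mca pml cm --mca mtl psm2 --mca btl ^ofi ...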
On Thu, 10 Jan 2019 11:20:12 + ROTHE Eduardo - externe wrote:
> Hi Gilles, thank you so much for your support!
>
> For now I'm just testing the software, so it's running on a single
> node.
>
> Your suggestion was very precise. In fact, choosing the ob1 component
> leads to a successful execution! The tcp component had no effect.
>
> mpirun --mca pml ob1 --mca btl tcp,self -np 2 ./a.out
> Success
> mpirun --mca pml ob1 -np 2 ./a.out
> Success
>
> But... our cluster is equipped with Intel Omni-Path interconnects and
> we are aiming to use psm2 through the ofi component in order to take
> full advantage of this technology.

Ofi support in OpenMPI has been something to stay away from in my
experience. You should just use the psm2 mtl instead.

/Peter K
On Sat, 22 Dec 2018 12:42:24 -0500 Bennet Fauber wrote:
> Maybe the distribution tar ball at
>
> https://download.open-mpi.org/release/open-mpi/v3.1/openmpi-3.1.3.tar.gz
>
> did not get refreshed after the fix in
>
> https://github.com/bosilca/ompi/commit/b902cd5eb765ada57f06c75048509d0716953549
>
> was implemented? I downloaded the tarball from open-mpi.org today, 22
> Dec, and compiled and I get the warnings.

The 3.1.3 tarball will always be exactly what it was when released
(i.e. the 3.1.3 tag in git). The commit you refer to was merged to the
3.1.x branch after 3.1.3 and will as such be available in 3.1.4 if
nothing unexpected happens.

If you want an unreleased 3.1.x you can use the corresponding nightly
build found at:

https://www.open-mpi.org/nightly/v3.1.x/

Also, the commit on the v3.1.x branch is:

commit 9cce716e75b15c2fd7b1a017d807fe2e733e6ee6
Merge: 1704063162 00ab40cd79
Author: Ralph Castain
Date:   Tue Dec 4 06:14:41 2018 -0800

    Merge pull request #6038 from hppritcha/topic/swat_issue5810_v3.1.x

    btl/openib: fix a problem with ib query

/Peter
On Tue, 4 Dec 2018 09:15:13 -0500 George Bosilca wrote:
> I'm trying to replicate using the same compiler (icc 2019) on my OSX
> over TCP and shared memory with no luck so far. So either the
> segfault it's something specific to OmniPath or to the memcpy
> implementation used on Skylake.

Note that it's IMB 2019.1 that is the problem (I think). And I did get
it to crash even on a single node (Skylake / CentOS 7).

/Peter
On Mon, 3 Dec 2018 19:41:25 + "Hammond, Simon David via users" wrote:
> Hi Open MPI Users,
>
> Just wanted to report a bug we have seen with OpenMPI 3.1.3 and 4.0.0
> when using the Intel 2019 Update 1 compilers on our
> Skylake/OmniPath-1 cluster. The bug occurs when running the Github
> master src_c variant of the Intel MPI Benchmarks.

I've noticed this also when using Intel MPI (2018 and 2019u1). I
classified it as a bug in IMB but didn't look too deep (new
reduce_scatter code).

/Peter K
Re: [OMPI users] MPI cartesian grid : cumulate a scalar value through the procs of a given axis of the grid
On Wed, 2 May 2018 08:39:30 -0400 Charles Antonelli wrote:
> This seems to be crying out for MPI_Reduce.

No, the described reduction cannot be implemented with MPI_Reduce
(note the need for partial sums along the axis).

> Also in the previous solution given, I think you should do the
> MPI_Sends first. Doing the MPI_Receives first forces serialization.

It needs that serialization. The first thing that happens is that the
first rank skips the recv and sends its SCAL to the 2nd process, which
has just posted its recv. Each process needs to complete the recv to
know what to send (unless you split it out into many more sends, which
is possible).

Which solution is best depends on whether this part is performance
critical and on how large K is.

/Peter K

> Regards,
> Charles
...
> > Something like (simplified pseudo code):
> >
> > if (not_first_along_K)
> >    MPI_RECV(SCAL_tmp, previous)
> >    SCAL += SCAL_tmp
> >
> > if (not_last_along_K)
> >    MPI_SEND(SCAL, next)
> >
> > /Peter K
Re: [OMPI users] MPI cartesian grid : cumulate a scalar value through the procs of a given axis of the grid
On Wed, 02 May 2018 06:32:16 -0600 Nathan Hjelm wrote:
> Hit send before I finished. If each proc along the axis needs the
> partial sum (ie proc j gets sum for i = 0 -> j-1 SCAL[j]) then
> MPI_Scan will do that.

I must confess that I had forgotten about MPI_Scan when I replied to
the OP. In fact, I don't think I've ever used it... :-)

/Peter K
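PS. A minimal C sketch of what using MPI_Scan for this could look like
(the communicator name is made up for illustration; it is assumed to
contain only the ranks along the K axis, e.g. as created by
MPI_Cart_sub):

#include <mpi.h>

/* Inclusive prefix sum of each rank's contribution along axis_comm.
 * After the call, rank j holds the sum of the contributions of
 * ranks 0..j on that communicator. */
double cumulate_along_axis(double scal, MPI_Comm axis_comm)
{
    double partial = 0.0;
    MPI_Scan(&scal, &partial, 1, MPI_DOUBLE, MPI_SUM, axis_comm);
    return partial;
}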
Re: [OMPI users] MPI cartesian grid : cumulate a scalar value through the procs of a given axis of the grid
On Wed, 2 May 2018 11:15:09 +0200 Pierre Gubernatis wrote:
> Hello all...
>
> I am using a *cartesian grid* of processors which represents a spatial
> domain (a cubic geometrical domain split into several smaller
> cubes...), and I have communicators to address the procs, as for
> example a comm along each of the 3 axes I,J,K, or along a plane
> IK,JK,IJ, etc..).
>
> *I need to cumulate a scalar value (SCAL) through the procs which
> belong to a given axis* (let's say the K axis, defined by I=J=0).
>
> Precisely, the origin proc 0-0-0 has a given value for SCAL (say
> SCAL000). I need to update the 'following' proc (0-0-1) by doing SCAL
> = SCAL + SCAL000, and I need to *propagate* this updating along the K
> axis. At the end, the last proc of the axis should have the total sum
> of SCAL over the axis. (and of course, at a given rank k along the
> axis, the SCAL value = sum over 0,1, K of SCAL)
>
> Please, do you see a way to do this? I have tried many things (with
> MPI_SENDRECV and by looping over the procs of the axis, but I get
> deadlocks that prove I don't handle this correctly...)
> Thank you in any case.

Why did you try SENDRECV? As far as I understand your description
above, data only flows in one direction (along K)?

There is no MPI collective to support the kind of reduction you
describe, but it should not be hard to do using normal SEND and RECV.
Something like (simplified pseudo code):

if (not_first_along_K)
   MPI_RECV(SCAL_tmp, previous)
   SCAL += SCAL_tmp

if (not_last_along_K)
   MPI_SEND(SCAL, next)

/Peter K
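PS. For the record, a C sketch of the pseudo code above (names are
made up for illustration; the communicator is assumed to contain only
the ranks along K, e.g. as returned by MPI_Cart_sub; error checking
omitted):

#include <mpi.h>

/* Propagate a running sum of *scal along axis_comm.
 * Rank 0 is the first proc along K, rank size-1 the last.
 * On return, rank j holds the sum of the contributions of ranks 0..j. */
void cumulate_chain(double *scal, MPI_Comm axis_comm)
{
    int rank, size;
    double tmp;

    MPI_Comm_rank(axis_comm, &rank);
    MPI_Comm_size(axis_comm, &size);

    if (rank > 0) {                      /* not first along K */
        MPI_Recv(&tmp, 1, MPI_DOUBLE, rank - 1, 0, axis_comm,
                 MPI_STATUS_IGNORE);
        *scal += tmp;
    }
    if (rank < size - 1)                 /* not last along K */
        MPI_Send(scal, 1, MPI_DOUBLE, rank + 1, 0, axis_comm);
}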
On Wed, 13 Dec 2017 20:34:52 +0330 Mahmood Naderan wrote:
> >Currently I am using two Tesla K40m cards for my computational work
> >on the quantum espresso (QE) suite http://www.quantum-espresso.org/.
> >My GPU enabled QE code runs much slower than the normal version
>
> Hi,
> When I hear such words, I would say, yeah it is quite natural!
>
> My personal experience with a GPU (Quadro M2000) was actually a
> failure and loss of money. With various models, configs and companies,
> it is very hard to determine if a GPU product really boosts the
> performance

Agreed. GPU performance is not a given. It depends on the app, version,
input files, hardware, job geometry, ...

> At the end of the day, I think companies put all good features in
> their high-end products (multi thousand dollar ones). So, I think the
> K40m version, where it uses passive cooling, misses many good features
> although it has 12GB of GDDR5.

The K40m is the very high end (of the previous generation, Kepler). The
only higher-specced GPU is the K80, which is just two slightly less
impressive K40s in one package.

As far as "passively cooled" goes, it's a server component where the
server is expected to provide the needed airflow. The K40m is a high
TDP part.

/Peter
Re: [OMPI users] IMB-MPI1 hangs after 30 minutes with Open MPI 3.0.0 (was: Openmpi 1.10.4 crashes with 1024 processes)
On Fri, 1 Dec 2017 21:32:35 +0100 Götz Waschk wrote:
...
> # Benchmarking Alltoall
> # #processes = 1024
> #
>    #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>         0         1000         0.04         0.09         0.05
>         1         1000       253.40       335.35       293.06
>         2         1000       266.93       346.65       306.23
>         4         1000       303.52       382.41       342.21
>         8         1000       383.89       493.56       439.34
>        16         1000       501.27       627.84       569.80
>        32         1000      1039.65      1259.70      1163.12
>        64         1000      1710.12      2071.47      1910.62
>       128         1000      3051.68      3653.44      3398.65

As a potentially interesting data point, I dug through my archive of
IMB output and found an example that also showed something strange
happening at the 128 to 256 byte transition on alltoall at 1024 ranks
(although in my case it didn't completely hang):

# Benchmarking Alltoall
# #processes = 1024
#
   #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
        1         1000       417.44       417.59       417.54
        2         1000       410.50       410.72       410.67
        4         1000       365.92       366.21       365.99
        8         1000       583.21       583.51       583.37
       16         1000       652.90       653.09       652.98
       32         1000       982.09       982.42       982.28
       64         1000      2090.70      2091.11      2090.90
      128         1000      2590.91      2591.93      2591.44
      256           93     70077.42     70219.70     70174.85
      512           93     88611.39     88711.53     88672.84

My output was from a run on OpenMPI 1.7.6 on CentOS 6 on Mellanox FDR
IB (using the normal verbs/openib transport).

/Peter K
On Tue, 10 Oct 2017 11:57:51 -0400 Michael Di Domenico wrote:
> i'm getting stuck trying to run some fairly large IMB-MPI alltoall
> tests under openmpi 2.0.2 on rhel 7.4

What IB stack is used, just the RHEL inbox one? Do you run OpenMPI
with the psm mtl for QLogic and the openib btl for Mellanox, or
something different?

> i have two different clusters, one running mellanox fdr10 and one
> running qlogic qdr
>
> if i issue
>
> mpirun -n 1024 ./IMB-MPI1 -npmin 1024 -iter 1 -mem 2.001 alltoallv

Does it work if you run with something that more obviously fits in
RAM? Like "-mem 0.2"

/Peter K
HPGMG-FV is easy to build and easy to run as serial, MPI, OpenMP and
MPI+OpenMP.

/Peter

On Mon, 9 Oct 2017 17:54:02 + "Sasso, John (GE Digital, consultant)" wrote:
> I am looking for a decent hybrid MPI+OpenMP benchmark utility which I
> can easily build and run with OpenMPI 1.6.5 (at least) and OpenMP
> under Linux, using a GCC build of OpenMPI as well as the Intel
> Compiler suite. I have looked at CP2K but that is much too complex a
> build for its own good (I managed to build all the prerequisite
> libraries, only to have the build of cp2k itself just fail). Also
> looked at HOMB 1.0.
>
> I am wondering what others have used. The build should be simple and
> not require a large # of prereq libraries to build beforehand.
> Thanks!
>
> --john
On Thu, 14 Sep 2017 19:01:08 +0900 Gilles Gouaillardet wrote:
> Peter and all,
>
> an easier option is to configure Open MPI with
> --mpirun-prefix-by-default this will automagically add rpath to the
> libs.

Yes, that sorts out the OpenMPI libs, but I was imagining a more
general situation (and the OP later tried adding OpenBLAS). It's also
only available if the OpenMPI in question was built with it, or if you
can rebuild OpenMPI. The OP seems at least partially interested in
additional libraries and in not rebuilding the system-provided OpenMPI.

/Peter
On Thu, 14 Sep 2017 14:28:08 +0430 Mahmood Naderan wrote:
> >In short, "mpicc -Wl,-rpath=/my/lib/path helloworld.c -o hello", will
> >compile a dynamic binary "hello" with built in search path
> >to "/my/lib/path".
>
> Excuse me... Is that a path or file? I get this:

It should be a path, i.e. a directory.

> mpif90 -g -pthread -Wl,rpath=/share/apps/computer/OpenBLAS-0.2.18 -o
> iotk_print_kinds.x iotk_print_kinds.o libiotk.a
> /usr/bin/ld: rpath=/share/apps/computer/OpenBLAS-0.2.18: No such
> file: No such file or directory

I think it didn't like passing "-rpath=/a/b/c" in one chunk. Try this
variant:

-Wl,-rpath,/path/to/directory

or even:

-Wl,-rpath -Wl,/path/to/directory

/Peter
On Wed, 13 Sep 2017 20:13:54 +0430 Mahmood Naderan wrote:
...
> `/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/libc.a(strcmp.o)'
> can not be used when making an executable; recompile with -fPIE and
> relink with -pie
> collect2: ld returned 1 exit status
>
> With such an error, I thought it is better to forget static linking!
> (as it is related to libc) and work with the shared libs and
> LD_LIBRARY_PATH

First, I think giving up on static linking is the right choice.

If the main thing you were after was the convenience of a binary that
will run without the need to set up LD_LIBRARY_PATH correctly, you
should have a look at passing -rpath to the linker.

In short, "mpicc -Wl,-rpath=/my/lib/path helloworld.c -o hello" will
compile a dynamic binary "hello" with a built-in search path
to "/my/lib/path".

With OpenMPI this will be added as a "runpath" due to how the wrappers
are designed. Both rpath and runpath work for finding "/my/lib/path"
without LD_LIBRARY_PATH, but the difference is in priority: rpath is
higher priority than LD_LIBRARY_PATH etc. and runpath is lower.

You can check the rpath or runpath in a binary using the command
chrpath (the package on rhel/centos/... is chrpath):

$ chrpath hello
hello: RUNPATH=/my/lib/path

If what you really wanted is the rpath behaviour (winning over any
LD_LIBRARY_PATH in the environment etc.) then you need to modify the
openmpi wrappers (rebuild openmpi) such that they do NOT pass
"--enable-new-dtags" to the linker.

/Peter
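PS. If chrpath is not installed, readelf should (as far as I know) show
the same information from the dynamic section:

$ readelf -d hello | grep -i -E 'rpath|runpath'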
On Wed, 15 Jun 2016 15:00:05 +0530 Sreenidhi Bharathkar Ramesh wrote:
> hi Mehmet / Llolsten / Peter,
>
> Just curious to know what is the NIC or fabric you are using in your
> respective clusters.
>
> If it is Mellanox, is it not better to use the MLNX_OFED?

We run both Mellanox ConnectX-3 based clusters and Intel TrueScale.
Today it may be warranted to look into a specific driver stack if
you're using Omni-Path or newer Mellanox HCAs using the mlx5 driver
(ConnectX-4 / Connect-IB).

/Peter K

> This information may help us build our cluster. Hence, asking.
>
> Thanks,
> - Sreenidhi.
On Tue, 14 Jun 2016 13:18:33 -0400 "Llolsten Kaonga" wrote:
> Hello Grigory,
>
> I am not sure what Redhat does exactly but when you install the OS,
> there is always an InfiniBand Support module during the installation
> process. We never check/install that module when we do OS
> installations because it is usually several versions of OFED behind
> (almost obsolete).

It's not as bad as you assume. Also, as I said before, it's not an OFED
version at all. We (and many other medium+ HPC centers) run the Red Hat
stack for the reasons that it is 1) good enough and 2) not an extra
complication for the system environment.

/Peter K (with ~3000 HPC nodes on RHEL IB for many years)
On Tue, 14 Jun 2016 16:20:42 + Grigory Shamov <grigory.sha...@umanitoba.ca> wrote:
> On 2016-06-14, 3:42 AM, "users on behalf of Peter Kjellström"
> <users-boun...@open-mpi.org on behalf of c...@nsc.liu.se> wrote:
>
> >On Mon, 13 Jun 2016 19:04:59 -0400
> >Mehmet Belgin <mehmet.bel...@oit.gatech.edu> wrote:
> >
> >> Greetings!
> >>
> >> We have not upgraded our OFED stack for a very long time, and still
> >> running on an ancient version (22.214.171.124, yeah we know). We are
> >> now considering a big jump from this version to a tested and stable
> >> recent version and would really appreciate any suggestions from the
> >> community.
> >
> >Some thoughts on the subject.
> >
> >* Not installing an external ibstack is quite attractive imo.
> >  RHEL/CentOS stack (not based on any direct OFED version) works fine
> >  for us. It simplifies cluster maintenance (kernel updates etc.).
>
> I am curious on how the Redhat stack is "not based on any direct OFED
> version"?
> Doesn't Redhat just ship an old OFED build, or do they do their own
> changes to it like to the kernel?

No, let's define things a bit. OFED is a packaging of many opensource
components with various upstreams. Simplified, it draws upon
kernel.org/linux-rdma for the kernel side and on many spread out user
side projects (mostly under the openfabrics umbrella). If you run an
upstream kernel and pull+build, for example, the current master branch
of the libraries you need, you're not running any form of OFED.

OFED does (mainly) three things in my view: 1) pick a set of versions
and test them together, 2) backport the kernel side to popular
enterprisy kernels, 3) put it all in a complete package.

Redhat does not base its IB stack on a specific OFED release.
Functionality is cherry-picked and backported from upstream (kernel),
and user space packages are pulled directly from their respective
upstream places (and updated when needed).

/Peter K
On Mon, 13 Jun 2016 19:04:59 -0400 Mehmet Belgin wrote:
> Greetings!
>
> We have not upgraded our OFED stack for a very long time, and still
> running on an ancient version (126.96.36.199, yeah we know). We are now
> considering a big jump from this version to a tested and stable
> recent version and would really appreciate any suggestions from the
> community.

Some thoughts on the subject.

* Not installing an external ibstack is quite attractive imo. The
  RHEL/CentOS stack (not based on any direct OFED version) works fine
  for us. It simplifies cluster maintenance (kernel updates etc.).

* If you use an external IB stack, consider the constraints it may put
  on your update plans (for example, you want to update to CentOS 7.3
  but your OFED only supports 7.2...).

* Also consider updates for the stack itself wrt. security. Upstream
  OFED has been quite good at patching security bugs but they DO NOT
  maintain older releases (-> you may have to run a nightly build of
  the latest). Mellanox has patched when poked at, but also only for
  the latest version. Intel does not seem to do security afaict. With a
  dist stack it's covered by the normal dist updates.

/Peter K
On Tuesday, September 13, 2011 09:07:32 AM nn3003 wrote:
> Hello !
>
> I am running the wrf model on 4x AMD 6172, which is a 12 core CPU. I
> use OpenMPI 1.4.3 and libgomp 4.3.4. I have binaries compiled for
> shared-memory and distributed-memory (OpenMP and OpenMPI). I use the
> following command
> mpirun -np 4 --cpus-per-proc 6 --report-bindings --bysocket wrf.exe
> It works ok and in top I see there are 4 wrf.exe and each has 6
> threads on cpu 0-5, 12-17, 24-29, 36-41. However, if I want to run 8
> or more, e.g.
> mpirun -np 4 --cpus-per-proc 12 --report-bindings --bysocket wrf.exe
> I get the error
> Your job has requested more cpus per process(rank) than there
> are cpus in a socket:
>  Cpus/rank: 8
>  #cpus/socket: 6
>
> Why is that? There are 12 cores per socket in the AMD 6172.

In reality a 12 core Magny-Cours is two 6 core dies on a socket. I'm
guessing that the topology code sees your 4x 12 cores as 8x 6 cores.

/Peter
On Wednesday, May 25, 2011 01:16:04 PM Andrew Senin wrote:
> Hello list,
>
> I have an application which uses MPI_Allgather with derived types. It
> works correctly with mpich2 and mvapich2. However it crashes
> periodically with openmpi2. After investigation I found that the
> crash takes place when I use derived datatypes with MPI_Allgather and
> the number of ranks is greater than 8.

Would 8 happen to be the number of cores you have per node, so that
what we're seeing is: single node OK, multi node FAIL? If so, what
kind of inter-node network are you (trying to) use?

/Peter
On Wednesday, May 25, 2011 01:16:04 PM Andrew Senin wrote:
> Hello list,
>
> I have an application which uses MPI_Allgather with derived types. It
> works correctly with mpich2 and mvapich2. However it crashes
> periodically with openmpi2.

Which version of OpenMPI are you using? There is no such thing as
openmpi2...

/Peter
On Wednesday, May 04, 2011 04:04:37 PM hi wrote:
> Greetings !!!
>
> I am observing the following error messages when executing the
> attached test program...
>
> C:\test>mpirun mar_f.exe
...
> [vbgyor:9920] *** An error occurred in MPI_Allreduce
> [vbgyor:9920] *** on communicator MPI_COMM_WORLD
> [vbgyor:9920] *** MPI_ERR_OP: invalid reduce operation
> [vbgyor:9920] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

I'm not a Fortran programmer, but it seems to me that placing the
MPI_Allreduce call in a subroutine like that broke the meaning of
MPI_SUM and MPI_REAL in that scope. Adding:

  include 'mpif.h'

after

  SUBROUTINE PAR_BLAS2(m, n, a, b, c, comm)

helps.

/Peter
On Monday, March 21, 2011 12:25:37 pm Dave Love wrote:
> I'm trying to test some new nodes with ConnectX adaptors, and failing
> to get (so far just) IMB to run on them.
...
> I'm using gcc-compiled OMPI 1.4.3 and the current RedHat 5 OFED with
> IMB 3.2.2, specifying `btl openib,sm,self' (or `mtl psm' on the Qlogic
> nodes). I'm not sure what else might be relevant. The output from
> trying to run IMB follows, for what it's worth.
>
> --
> At least one pair of MPI processes are unable to reach each other for
> MPI communications. This means that no Open MPI device has indicated
> that it can be used to communicate between these processes. This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other. This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[25307,1],2]) is on host: lvgig116
> Process 2 ([[25307,1],12]) is on host: lvgig117
> BTLs attempted: self sm

Are you sure you launched it correctly and that you have (re)built
OpenMPI against your RedHat 5 IB stack?

> Your MPI job is now going to abort; sorry.
...
> [lvgig116:07931] 19 more processes have sent help message
> help-mca-bml-r2.txt / unreachable proc
> [lvgig116:07931] Set MCA parameter

It seems to me that OpenMPI gave up because it didn't succeed in
initializing any inter-node btl/mtl. I'd suggest you try (roughly in
order):

1) ibstat on all nodes to verify that your IB interfaces are up
2) a verbs-level test (like ib_write_bw) to verify that data can flow
3) making sure your OpenMPI was built with the RedHat libibverbs-devel
   present (=> a suitable openib btl is built).

/Peter

> "orte_base_help_aggregate" to 0 to see all help / error messages
> [lvgig116:07931] 19 more processes have sent help message
> help-mpi-runtime / mpi_init:startup:internal-failure
On Monday, March 14, 2011 09:37:54 pm Bernardo F Costa wrote:
> Ok. Native ibverbs/openib is preferable although it cannot be used by
> all applications (those that do not have a native ip interface).

Applications (in this context at least) use the MPI interface. MPI in
general, and OpenMPI in particular, can and should run on top of verbs
(btl: openib) or psm (mtl: psm) (Mellanox or QLogic respectively).

/Peter
On Thursday, March 10, 2011 08:30:19 pm Thierry LAMOUREUX wrote:
> Hello,
>
> We have recently enhanced our network with Infiniband modules on a
> six node cluster.
>
> We have installed all OFED drivers related to our hardware.
>
> We have set network IPs like the following:
> - eth : 192.168.1.0 / 255.255.255.0
> - ib : 192.168.70.0 / 255.255.255.0
>
> After first tests all seems good. IB interfaces ping each other, ssh
> and other kinds of exchanges over IB work well.

A very important thing to realise is that TCP/IP on Infiniband, while
quite possible and sometimes useful, has very little to do with running
MPI/OpenMPI "using" Infiniband. MPI data transport can run on either
TCP/IP (btl: tcp) or natively on IB (for Mellanox btl: openib, for
Qlogic mtl: psm). On top of this, job startup uses TCP/IP.

> Then we started to run our job through openmpi (built with the
> --with-openib option) and our first results were very bad.

This builds the openib btl but it won't be used at runtime if there's
no active IB interface (I'm _NOT_ talking about interfaces as listed by
ifconfig). Check your IB with ibstat or similar.

Also, while it's possible to run MPI traffic on the openib btl (verbs)
on Qlogic cards, you'll have to use the psm mtl for good performance.

/Peter

> After investigations, our system has the following behaviour:
> - the job starts over the ib network (a few packets are sent)
> - the job switches to the eth network (all following packets are sent
>   to these interfaces)
>
> We never specified the IP address of our eth interfaces.
>
> We tried to launch our jobs with the following options:
> - mpirun -hostfile hostfile.list -mca blt openib,self
>   /common_gfs2/script-test.sh
> - mpirun -hostfile hostfile.list -mca blt openib,sm,self
>   /common_gfs2/script-test.sh
> - mpirun -hostfile hostfile.list -mca blt openib,self -mca
>   btl_tcp_if_exclude lo,eth0,eth1,eth2 /common_gfs2/script-test.sh
>
> The final behaviour remains the same: the job is initiated over ib
> and runs over eth.
>
> We grabbed performance test files (osu_bw and osu_latency) and we got
> not so bad results (see attached files).
>
> We have tried plenty of different things but we are stuck: we don't
> have any error message...
>
> Thanks in advance for your help.
>
> Thierry.
On Monday, January 10, 2011 03:06:06 pm Michael Di Domenico wrote:
> I'm not sure if these are being reported from OpenMPI or through
> OpenMPI from OpenFabrics, but I figured this would be a good place to
> start.
>
> On one node we received the below errors. I'm not sure I understand
> the error sequence, hopefully someone can shed some light on what
> happened.
>
> [[5691,1],49][btl_openib_component.c:3294:handle_wc] from node27 to:
...
> network is qlogic qdr end to end, openmpi 1.5 and ofed 1.5.2 (q stack)

Not really addressing your problem, but with QLogic you should be
using psm, not verbs (btl_openib). That said, openib should work
(slowly).

/Peter
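PS. If it doesn't pick psm by itself, forcing it explicitly should (I
believe) look something like:

mpirun --mca pml cm --mca mtl psm ...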
On Monday 06 December 2010 15:03:13 Mathieu Gontier wrote:
> Hi,
>
> A small update.
> My colleague made a mistake and there is no arithmetic performance
> issue. Sorry for bothering you.
>
> Nevertheless, one can observe some differences between MPICH and
> OpenMPI from 25% to 100% depending on the options we are using in our
> software. Tests are run on a single SGI node on 6 or 12 processes,
> and thus I am focused on the sm option.

A few previous threads on sm performance have been related to what
/tmp is. OpenMPI relies on (or at least used to rely on) this being
backed by page cache (tmpfs, a local ext3 or similar). I'm not sure
what the behaviour is in the latest version, but then again you didn't
say which version you've tried.

/Peter
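PS. A quick way to check what backs /tmp on the node is something like:

df -T /tmp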
On Friday 19 November 2010 01:03:35 HeeJin Kim wrote:
...
> * mlx4: There is a mismatch between the kernel and the userspace
> libraries: Kernel does not support XRC. Exiting.*
...
> What I'm thinking is that the infiniband card is installed but it
> doesn't work in the correct mode.
> My linux kernel version is *2.6.18-164.el5*, and the installed ofed
> version is *kernel-ib-pp-1.4.1-ofed20090528r1.4.1sgi605r1.rhel5

Why don't you, as a first step, try the IB software that is included
with EL5.4 (that is, don't install OFED)? We run several clusters this
way.

Also, consider updating to 5.5 (the version you're on includes several
security vulnerabilities).

/Peter
On Tuesday 05 September 2006 09:19, Aidaros Dev wrote:
> Nowadays we hear about the Intel dual-core processor. An Intel
> dual-core processor consists of two complete execution cores in one
> physical processor, both running at the same frequency. Both cores
> share the same packaging and the same interface with the
> chipset/memory.
> Can I use the MPI library to communicate between these processors?
> Can we consider them as separate?

You can treat one dual-core processor like it was two normal
single-core processors. As such, MPI works fine, as it does on any SMP.

/Peter
Hello Carsten,

Have you considered the possibility that this is the effect of a
non-optimal ethernet switch? I don't know how many nodes you need to
reproduce it on, or if you even have physical access (and opportunity),
but popping in another decent 16-port switch for a test run might be
interesting.

just my .02 euros,
Peter

On Tuesday 03 January 2006 18:45, Carsten Kutzner wrote:
> On Tue, 3 Jan 2006, Graham E Fagg wrote:
> > Do you have any tools such as Vampir (or its Intel equivalent)
> > available to get a time line graph? (even jumpshot of one of the bad
> > cases such as the 128/32 for 256 floats below would help).
>
> Hi Graham,
>
> I have attached an slog file of an all-to-all run for 1024 floats
> (ompi tuned alltoall). I could not get clog files for >32 processes -
> is this perhaps a limitation of MPE? So I decided to take the case 32
> CPUs on 32 nodes which is performance-critical as well. From the run
> output you can see that 2 of the 5 tries yield a fast execution while
> the others are slow (see below).
>
> Carsten
>
>
> ckutzne@node001:~/mpe> mpirun -hostfile ./bhost1 -np 32 ./phas_mpe.x
> Alltoall Test on 32 CPUs. 5 repetitions.
> --- New category (first test not counted) ---
> MPI: sending 1024 floats (4096 bytes) to 32 processes (1 times)
>      took ... 0.00690 seconds
> -
> MPI: sending 1024 floats (4096 bytes) to 32 processes (1 times)
>      took ... 0.00320 seconds
> MPI: sending 1024 floats (4096 bytes) to 32 processes (1 times)
>      took ... 0.26392 seconds !
> MPI: sending 1024 floats (4096 bytes) to 32 processes (1 times)
>      took ... 0.26868 seconds !
> MPI: sending 1024 floats (4096 bytes) to 32 processes (1 times)
>      took ... 0.26398 seconds !
> MPI: sending 1024 floats (4096 bytes) to 32 processes (1 times)
>      took ... 0.00339 seconds
> Summary (5-run average, timer resolution 0.01):
> 1024 floats took 0.160632 (0.143644) seconds. Min: 0.003200 max: 0.268681
> Writing logfile
> Finished writing logfile.

--
Peter Kjellström | National Supercomputer Centre | Sweden | http://www.nsc.liu.se
Hello,

First I'd like to say that I'm really happy and excited that public
access to svn is now open :-)

Here is what went fine: check-out, autogen, configure, make, ompi_info
and a simple mpi app (both build and run!!!)

Now I'd like to control which channels/transports/networks the data
flows over... I configured and built ompi against mvapi (mellanox
ibgd-1.8.0) and as far as I can tell it went well. Judging by the
behaviour of the tests I have done, it defaults to tcp (over ethernet
in my case). How do I select mvapi?

Here's some detailed information:

ompi-version: 1.0a1r6976
configure   : --prefix=/usr/local/openmpi-svn6976/intel-8.1e-027 \
              --with-btl-mvapi=/opt/ibgd/driver/infinihost
compilers   : icc, ifort 8.1.027 (64-bit for em64t)
os          : centos-4.1 64-bit (el4u1 rebuild)
kernel      : 2.6.9-11smp
mvapi       : mellanox ibgd-1.8.0
ompi_info | grep -i mvapi:
  MCA mpool : mvapi (MCA v1.0, API v1.0, Component v1.0)
  MCA btl   : mvapi (MCA v1.0, API v1.0, Component v1.0)
hardware    : dual Xeon Nocona, 2 GiB mem, Mellanox PCI-Express HCAs

tia,
Peter

--
Peter Kjellström | National Supercomputer Centre | Sweden | http://www.nsc.liu.se