Re: [OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Gilles Gouaillardet
Adam, Are you using btl/tcp (i.e. plain TCP/IP) for internode communications? Or are you using libfabric on top of the latest EC2 drivers? There is no flow control in btl/tcp, which means, for example, that if all your nodes send messages to rank 0, that can create a lot of unexpected messages on

Re: [OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Adam Sylvester
Gilles, It is btl/tcp (we'll be upgrading to newer EC2 types next year to take advantage of libfabric). I need to write a script to log and timestamp the memory usage of the process as reported by /proc//stat and sync that up with the application's log of what it's doing to say this
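
A minimal sketch of the kind of logging script described above, assuming Linux and assuming the path in the message refers to the per-process stat file under /proc. The function names (read_memory_bytes, format_sample, log_samples) and the output format are hypothetical, not taken from the thread. It reads the virtual size and resident set size (fields 23 and 24 of the stat file) and prefixes each sample with a timestamp so it can be lined up against an application log.

```python
import os
import time


def read_memory_bytes(pid="self"):
    """Return (vsize_bytes, rss_bytes) parsed from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The second field (comm) is wrapped in parentheses and may itself contain
    # spaces, so split only the text after the last ')'.
    fields = data[data.rfind(")") + 2:].split()
    # fields[0] is field 3 (state), so field 23 (vsize, bytes) is index 20
    # and field 24 (rss, pages) is index 21.
    vsize = int(fields[20])
    rss_pages = int(fields[21])
    return vsize, rss_pages * os.sysconf("SC_PAGE_SIZE")


def format_sample(pid="self"):
    """Build one timestamped log line for the given process."""
    vsize, rss = read_memory_bytes(pid)
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    return f"{stamp} pid={pid} vsize_bytes={vsize} rss_bytes={rss}"


def log_samples(pid="self", count=3, interval=1.0, emit=print):
    """Emit `count` samples, sleeping `interval` seconds between them."""
    lines = []
    for i in range(count):
        line = format_sample(pid)
        emit(line)
        lines.append(line)
        if i + 1 < count:
            time.sleep(interval)
    return lines
```

Calling log_samples(pid=12345, count=600, interval=1.0) would sample a process with the (made-up) PID 12345 once per second for ten minutes. The timestamps use local time, so the application log would need to use the same clock for the two to line up.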

Re: [OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Adam Sylvester
This case is actually quite small - 10 physical machines with 18 physical cores each, 1 rank per machine. These are AWS R4 instances (Intel Xeon E5 Broadwell processors). OpenMPI version 2.1.0, using TCP (10 Gbps). I calculate the memory needs of my application upfront (in this case ~225 GB per

[OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Adam Sylvester
Is there a way at runtime to query OpenMPI to ask it how much memory it's using for internal buffers? Is there a way at runtime to set a max amount of memory OpenMPI will use for these buffers? I have an application where for certain inputs OpenMPI appears to be allocating ~25 GB and I'm not

[OMPI users] CfP 7th ICCS/ALCHEMY track on Heterogeneous Computing

2018-12-20 Thread CUDENNEC Loic
Please accept our apologies if you receive multiple copies of this CfP. --- 7th ALCHEMY Track, as part of ICCS 2019 12-14th June, 2019, Faro, Algarve, Portugal Architecture, Languages, Compilation

Re: [OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Nathan Hjelm via users
How many nodes are you using? How many processes per node? What kind of processor? Open MPI version? 25 GB is several orders of magnitude more memory than should be used except at extreme scale (1M+ processes). Also, how are you calculating memory usage? -Nathan > On Dec 20, 2018, at 4:49 AM,

[OMPI users] v2.1.1 How to utilise multiple NIC ports

2018-12-20 Thread Bob Beattie
Hi everyone, I'm working on OpenFOAM v5 and have been successful in getting two nodes working together. (both 18.04 LTS connected via GbE) As both machines have a quad port gigabit NIC I have been trying to persuade mpirun to use more than a single link on each machine for its communications,

Re: [OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Gilles Gouaillardet
Adam, you can rewrite MPI_Allgatherv() in your app. It should simply invoke PMPI_Allgatherv() (note the leading 'P') with the same arguments, followed by MPI_Barrier() on the same communicator (feel free to also call MPI_Barrier() before PMPI_Allgatherv()). That can make your code slower, but it will

Re: [OMPI users] v2.1.1 How to utilise multiple NIC ports

2018-12-20 Thread Jeff Squyres (jsquyres) via users
On Dec 20, 2018, at 3:33 PM, Bob Beattie wrote:
> I'm working on OpenFOAM v5 and have been successful in getting two nodes working together. (both 18.04 LTS connected via GbE)
> As both machines have a quad port gigabit NIC I have been trying to persuade mpirun to use more than a single

Re: [OMPI users] Querying/limiting OpenMPI memory allocations

2018-12-20 Thread Jeff Hammond
You might try replacing MPI_Allgatherv with the equivalent Send+Recv followed by Broadcast. I don't think MPI_Allgatherv is particularly optimized (since it is hard to do and not a very popular function) and it might improve your memory utilization. Jeff On Thu, Dec 20, 2018 at 7:08 AM Adam