Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-06  Angel de Vicente via users
Hi, Joshua Ladd writes: > This is an ancient version of HCOLL. Please upgrade to the latest version (you can do this by installing HPC-X: https://www.mellanox.com/products/hpc-x-toolkit). Just to close the circle and report that all seems OK now. I don't have root permission on this machine […]

Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-05  Joshua Ladd via users
This is an ancient version of HCOLL. Please upgrade to the latest version (you can do this by installing HPC-X: https://www.mellanox.com/products/hpc-x-toolkit). Josh. On Wed, Feb 5, 2020 at 4:35 AM Angel de Vicente wrote: > Hi, Joshua Ladd writes: >> We cannot reproduce this. On four nodes […]

Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-05  Angel de Vicente via users
Hi, Joshua Ladd writes: > We cannot reproduce this. On four nodes 20 PPN with and w/o hcoll it takes exactly the same 19 secs (80 ranks). > What version of HCOLL are you using? Command line? Thanks for having a look at this. According to ompi_info, our OpenMPI (version 3.0.1) was configured […]

Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-04  Joshua Ladd via users
We cannot reproduce this. On four nodes 20 PPN with and w/o hcoll it takes exactly the same 19 secs (80 ranks). What version of HCOLL are you using? Command line? Josh. On Tue, Feb 4, 2020 at 8:44 AM George Bosilca via users <users@lists.open-mpi.org> wrote: > Hcoll will be present in many cases […]

Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-04  George Bosilca via users
Hcoll will be present in many cases; you don't really want to skip them all. I foresee two problems with the approach you propose: (1) collective components are selected per communicator, so even if they will not be used they are still loaded; (2) from outside the MPI library you have little access to int […]
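For readers following along: the usual alternative to skipping all collective components is to exclude just hcoll for the whole run. Below is a minimal sketch of that idea, assuming Open MPI's standard MCA mechanisms (the OMPI_MCA_coll environment variable, equivalent to `mpirun --mca coll ^hcoll` on the command line); it is an illustration, not necessarily what was done in this thread.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    /* Assumption: Open MPI reads MCA parameters from OMPI_MCA_* environment
     * variables during MPI_Init, so this has to be set before MPI_Init.
     * The more common route is the command line: mpirun --mca coll ^hcoll */
    setenv("OMPI_MCA_coll", "^hcoll", 1);

    MPI_Init(&argc, &argv);

    /* Collective components are selected per communicator, so communicators
     * created from here on are also set up without hcoll. */
    MPI_Comm dup;
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        printf("running with coll = ^hcoll\n");

    MPI_Comm_free(&dup);
    MPI_Finalize();
    return 0;
}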

Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-04  Angel de Vicente via users
Hi, George Bosilca writes: > If I'm not mistaken, hcoll is playing with the opal_progress in a way that conflicts with the blessed usage of progress in OMPI and prevents other components from advancing and timely completing requests. The impact is minimal for sequential applications using […]

Re: [OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-03  George Bosilca via users
If I'm not mistaken, hcoll is playing with the opal_progress in a way that conflicts with the blessed usage of progress in OMPI and prevents other components from advancing and timely completing requests. The impact is minimal for sequential applications using only blocking calls, but is jeopardizing […]
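To make the blocking vs. busy-waiting distinction concrete, here is a small illustration (my own sketch, not code from the thread; run with at least two ranks) of the usage pattern that depends on the progress engine being driven by repeated MPI_Test calls:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Blocking call: any progress needed happens inside MPI_Send. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Busy-wait completion: each MPI_Test call has to drive the progress
         * engine; this is the kind of loop that stops completing in a timely
         * way if a component interferes with progress. */
        MPI_Request req;
        int flag = 0;
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        while (!flag)
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}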

[OMPI users] Trouble with Mellanox's hcoll component and MPI_THREAD_MULTIPLE support?

2020-02-03  Angel de Vicente via users
Hi, in one of our codes we want to create a log of events that happen in the MPI processes, where the number of these events and their timing are unpredictable. So I implemented a simple test code, where process 0 creates a thread that is just busy-waiting for messages from any process, and which […]
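A minimal reconstruction of the kind of test described might look like the following. This is a sketch under assumptions, not the poster's actual code: the tags, the event count NEVENTS, and the listener() helper are invented for illustration. It requests MPI_THREAD_MULTIPLE, has rank 0 run a busy-waiting listener thread, and has every other rank send it a few log events.

#include <mpi.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define TAG_LOG  1
#define TAG_DONE 2
#define NEVENTS  5

static int world_size;

/* Listener running on rank 0: busy-wait (MPI_Iprobe) for log messages
 * from any rank until every other rank has announced completion. */
static void *listener(void *arg)
{
    (void)arg;
    int done = 0;
    while (done < world_size - 1) {
        int flag = 0;
        MPI_Status st;
        MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &st);
        if (!flag)
            continue;                      /* nothing yet: keep spinning */
        if (st.MPI_TAG == TAG_DONE) {
            int dummy;
            MPI_Recv(&dummy, 1, MPI_INT, st.MPI_SOURCE, TAG_DONE,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            done++;
        } else {
            int event;
            MPI_Recv(&event, 1, MPI_INT, st.MPI_SOURCE, TAG_LOG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("log: event %d from rank %d\n", event, st.MPI_SOURCE);
        }
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got %d)\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    if (rank == 0) {
        pthread_t tid;
        pthread_create(&tid, NULL, listener, NULL);
        pthread_join(tid, NULL);           /* returns once all ranks are done */
    } else {
        for (int i = 0; i < NEVENTS; i++)  /* "unpredictable" events, simplified */
            MPI_Send(&i, 1, MPI_INT, 0, TAG_LOG, MPI_COMM_WORLD);
        int dummy = 0;
        MPI_Send(&dummy, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}

Compiled with something like `mpicc -pthread thread_log_test.c -o thread_log_test` and launched with a few ranks, this is the MPI_THREAD_MULTIPLE pattern the thread is discussing in combination with the hcoll component.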