Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread George Bosilca
Sorry, I missed the 4.0 on the PR (despite being the first thing in the title). George. > On Sep 20, 2018, at 22:15 , Ralph H Castain wrote: > > That’s why we are leaving it in master - only removing it from release branch > > Sent from my iPhone > > On Sep 20, 2018, at 7:02 PM, George

Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread Ralph H Castain
That’s why we are leaving it in master - only removing it from release branch Sent from my iPhone > On Sep 20, 2018, at 7:02 PM, George Bosilca wrote: > > Why not simply ompi_ignore it? Removing a component to bring it back later > would force us to lose all history. I would rather add an

Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread George Bosilca
Why not simply ompi_ignore it? Removing a component to bring it back later would force us to lose all history. I would rather add an .ompi_ignore and give power users an opportunity to continue playing with it. George. On Thu, Sep 20, 2018 at 8:04 PM Ralph H Castain wrote: > I already

Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread Ralph H Castain
I already suggested the configure option, but it doesn’t solve the problem. I wouldn’t be terribly surprised to find that Cray also has an undetected problem given the nature of the issue - just a question of the amount of testing, variety of environments, etc. Nobody has to wait for the next

Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread Patinyasakdikul, Thananon
I understand and agree with your point. My initial email was just out of curiosity. Howard tested this BTL for Cray in the summer as well. So this seems to affect only OPA hardware. I just remember that in the summer, I had to make some changes in libpsm2 to get this BTL to work for OPA. Maybe

Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread Ralph H Castain
I suspect it is a question of what you tested and in which scenarios. Problem is that it can bite someone and there isn’t a clean/obvious solution that doesn’t require the user to do something - e.g., like having to know that they need to disable a BTL. Matias has proposed an mca-based

Re: [OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread Patinyasakdikul, Thananon
In the summer, I tested this BTL along with the MTL and was able to use both of them interchangeably with no problem. I don't know what changed. libpsm2? Arm On Thu, Sep 20, 2018, 7:06 PM Ralph H Castain wrote: > We have too many discussion threads overlapping on the same email chain - > so

[OMPI devel] OFI issues on Open MPI v4.0.0rc1

2018-09-20 Thread Ralph H Castain
We have too many discussion threads overlapping on the same email chain - so let’s break the discussion on the OFI problem into its own chain. We have been investigating this locally and found there are a number of conflicts between the MTLs and the OFI/BTL stepping on each other. The correct

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Pavel Shamis
More UCX packages: Fedora: http://rpms.famillecollet.com/rpmphp/zoom.php?rpm=ucx OpenSUSE: https://software.opensuse.org/package/openucx On Thu, Sep 20, 2018 at 7:53 AM Yossi Itigin wrote: > Currently the target is RH 8 > And yes, UCX is also available on EPEL, for example: >

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Yossi Itigin
Currently the target is RH 8 And yes, UCX is also available on EPEL, for example: https://centos.pkgs.org/7/epel-x86_64/ucx-1.3.1-1.el7.x86_64.rpm.html -Original Message- From: Peter Kjellström Sent: Thursday, September 20, 2018 2:11 PM To: Yossi Itigin Cc: Open MPI Developers

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Peter Kjellström
On Thu, 20 Sep 2018 14:18:35 +0200 Peter Kjellström wrote: > On Wed, 19 Sep 2018 16:24:53 + > "Gabriel, Edgar" wrote: ... > > So bottom line, if I do > > > > mpirun --mca btl ^openib --mca mtl ^ofi …. > > > > my tests finish correctly, although mpirun will still return an > > error. > > I

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Peter Kjellström
On Wed, 19 Sep 2018 16:24:53 + "Gabriel, Edgar" wrote: > I performed some tests on our Omnipath cluster, and I have a mixed > bag of results with 4.0.0rc1 I've also tried it on our OPA cluster (skylake+centos-7+inbox) with very similar results. > compute-1-1.local.4351PSM2 has not been

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Peter Kjellström
On Thu, 20 Sep 2018 09:03:44 + Yossi Itigin wrote: > Hi, > > UCX is on the way into RH distro and will be available and ON by > default (auto-detectable by OMPI build process) automatically in the > near future. That is good to hear, already in the upcoming 7.6? > Meanwhile, user can

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Peter Kjellström
On Tue, 18 Sep 2018 20:49:52 + "Jeff Squyres \(jsquyres\) via devel" wrote: > On Sep 18, 2018, at 3:46 PM, Thananon Patinyasakdikul > wrote: > > > > I tested on our cluster (UTK). I will give a thumb up but I have > > some comments. > > > > What I understand with 4.0. > > - openib btl is

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-20 Thread Yossi Itigin
Hi, UCX is on the way into the RH distro and will be available and ON by default (auto-detectable by the OMPI build process) in the near future. Meanwhile, users can enable UCX by two methods: 1. Download and install UCX from openucx.org and build Open MPI with it. 2. Download HPC-X from