Re: [OMPI devel] Reminder: assign as well as request review

2017-01-27 Thread r...@open-mpi.org
<https://github.com/blog/2306-filter-pull-request-reviews-and-review-requests> > This appears to include an "Awaiting review from you" filter. > Not quite a dashboard or notification, but at least a way to make the query. > -Paul

Re: [OMPI devel] Problem on master

2017-01-28 Thread r...@open-mpi.org
The problems on master have been resolved - thanks for your patience. > On Jan 27, 2017, at 8:57 AM, r...@open-mpi.org wrote: > > Hello all > > There is a known issue on master that we are attempting to debug. Sadly, it > is one that only shows on multi-node operations,

[OMPI devel] OMPI v1.10.6rc1 ready for test

2017-01-30 Thread r...@open-mpi.org
Usual place: https://www.open-mpi.org/software/ompi/v1.10/ Scheduled release: Fri Feb 3rd 1.10.6 -- - Fix bug in timer code that caused problems at optimization settings greater than 2 - OSHMEM: make mmap allocator the default instead of sysv

Re: [OMPI devel] define a new ENV variable in etc/openmpi-mca-params.conf

2017-02-24 Thread r...@open-mpi.org
I think Jeff got lost in the weeds here. If you define a new MCA param in the default param file, we will automatically pick it up and it will be in the environment of your application. You don’t need to do anything. However, you checked for the wrong envar. Anything you provide is going to have
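
A sketch of the behavior described above, with an illustrative parameter name and install prefix: a param set in the default file surfaces in the application's environment with the OMPI_MCA_ prefix.

$ echo "btl_tcp_if_include = eth0" >> <prefix>/etc/openmpi-mca-params.conf
$ mpirun -n 1 sh -c 'env | grep OMPI_MCA_btl_tcp_if_include'
OMPI_MCA_btl_tcp_if_include=eth0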

Re: [OMPI devel] Q: Using a hostfile in managed environment?

2017-02-24 Thread r...@open-mpi.org
> On Feb 24, 2017, at 11:57 AM, Thomas Naughton wrote: > > Hi, > > We're trying to track down some curious behavior and decided to take a step > back and check a base assumption. > > When running within a managed environment (job allocation): > >Q: Should you be able to use `--hostfile` o

Re: [OMPI devel] v2.1.0rc1 has been released

2017-02-26 Thread r...@open-mpi.org
> On Feb 26, 2017, at 6:47 AM, Jeff Squyres (jsquyres) > wrote: > > - Should be none. > ^^^ JMS Did we change --host or --hostfile behavior? > Not that I am aware of - we talked about it, but I think we concluded that we couldn’t/shouldn’t as that would constitute a change to “3.0” in the r

[OMPI devel] Launch scaling change

2017-03-05 Thread r...@open-mpi.org
Hello folks PR https://github.com/open-mpi/ompi/pull/2916 contains modifications that will significantly improve launch performance when launching via mpirun at scale. It contains two changes: 1. it pushes all mapping operations to the backend compu

Re: [OMPI devel] [2.1.0rc2] PMIX build failures

2017-03-06 Thread r...@open-mpi.org
Thanks Paul. Actually, all .c files are required to include pmix_config.h at the very beginning of the file, before anything else. Sounds like some files are missing it, so we’ll have to go back and fix those. > On Mar 6, 2017, at 2:35 PM, Paul Hargrove wrote: > > Ralph, > > I found a couple
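
A quick way to spot offenders, assuming PMIx sources live under src/ (illustrative layout): list the .c files whose first #include line, within the opening lines of the file, is not pmix_config.h.

$ for f in $(find src -name '*.c'); do head -20 "$f" | grep -m1 '#include' | grep -q pmix_config.h || echo "$f"; done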

Re: [OMPI devel] [2.1.0rc2] ring_c SEGV on OpenBSD/i386

2017-03-06 Thread r...@open-mpi.org
I’m not sure what could be going on here. I take it you were able to run this example for the 2.0 series under this environment, yes? This code hasn’t changed since that release, so I’m not sure why it would be failing to resolve symbols now. > On Mar 6, 2017, at 2:22 PM, Paul Hargrove wrote:

Re: [OMPI devel] I have problem in nancy site with Open MPI

2017-03-24 Thread r...@open-mpi.org
The problem is likely to be a firewall between the target node and the node where mpirun is executing - see the error message and suggested causes: > * not finding the required libraries and/or binaries on > one or more nodes. Please check your PATH and LD_LIBRARY_PATH > settings, or config
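
A common first check for the causes quoted above (node name illustrative) - run the lookup in a non-interactive remote shell, since that is the environment the launch daemons inherit:

$ ssh remote-node 'which orted; echo $LD_LIBRARY_PATH'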

Re: [OMPI devel] about PSM2 in openmpi

2017-03-28 Thread r...@open-mpi.org
How did you compile your app? Using mpicc? > On Mar 28, 2017, at 9:55 AM, Dahai Guo via devel > wrote: > > I installed intel PSM2 and then configured open mpi as follow. > > > ./configure \ > --prefix=$HOME/ompi_install \ > --with-psm2=$HOME/PSM2_install/usr \ > --with-psm2-libdir=$HOME
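
For comparison, the usual approach is to compile and launch with the wrappers from the same install tree named in the configure line above (source file name illustrative):

$ $HOME/ompi_install/bin/mpicc -o hello hello.c
$ $HOME/ompi_install/bin/mpirun -n 2 ./hello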

Re: [OMPI devel] Pull request: LANL-XXX tests failing

2017-03-30 Thread r...@open-mpi.org
You didn’t do anything wrong - the Jenkins test server at LANL is having a problem. > On Mar 30, 2017, at 8:22 AM, DERBEY, NADIA wrote: > > Hi, > > I just created a pull request and I have a failure in 4/8 checks, all related > to LANL: > > LANL-CI > LANL-OS-X > LANL-disable-dlopen > LANL-di

Re: [OMPI devel] bug in MPI_Comm_accept?

2017-04-04 Thread r...@open-mpi.org
There is a particular use-case that is not currently supported, but will be fixed as time permits. Jobs launched by the same mpirun can currently execute MPI_Comm_connect/accept. > On Apr 4, 2017, at 5:33 AM, Kawashima, Takahiro > wrote: > > I filed a PR against v1.10.7 though v1.10.7 may no

Re: [OMPI devel] external hwloc causing libevent problems?

2017-04-05 Thread r...@open-mpi.org
Not quite the problem I mentioned. The problem arises if you want external hwloc, but internal libevent - and both have external versions in (say) /usr. If you point hwloc there, then the -I and -L flags will cause us to pull in the /usr libevent versions instead of the internal ones - and havoc

Re: [OMPI devel] external hwloc causing libevent problems?

2017-04-05 Thread r...@open-mpi.org
t variation on the > theme I’m missing. > > Brian > >> On Apr 5, 2017, at 11:36 AM, r...@open-mpi.org wrote: >> >> Not quite the problem I mentioned. The problem arises if you want external >> hwloc, but internal libevent - and both have external versions in (say)

Re: [OMPI devel] anybody ported OMPI to hwloc 2.0 API?

2017-04-05 Thread r...@open-mpi.org
It hasn’t come into master - dunno if someone has it sitting on a branch somewhere (I don’t see a PR that indicates it) > On Apr 5, 2017, at 11:56 AM, Brice Goglin wrote: > > Hello > > Did anybody start porting OMPI to the new hwloc 2.0 API (currently in > hwloc git master)? > Gilles, I seem t

Re: [OMPI devel] Problem with bind-to

2017-04-05 Thread r...@open-mpi.org
I believe this has been fixed now - please let me know > On Mar 30, 2017, at 1:57 AM, Cyril Bordage wrote: > > Hello, > > I am using the git version of MPI with "-bind-to core -report-bindings" > and I get that for all processes: > [miriel010:160662] MCW rank 0 not bound > > > When I use an o
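
The command shape under discussion, for anyone trying to reproduce (executable name illustrative; Open MPI accepts both single- and double-dash forms of these options):

$ mpirun -n 4 --bind-to core --report-bindings ./a.out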

[OMPI devel] Topics for Tues telecon

2017-04-08 Thread r...@open-mpi.org
Hi folks There are a few things I’d like to cover on Tuesday’s call: * review of detailed launch timings - I’m seeing linear scaling vs ppn for the initialization code at the very beginning of MPI_Init. This consists of the following calls: ompi_hook_base_mpi_init_top ompi_mpi_t

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread r...@open-mpi.org
> Can you be a bit more specific? >>>> >>>> - What version of Open MPI are you using? >>>> - How did you configure Open MPI? >>>> - How are you launching Open MPI applications? >>>> >>>> >>>>> On Apr 13, 2017, at 9:0

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread r...@open-mpi.org
have also that. But not > when I run it from a login node (with the same machine file). > > > Cyril. > > Le 13/04/2017 à 16:22, r...@open-mpi.org a écrit : >> We are asking all these questions because we cannot replicate your problem - >> so we are trying to help you

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread r...@open-mpi.org
>> [miriel026:163996] MCW rank 45 not bound >> [miriel026:163979] MCW rank 28 not bound >> [miriel026:163990] MCW rank 39 not bound >> [miriel026:163976] MCW rank 25 not bound >> [miriel026:163997] MCW rank 46 not bound >> [miriel025:60971] MCW rank 2 not bound >

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread r...@open-mpi.org
not bound > [miriel025:62350] MCW rank 6 not bound > [miriel026:165146] MCW rank 30 not bound > [miriel025:62352] MCW rank 7 not bound > [miriel025:62354] MCW rank 8 not bound > [miriel026:165148] MCW rank 31 not bound > [miriel025:62356] MCW rank 9 not bound > [miriel025:62358

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread r...@open-mpi.org
performace is far better). > > Le 13/04/2017 à 17:24, r...@open-mpi.org a écrit : >> Okay, so as far as OMPI is concerned, it correctly bound everyone! So how >> are you generating this output claiming it isn’t bound? >> >>> On Apr 13, 2017, at 7:57 AM,

Re: [OMPI devel] Problem with bind-to

2017-04-14 Thread r...@open-mpi.org
ocket 1[core 11[hwt 0]]: > [./././././././.][./././B/./././.] > [n2:53187] MCW rank 24 bound to socket 0[core 4[hwt 0]]: > [././././B/././.][./././././././.] > [n2:53187] MCW rank 25 bound to socket 1[core 12[hwt 0]]: > [./././././././.][././././B/././.] > [n2:53187] MCW rank 26 bound to socket 0[

Re: [OMPI devel] Problem with bind-to

2017-04-14 Thread r...@open-mpi.org
Ah, wait - I had missed your bind-to core directive. With that, it does indeed behave poorly, so I can now replicate. > On Apr 14, 2017, at 2:21 AM, r...@open-mpi.org wrote: > > Sorry, but both of your non-working examples work fine for me: > > $ mpirun -n 16 -host rhc002:16 --

Re: [OMPI devel] Program which runs wih 1.8.3, fails with 2.0.2

2017-04-19 Thread r...@open-mpi.org
Fully expected - if ORTE can’t start one or more daemons, then the MPI job itself will never be executed. There was an SGE integration issue in the 2.0 series - I fixed it, but IIRC it didn’t quite make the 2.0.2 release. In fact, I just checked and it did indeed miss that release. You have th

Re: [OMPI devel] openib oob module

2017-04-20 Thread r...@open-mpi.org
Hi Shiqing! Been a long time - hope you are doing well. I see no way to bring the oob module back now that the BTLs are in the OPAL layer - this is why it was removed as the oob is in ORTE, and thus not accessible from OPAL. Ralph > On Apr 20, 2017, at 6:02 AM, Shiqing Fan wrote: > > Dear al

Re: [OMPI devel] openib oob module

2017-04-20 Thread r...@open-mpi.org
devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of > r...@open-mpi.org > Sent: Thursday, April 20, 2017 3:49 PM > To: OpenMPI Devel > Subject: Re: [OMPI devel] openib oob module > > Hi Shiqing! > > Been a long time - hope you are doing well. > > I see n

Re: [OMPI devel] openib oob module

2017-04-20 Thread r...@open-mpi.org
t test. Here is the updated output file. > > Thanks, > Shiqing > > From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of > r...@open-mpi.org > Sent: Thursday, April 20, 2017 4:29 PM > To: OpenMPI Devel > Subject: Re: [OMPI devel] openib oob module >

Re: [OMPI devel] openib oob module

2017-04-21 Thread r...@open-mpi.org
openib oob module > > Folks, > > > fwiw, i made https://github.com/open-mpi/ompi/pull/3393 and it works for me > on a mlx4 cluster (Mellanox QDR) > > > Cheers, > > > Gilles > > > On 4/21/2017 1:31 AM, r...@open-mpi.org wrote: >> I

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread r...@open-mpi.org
The orte routed framework does that for you - there is an API for that purpose. > On May 3, 2017, at 12:17 AM, Justin Cinkelj wrote: > > Important detail first: I get this message from significantly modified Open > MPI code, so problem exists solely due to my mistake. > > Orterun on 192.168.1

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread r...@open-mpi.org
system. Note that the output has nothing to do with spawning your mpi_hello - it is solely describing the startup of the daemons. > On May 3, 2017, at 6:26 AM, r...@open-mpi.org wrote: > > The orte routed framework does that for you - there is an API for that > purpose. > > >

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread r...@open-mpi.org
send command to orted to start mpi > application? > Which event names should I search for? > > Thank you, > Justin > > - Original Message - >> From: r...@open-mpi.org >> To: "OpenMPI Devel" >> Sent: Wednesday, May 3, 2017 3:29:16 PM >> S

Re: [OMPI devel] v3 branch - Problem with LSF

2017-05-05 Thread r...@open-mpi.org
I would suggest not bringing it over in isolation - we planned to do an update that contains a lot of related changes, including the PMIx update. Probably need to do that pretty soon given the June target. > On May 5, 2017, at 3:04 PM, Vallee, Geoffroy R. wrote: > > Hi, > > I am running some

Re: [OMPI devel] Open MPI 3.x branch naming

2017-05-05 Thread r...@open-mpi.org
+1 Go for it :-) > On May 5, 2017, at 2:34 PM, Barrett, Brian via devel > wrote: > > To be clear, we’d do the move all at once on Saturday morning. Things that > would change: > > 1) nightly tarballs would rename from openmpi-v3.x--.tar.gz > to openmpi-v3.0.x--.tar.gz > 2) nightly tarballs

Re: [OMPI devel] orte-clean not cleaning left over temporary I/O files in /tmp

2017-05-08 Thread r...@open-mpi.org
What version of OMPI are you using? > On May 8, 2017, at 8:56 AM, Christoph Niethammer wrote: > > Hello > > According to the manpage "...orte-clean attempts to clean up any processes > and files left over from Open MPI jobs that were run in the past as well as > any currently running jobs. Th

[OMPI devel] OMPI v1.10.7rc1 ready for evaluation

2017-05-12 Thread r...@open-mpi.org
Hi folks We want/need to release a final version of the 1.10 series that will contain all remaining cleanups. Please take a gander at it. https://www.open-mpi.org/software/ompi/v1.10/ Changes: 1.10.7 -- - Fix bug in TCP BTL that impacted per

Re: [OMPI devel] Quick help with OMPI_COMM_WORLD_LOCAL_RANK

2017-05-12 Thread r...@open-mpi.org
That’s a pretty ancient release, but a quick glance at the source code indicates that you should always see it when launched via mpirun, and never when launched via srun > On May 12, 2017, at 9:22 AM, Kumar, Amit wrote: > > Dear OpenMPI, > > Under what circumstances I would find that OMPI_C
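
A quick check based on that behavior: under mpirun each process should print its local rank, while under srun the variable should be unset (output ordering may vary).

$ mpirun -n 2 sh -c 'echo $OMPI_COMM_WORLD_LOCAL_RANK'
0
1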

Re: [OMPI devel] Quick help with OMPI_COMM_WORLD_LOCAL_RANK

2017-05-12 Thread r...@open-mpi.org
If you configure with --enable-debug, then you can set the following mca params on your cmd line: --mca plm_base_verbose 5 will show you the details of the launch --mca odls_base_verbose 5 will show you the details of the fork/exec > On May 12, 2017, at 10:30 AM, Kumar, Amit wrote: > > > >
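
Putting that advice together (application name illustrative):

$ ./configure --enable-debug ...
$ mpirun --mca plm_base_verbose 5 --mca odls_base_verbose 5 -n 2 ./app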

Re: [OMPI devel] Socket buffer sizes

2017-05-15 Thread r...@open-mpi.org
Thanks - already done, as you say > On May 15, 2017, at 7:32 AM, Håkon Bugge wrote: > > Dear Open MPIers, > > > Automatic tuning of socket buffers has been in the linux kernel since > 2.4.17/2.6.7. That is some time ago. I remember, at the time, that we removed > the default setsockopt() for

Re: [OMPI devel] Combining Binaries for Launch

2017-05-15 Thread r...@open-mpi.org
So long as both binaries use the same OMPI version, I can’t see why there would be an issue. It sounds like you are thinking of running an MPI process on the GPU itself (instead of using an offload library)? People have done that before - IIRC, the only issue is trying to launch a process onto t
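
For the offload-style case, the two binaries can be launched as a single MPMD job using mpirun's colon syntax (binary names and process counts illustrative):

$ mpirun -n 4 ./cpu_part : -n 2 ./gpu_part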

[OMPI devel] Updating the v1.10.7 tag

2017-05-19 Thread r...@open-mpi.org
Hi folks I apparently inadvertently tagged the wrong hash the other night when tagging v1.10.7. I have corrected it, but if you updated your clone _and_ checked out the v1.10.7 tag in the interim, you might need to manually delete the tag on your clone and re-pull. It’s trivial to do: $ git t
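
The full steps, as quoted later in the thread:

$ git tag -d v1.10.7
$ git pull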

Re: [OMPI devel] Updating the v1.10.7 tag

2017-05-19 Thread r...@open-mpi.org
is a good idea to do it. > On May 19, 2017, at 8:03 AM, Jeff Squyres (jsquyres) > wrote: > > On May 19, 2017, at 5:06 AM, r...@open-mpi.org wrote: >> >> $ git tag -d v1.10.7 >> $ git pull (or whatever y

Re: [OMPI devel] Updating the v1.10.7 tag

2017-05-19 Thread r...@open-mpi.org
, since that release series is now > effectively done. > > More specifically: hopefully everyone does the "git tag -d ..." instructions > and this becomes a moot point. > > > >> On May 19, 2017, at 11:25 AM, r...@open-mpi.org wrote: >> >> I would o

[OMPI devel] Stale PRs

2017-05-26 Thread r...@open-mpi.org
Hey folks We’re seeing a number of stale PRs hanging around again - these are PRs that were submitted against master (in some cases, months ago) that cleared CI and were never committed. Could people please take a look at their PRs and either commit them or delete them? We are trying to get 3.

[OMPI devel] Please turn off MTT on v1.10

2017-05-30 Thread r...@open-mpi.org
The v1.10 series is closed and no new commits will be made to that branch. So please turn off any MTT runs you have scheduled for that branch - this will allow people to commit tests that will not run on the v1.10 series. Thanks Ralph ___ devel mailin

Re: [OMPI devel] PMIX busted

2017-05-31 Thread r...@open-mpi.org
No - I just rebuilt it myself, and I don’t see any relevant MTT build failures. Did you rerun autogen? > On May 31, 2017, at 7:02 AM, George Bosilca wrote: > > I have problems compiling the current master. Anyone else has similar issues ? > > George. > > > CC base/ptl_base_frame.l

Re: [OMPI devel] PMIX busted

2017-05-31 Thread r...@open-mpi.org
Sorry for the hassle... > On May 31, 2017, at 7:31 AM, George Bosilca wrote: > > After removing all leftover files and redoing the autogen things went back to > normal. Sorry for the noise. > > George. > > > > On Wed, May 31, 2017 at 10:06 AM, r...@ope
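
A sketch of the cleanup George describes; note that git clean -dfx removes *all* untracked files, so use it with care:

$ git clean -dfx
$ ./autogen.pl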

Re: [OMPI devel] Open MPI 3.x branch naming

2017-05-31 Thread r...@open-mpi.org
> On May 31, 2017, at 7:48 AM, Jeff Squyres (jsquyres) > wrote: > > On May 30, 2017, at 11:37 PM, Barrett, Brian via devel > wrote: >> >> We have now created a v3.0.x branch based on today’s v3.x branch. I’ve >> reset all outstanding v3.x PRs to the v3.0.x branch. No one has permissions

Re: [OMPI devel] mapper issue with heterogeneous topologies

2017-05-31 Thread r...@open-mpi.org
I don’t believe we check topologies prior to making that decision - this is why we provide map-by options. Seems to me that this oddball setup has a simple solution - all he has to do is set a mapping policy for that environment. Can even be done in the default mca param file. I wouldn’t modify
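
A sketch of that suggestion - set a site-wide default in the default MCA parameter file (policy value illustrative):

# in <prefix>/etc/openmpi-mca-params.conf
rmaps_base_mapping_policy = socket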

Re: [OMPI devel] Time to remove Travis?

2017-06-01 Thread r...@open-mpi.org
I’d vote to remove it - it’s too unreliable anyway > On Jun 1, 2017, at 6:30 AM, Jeff Squyres (jsquyres) > wrote: > > Is it time to remove Travis? > > I believe that the Open MPI PRB now covers all the modern platforms that > Travis covers, and we have people actively maintaining all of the m

[OMPI devel] Master MTT results

2017-06-01 Thread r...@open-mpi.org
Hey folks I scanned the nightly MTT results from last night on master, and the RTE looks pretty solid. However, there are a LOT of onesided segfaults occurring, and I know that will eat up people’s disk space. Just wanted to ensure folks were aware of the problem Ralph

[OMPI devel] ompi_info "developer warning"

2017-06-02 Thread r...@open-mpi.org
I keep seeing this when I run ompi_info --all: *** DEVELOPER WARNING: A field in ompi_info output is too long and will appear poorly in the prettyprint output. *** Value: "MCA (disabled) pml monitoring"

Re: [OMPI devel] ompi_info "developer warning"

2017-06-04 Thread r...@open-mpi.org
behavior with > > OMPI_MCA_sharedfp=^lockedfile ompi_info --all > > > one option is to bump centerpoint (opal/runtime/opal_info_support.c) from 24 > to something larger, > an other option is to mark disabled components with a shorter string, for > example > "MCA

Re: [OMPI devel] ompi_info "developer warning"

2017-06-05 Thread r...@open-mpi.org
I added the change to https://github.com/open-mpi/ompi/pull/3651 <https://github.com/open-mpi/ompi/pull/3651>. We’ll just have to hope that people intuitively understand that “-“ means “disabled”. > On Jun 5, 2017, at 7:01 AM, r...@open-mpi.org wrote: > > Fine with me - I don’t

Re: [OMPI devel] ompi_info "developer warning"

2017-06-05 Thread r...@open-mpi.org
s, > > > Gilles > > - Original Message ----- > > So we are finally getting rid of the 80 chars per line limit? > > George. > > > > On Sun, Jun 4, 2017 at 11:23 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> > mailto:r...@open-mpi.org>>

[OMPI devel] SLURM 17.02 support

2017-06-13 Thread r...@open-mpi.org
Hey folks Brian brought this up today on the call, so I spent a little time investigating. After installing SLURM 17.02 (with just --prefix as config args), I configured OMPI with just --prefix config args. Getting an allocation and then executing “srun ./hello” failed, as expected. However, c

[OMPI devel] Coverity strangeness

2017-06-15 Thread r...@open-mpi.org
I’m trying to understand some recent coverity warnings, and I confess I’m a little stumped - so I figured I’d ask out there and see if anyone has a suggestion. This is in the PMIx repo, but it is reported as well in OMPI (down in opal/mca/pmix/pmix2x/pmix). The warnings all take the following fo

Re: [OMPI devel] Coverity strangeness

2017-06-16 Thread r...@open-mpi.org
, and hence the >> false positive >> >> >> if you have contacts at coverity, it would be interesting to report this >> false positive >> >> >> >> Cheers, >> >> >> Gilles >> >> >> On 6/16/2017 1

Re: [OMPI devel] SLURM 17.02 support

2017-06-19 Thread r...@open-mpi.org
> I think a helpful error message would suffice. >> >> Howard >> >> r...@open-mpi.org schrieb am Di. 13. Juni 2017 um 11:15: >> Hey folks >> >> Brian brought this up today on the call, so I spent a little time >> investigating. After installing SL

Re: [OMPI devel] orte-clean not cleaning left over temporary I/O files in /tmp

2017-06-20 Thread r...@open-mpi.org
I updated orte-clean in master, and for v3.0, so it cleans up both current and legacy session directory files as well as any pmix artifacts. I don’t see any files named OMPI_*.sm, though that might be something from v2.x? I don’t recall us ever making files of that name before - anything we

[OMPI devel] Abstraction violation!

2017-06-22 Thread r...@open-mpi.org
I don’t understand what someone was thinking, but you CANNOT #include “mpi.h” in opal/util/info.c. It has broken pretty much every downstream project. Please fix this! Ralph ___ devel mailing list devel@lists.open-mpi.org https://rfd.newmexicoconsortiu
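
A quick guard against this class of regression, assuming a checkout of the ompi source tree:

$ grep -rn '#include "mpi.h"' opal/ && echo "abstraction violation in OPAL"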

Re: [OMPI devel] Abstraction violation!

2017-06-22 Thread r...@open-mpi.org
when > opal/util/info.c hasn’t included mpi.h. That seems odd, but so does info > being in opal. > > Brian > >> On Jun 22, 2017, at 3:46 PM, r...@open-mpi.org wrote: >> >> I don’t understand what someone was thinking, but you CANNOT #include >> “mpi.h” i

Re: [OMPI devel] Abstraction violation!

2017-06-22 Thread r...@open-mpi.org
available machines have an mpi.h somewhere in the default path because we always install _something_. I wonder if our master would fail in a distro that didn’t have an MPI installed... > On Jun 22, 2017, at 5:02 PM, r...@open-mpi.org wrote: > > It apparently did come in that way. We just n

Re: [OMPI devel] orterun busted

2017-06-23 Thread r...@open-mpi.org
Odd - I guess my machine is just consistently lucky, as was the CI’s when this went thru. The problem field is actually stale - we haven’t used it in years - so I simply removed it from orte_process_info. https://github.com/open-mpi/ompi/pull/3741 S

[OMPI devel] PMIx Working Groups: Call for participants

2017-06-26 Thread r...@open-mpi.org
Hello all There are two new PMIx working groups starting up to work on new APIs and attributes to support application/tool interactions with the system management stack in the following areas: 1. tiered storage support - prepositioning of files/binaries/libraries, directed hot/warm/cold storag

Re: [OMPI devel] SLURM 17.02 support

2017-06-27 Thread r...@open-mpi.org
> > Don’t think it really matters, since v2.x probably wasn’t what the customer > wanted. > > Brian > >> On Jun 19, 2017, at 7:18 AM, Howard Pritchard > <mailto:hpprit...@gmail.com>> wrote: >> >> Hi Ralph >> >> I think the alternativ

Re: [OMPI devel] Open MPI 3.0.0 first release candidate posted

2017-06-29 Thread r...@open-mpi.org
I tracked down a possible source of the oob/tcp error - this should address it, I think: https://github.com/open-mpi/ompi/pull/3794 > On Jun 29, 2017, at 3:14 PM, Howard Pritchard wrote: > > Hi Brian, > > I tested this rc using both srun native lau

[OMPI devel] Issue/PR tagging

2017-07-19 Thread r...@open-mpi.org
Hey folks I know we made some decisions last week about how to tag issues and PRs to make things easier to track for release branches, but the wiki notes don’t cover what we actually decided to do. Can someone briefly summarize? I honestly have forgotten if we tag issues, or tag PRs Ralph ___

Re: [OMPI devel] Issue/PR tagging

2017-07-19 Thread r...@open-mpi.org
ue is fixed in master, but not merged into branches, don’t close the > issue > > I think that’s about it. There’s some workflows we want to build to automate > enforcing many of these things, but for now, it’s just hints to help the RMs > not lose track of issues. > > Brian

Re: [OMPI devel] LD_LIBRARY_PATH and environment variables not getting set in remote hosts

2017-07-20 Thread r...@open-mpi.org
You must be kidding - 1.2.8??? We wouldn’t even know where to begin to advise you on something that old - I’m actually rather surprised it even compiled on a new Linux. > On Jul 20, 2017, at 4:22 AM, saisilpa b via devel > wrote: > > HI Gilles, > > Thanks for your immediate response. > > I

Re: [OMPI devel] hwloc 2 thing

2017-07-20 Thread r...@open-mpi.org
Yes - I have a PR about cleared that will remove the hwloc2 install. It needs to be redone > On Jul 20, 2017, at 8:18 PM, Howard Pritchard wrote: > > Hi Folks, > > I'm noticing that if I pull a recent version of master with hwloc 2 support > into my local repo, that my autogen.pl run fails u

Re: [OMPI devel] hwloc 2 thing

2017-07-22 Thread r...@open-mpi.org
braries while running. > > thanks, > silpa > > > > > On Friday, 21 July 2017 8:52 AM, "r...@open-mpi.org" > wrote: > > > Yes - I have a PR about cleared that will remove the hwloc2 install. It needs > to be redone > >> On Jul 20, 2017

Re: [OMPI devel] PMIX visibility

2017-07-25 Thread r...@open-mpi.org
Ouch - sorry about that. pmix_setenv is actually defined down in the code base, so let me investigate why it got into pmix_common. > On Jul 24, 2017, at 10:26 PM, George Bosilca wrote: > > The last PMIX import broke the master on all platforms that support > visibility. I have pushed a patch t

Re: [OMPI devel] PMIX visibility

2017-07-25 Thread r...@open-mpi.org
George - I believe this PR fixes the problems. At least, it now runs on OSX for me: https://github.com/open-mpi/ompi/pull/3957 <https://github.com/open-mpi/ompi/pull/3957> > On Jul 25, 2017, at 5:27 AM, r...@open-mpi.org wrote: > > Ouch - sorry about that. pmix_setenv is a

Re: [OMPI devel] Verbosity for "make check"

2017-08-08 Thread r...@open-mpi.org
Okay, I’ll update that PR accordingly > On Aug 8, 2017, at 10:51 AM, Jeff Squyres (jsquyres) > wrote: > > Per our discussion on the webex today about getting verbosity out of running > "make check" (e.g., to see what the heck is going on in > https://github.com/open-mpi/ompi/pull/4028). > >

[OMPI devel] Stale PRs

2017-08-30 Thread r...@open-mpi.org
Hey folks This is getting ridiculous - we have PRs sitting on GitHub that are more than a year old! If they haven’t been committed in all that time, they can’t possibly be worth anything now. Would people _please_ start paying attention to their PRs? Either close them, or update/commit them.

Re: [OMPI devel] [2.1.2rc3] libevent SEGV on FreeBSD/amd64

2017-08-30 Thread r...@open-mpi.org
Yeah, that caught my eye too as that is impossibly large. We only have a handful of active queues - looks to me like there is some kind of alignment issue. Paul - has this configuration worked with prior versions of OMPI? Or is this something new? Ralph > On Aug 30, 2017, at 4:17 PM, Larry Ba

Re: [OMPI devel] Stale PRs

2017-08-31 Thread r...@open-mpi.org
Thanks to those who made a first pass at these old PRs. The oldest one is now dated Dec 2015 - nearly a two-year old change for large messages over the TCP BTL, waiting for someone to commit. > On Aug 30, 2017, at 7:34 AM, r...@open-mpi.org wrote: > > Hey folks > > This is get

Re: [OMPI devel] Stale PRs

2017-08-31 Thread r...@open-mpi.org
> we have today, unfortunately not perfect as it would require additions to the > configure. Waiting for reviews. > > George. > > > On Thu, Aug 31, 2017 at 10:12 AM, r...@open-mpi.org > <mailto:r...@open-mpi.org> mailto:r...@open-mpi.org>> > wrote: > Th

Re: [OMPI devel] configure --with paths don't allow for absolute path specification

2017-09-02 Thread r...@open-mpi.org
I’m honestly confused by this as I don’t understand what you are trying to accomplish. Neither OMPI nor PMIx uses those headers. PMIx provides them just as a convenience for anyone wanting to compile a PMI based code, and so that we could internally write functions that translate from PMI to the

Re: [OMPI devel] configure --with paths don't allow for absolute path specification

2017-09-02 Thread r...@open-mpi.org
g --with-pmi=/usr/include/slurm if pmi.h and pmi2.h are > installed *only* in /usr/include/slurm. > > > On Saturday, September 2, 2017 9:55 AM, "r...@open-mpi.org" > wrote: > > > I’m honestly confused by this as I don’t understand what you are trying to

Re: [OMPI devel] Open MPI 3.1 Feature List

2017-09-05 Thread r...@open-mpi.org
We currently have PMIx v2.1.0beta in OMPI master. This includes cross-version support - i.e., OMPI v3.1 would be able to run against an RM using any PMIx version. At the moment, the shared memory (or dstore) support isn’t working across versions, but I’d consider that a “bug” that will hopefully

Re: [OMPI devel] Stale PRs

2017-09-06 Thread r...@open-mpi.org
then and probably is no longer even relevant (and has lots of conflicts as a result) Ralph > On Aug 31, 2017, at 11:15 AM, r...@open-mpi.org wrote: > > Thanks George - wasn’t picking on you, just citing the oldest one on the > list. Once that goes in, I’ll be poking the next :-)

[OMPI devel] ORTE DVM update

2017-09-18 Thread r...@open-mpi.org
Hi all The DVM on master is working again. You will need to use the new “prun” tool instead of “orterun” to submit your jobs - note that “prun” automatically finds the DVM, and so there is no longer any need to have orte-dvm report its URI, nor does prun take the “-hnp” argument. The “orte-ps”
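
The new workflow, sketched from the description above (application name illustrative):

$ orte-dvm &
$ prun -n 4 ./app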

Re: [OMPI devel] Map by socket broken in 3.0.0?

2017-10-03 Thread r...@open-mpi.org
Found the bug - see https://github.com/open-mpi/ompi/pull/4291 Will PR for the next 3.0.x release > On Oct 2, 2017, at 9:55 PM, Ben Menadue wrote: > > Hi, > > I’m having trouble using map-by socket on remote nodes. > > Running on the same node as mp

[OMPI devel] Jenkins nowhere land again

2017-10-03 Thread r...@open-mpi.org
We are caught between two infrastructure failures: Mellanox can’t pull down a complete PR OMPI is hanging on the OS-X server Can someone put us out of our misery? Ralph ___ devel mailing list devel@lists.open-mpi.org https://lists.open-mpi.org/mailman

Re: [OMPI devel] Jenkins nowhere land again

2017-10-03 Thread r...@open-mpi.org
ilder are to either wait until Nathan or I > get home and get our servers running again or to not test OS X (which has its > own problems). I don’t have a strong preference here, but I also don’t want > to make the decision unilaterally. > > Brian > > >> On Oct 3, 2017,

Re: [OMPI devel] HWLOC / rmaps ppr build failure

2017-10-04 Thread r...@open-mpi.org
Hmmm...I suspect this is a hwloc v2 vs v1 issue. I’ll fix it > On Oct 4, 2017, at 10:54 AM, Barrett, Brian via devel > wrote: > > It looks like a change in either HWLOC or the rmaps ppr component is causing > Cisco build failures on master for the last couple of days: > > https://mtt.open-mp

Re: [OMPI devel] Cuda build break

2017-10-04 Thread r...@open-mpi.org
I’ll fix > On Oct 4, 2017, at 10:57 AM, Sylvain Jeaugey wrote: > > See my last comment on #4257 : > > https://github.com/open-mpi/ompi/pull/4257#issuecomment-332900393 > > We should completely disable CUDA in hwloc. It is breaking the build, but > more importantly, it creates an extra depende

Re: [OMPI devel] HWLOC / rmaps ppr build failure

2017-10-04 Thread r...@open-mpi.org
Thanks! Fix is here: https://github.com/open-mpi/ompi/pull/4301 > On Oct 4, 2017, at 11:10 AM, Brice Goglin wrote: > > Looks like you're using a hwloc < 1.11. If you want to support this old > API while using the 1.11 names, you can add this to OMPI

Re: [OMPI devel] Cuda build break

2017-10-04 Thread r...@open-mpi.org
Fix is here: https://github.com/open-mpi/ompi/pull/4301 <https://github.com/open-mpi/ompi/pull/4301> > On Oct 4, 2017, at 11:19 AM, Jeff Squyres (jsquyres) > wrote: > > Thanks Ralph. > >> On Oct 4, 2017, at 2:07 PM, r...@open-mpi.org wrote: >> >> I’ll

Re: [OMPI devel] Enable issue tracker for ompi-www repo?

2017-11-04 Thread r...@open-mpi.org
Hi Chris It was just an oversight - I have turned on the issue tracker, so feel free to post, or a PR is also welcome Ralph > On Nov 4, 2017, at 5:03 AM, Gilles Gouaillardet > wrote: > > Chris, > > feel free to issue a PR, or fully describe the issue so a developer > can update the FAQ acc

Re: [OMPI devel] hwloc 2 thing

2017-12-13 Thread r...@open-mpi.org
the work between the hosts.. > > > Thanks for your help. > > Best regards, > Silpa > > Sent from Yahoo Mail on Android > <https://overview.mail.yahoo.com/mobile/?.src=Android> > On Sat, Jul 22, 2017 at 6:28 PM, r...@open-mpi.org > wrote: > You’ll

Re: [OMPI devel] hwloc2 and cuda and non-default cudatoolkit install location

2017-12-20 Thread r...@open-mpi.org
FWIW: what we do in PMIx (where we also have some overlapping options) is to add in OMPI a new --enable-pmix-foo option and then have the configury in the corresponding OMPI component convert it to use inside of the embedded PMIx itself. It isn’t a big deal - just have to do a little code to sav

Re: [OMPI devel] cannot push directly to master anymore

2018-01-31 Thread r...@open-mpi.org
> On Jan 31, 2018, at 7:36 AM, Jeff Squyres (jsquyres) > wrote: > > On Jan 31, 2018, at 10:14 AM, Gilles Gouaillardet > wrote: >> >> I tried to push some trivial commits directly to the master branch and >> was surprised that is no more allowed. >> >> The error message is not crystal clear

Re: [OMPI devel] cannot push directly to master anymore

2018-01-31 Thread r...@open-mpi.org
> On Jan 31, 2018, at 8:41 AM, Jeff Squyres (jsquyres) > wrote: > > On Jan 31, 2018, at 11:33 AM, r...@open-mpi.org wrote: >> >> If CI takes 30 min, then not a problem - when CI takes 6 hours (as it >> sometimes does), then that’s a different story.

Re: [OMPI devel] hwloc issues in this week telcon?

2018-01-31 Thread r...@open-mpi.org
hwloc2 is for OMPI 4.0, not 3.1. > On Jan 31, 2018, at 3:28 PM, Brice Goglin wrote: > > Hello > > Two hwloc issues are listed in this week telcon: > > "hwloc2 WIP, may need help with." > https://github.com/open-mpi/ompi/pull/4677 > * Is this really a 3.0.1 thing? I thought hwloc2 was only for

[OMPI devel] Fabric manager interactions: request for comments

2018-02-05 Thread r...@open-mpi.org
Hello all The PMIx community is starting work on the next phase of defining support for network interactions, looking specifically at things we might want to obtain and/or control via the fabric manager. A very preliminary draft is shown here: https://pmix.org/home/pmix-standard/fabric-manager-

Re: [OMPI devel] Running on Kubernetes

2018-03-16 Thread r...@open-mpi.org
I haven’t really spent any time with Kubernetes, but it seems to me you could just write a Kubernetes plm (and maybe an odls) component and bypass the ssh stuff completely given that you say there is a launcher API. > On Mar 16, 2018, at 11:02 AM, Jeff Squyres (jsquyres) > wrote: > > On Mar 1
