Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-40-g93eba3a

2014-10-08 Thread Howard Pritchard
> > - Log - > > > https://github.com/open-mpi/ompi/commit/93eba3ac70606db12465319804f2733f13bc9ca4 > > > > commit 93eba3ac70606db12465319804f2733f13bc9ca4 > > Merge: fd6a044 bd2974f > > Author: Howard Pritchard <hpprit..

Re: [OMPI devel] Pull requests to release branch

2014-10-09 Thread Howard Pritchard
Hi Ralph, Just so it's clear to everyone, what is the definition of "mark" in this context? Howard 2014-10-09 16:28 GMT-06:00 Ralph Castain : > Hi folks > > I would appreciate it if people marked their pull requests for the 1.8 > series with the commit hash from the devel

[OMPI devel] fixing a bug in 1.8 that's not in master

2014-10-27 Thread Howard Pritchard
Hi Folks, A cut and paste error seems to have happened with plm_alps_modules.c in 1.8 which causes a compile failure when building for cray. So right now, there's no building ompi 1.8 for crays. The problem is not present in master. For these kinds of problems, are we supposed to bypass all the

[OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Folks, I'm trying to figure out what broke for pmi configure since now the pmix/cray component doesn't compile any longer in master. I happened to look in the s1 and s2 configure.m4's and noticed an AC_REQUIRE for OPAL_CHECK_UGNI. This doesn't make sense to me. Maybe these were

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
Hi Ralph, 2014-10-28 12:26 GMT-06:00 Ralph Castain <r...@open-mpi.org>: > > > On Oct 28, 2014, at 11:16 AM, Howard Pritchard <hpprit...@gmail.com> > wrote: > > > > Hi Folks, > > > > I'm trying to figure out what broke for pmi configure since now t

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
ri > > > -Paul > > On Tue, Oct 28, 2014 at 12:02 PM, Ralph Castain <r...@open-mpi.org> wrote: > >> >> On Oct 28, 2014, at 11:59 AM, Paul Hargrove <phhargr...@lbl.gov> wrote: >> >> >> On Tue, Oct 28, 2014 at 11:53 AM, Howard Pritchard <hppri

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
HI Ralph, I think I found the problem. Thanks. Howard 2014-10-28 12:58 GMT-06:00 Ralph Castain <r...@open-mpi.org>: > > On Oct 28, 2014, at 11:53 AM, Howard Pritchard <hpprit...@gmail.com> > wrote: > > Hi Ralph, > > > 2014-10-28 12:26 GMT-06

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
<r...@open-mpi.org>: > > On Oct 28, 2014, at 1:05 PM, Howard Pritchard <hpprit...@gmail.com> wrote: > > Hi Folks, > > The simplest and best way on cray is to use the pkg-config command. > No looking for odd header file names, etc. There is a minor issue > with externa

Re: [OMPI devel] configure.m4 for pmix/s1 and pmix/s2 question

2014-10-28 Thread Howard Pritchard
path. > Perhaps you should add the directory containing `cray-alpslli.pc' > to the PKG_CONFIG_PATH environment variable > Package 'cray-alpslli', required by 'cray-pmi', not found > > -Paul > > > On Tue, Oct 28, 2014 at 1:05 PM, Howard Pritchard <hpprit...@gmail.com> >

Re: [OMPI devel] ROMIO+Lustre problems in OpenMPI 1.8.3

2014-10-29 Thread Howard Pritchard
Hi Paul, Thanks for the forward. I've opened an issue #255 to track the ROMIO config regression. Just to make sure, older releases of the 1.8 branch still configure and build properly with your current lustre setup? Thanks, Howard 2014-10-28

[OMPI devel] OpenMPI Developers Face to Face Q1 2015 poll

2014-11-04 Thread Howard Pritchard
Hi OMPI folks, We're planning to hold another developers face to face in Q1 2015. Currently, we're thinking of holding the face to face either the last week of January, or one of the first two weeks of February. The format will be similar to the previous f2f in Chicago - start on Monday afternoon

[OMPI devel] Open MPI Developers Face to Face Q1 2015 (updated doodle poll link)

2014-11-04 Thread Howard Pritchard
Hi Folks, Per request to have a yes/yesifneedbe/no poll, and limitation of doodle to change options, a new doodle poll for deciding on the date for the next developers f2f is at: https://doodle.com/zzaupgxge9y6medu There is also a wiki page for the meeting:

[OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Howard Pritchard
Hi Folks, We've gotten a number of responses to the doodle poll for a week to hold the next OMPI developers F2F. The responses are definitely favoring a meeting the week of January 26th. The poll will be kept open till COB (PST) Friday, the 7th of November.

Re: [OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Howard Pritchard
attendees > >- it's less likely to have weather-related travel problems in Jan/Feb > > > > However, it was brought to my attention today that San Francisco (or > Atlanta or New York?) may be more attractive to European attendees. I.e., > there's few direct flights from Europe

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-06 Thread Howard Pritchard
HI Nathan, How would you get things right with atomics and multirail? Getting the memory consistency right would be really difficult. You'd have to keep issuing zero length rdma reads and hoping that that would have the effect of a pci-e flush in the case of multiple updates to a given target

Re: [OMPI devel] Pull requests on the trunk

2014-11-06 Thread Howard Pritchard
Hi Ralph, We should discuss this on Tuesday. I thought we'd decided for master to use a model where developers would directly push to ompi/master. I'd be willing to pull the requests from Gilles marked as bugs tomorrow. Howard 2014-11-06 13:16 GMT-07:00 Ralph Castain : >

Re: [OMPI devel] OMPI devel] Pull requests on the trunk

2014-11-07 Thread Howard Pritchard
Hi Gilles, I'm fine with the pull request method too. We hadn't been considering this avenue for master updates in the transition to github. I think as long as we have a set way for associating the pull of a given request into master, so they don't end up in a kind of purgatory, we'll be in

Re: [OMPI devel] 1.8.3 and PSM errors

2014-11-11 Thread Howard Pritchard
Hi Folks, I remember in the psm provider for libfabric, that there is a check in the av_insert method for endpoints that had previously been inserted into the av. In the libfabric psm provider, a mask array is created and fed in to the psm_ep_connect call to handle ep's that were already

Re: [OMPI devel] 1.8.3 and PSM errors

2014-11-12 Thread Howard Pritchard
HI Folks, I think this is a bug in the PSM MTL add_procs. The call to psm_ep_connect needs to be taking previously connected ep's into account, much like what is done in the libfabric psm provider code. Howard 2014-11-12 3:12 GMT-07:00 Rainer Keller : > Dear
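The fix described above, having add_procs skip endpoints that an earlier call already connected, can be sketched in C as follows. This is a minimal illustration and not the actual MTL or libfabric provider code: `peer_entry_t`, `build_connect_mask`, and `mark_connected` are hypothetical names, and the sketch assumes the connect call accepts a per-endpoint mask in which a nonzero entry means "connect this one".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping entry, one per peer endpoint id.
 * This is not a PSM or Open MPI type. */
typedef struct {
    unsigned long epid;  /* opaque endpoint id */
    bool connected;      /* set once a connect call has succeeded */
} peer_entry_t;

/* Fill mask[i] with 1 for peers that still need connecting and 0 for
 * peers connected by an earlier call. Returns the number still pending. */
size_t build_connect_mask(const peer_entry_t *peers, size_t n, int *mask)
{
    assert(peers != NULL && mask != NULL);
    size_t pending = 0;
    for (size_t i = 0; i < n; i++) {
        mask[i] = peers[i].connected ? 0 : 1;
        pending += (size_t)mask[i];
    }
    return pending;
}

/* After a successful connect, record the newly connected peers so a
 * later add_procs-style call leaves them alone. */
void mark_connected(peer_entry_t *peers, size_t n, const int *mask)
{
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            peers[i].connected = true;
        }
    }
}
```

The mask would be passed alongside the full endpoint array, so array indices stay stable across repeated calls and only the pending entries are acted on.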

Re: [OMPI devel] 1.8.3 and PSM errors

2014-11-13 Thread Howard Pritchard
Hi Adrian, Please do put your PSM results in the database. Would be very much appreciated. Howard 2014-11-13 7:46 GMT-07:00 Adrian Reber : > I applied the fix committed on master and described in > > https://github.com/open-mpi/ompi/issues/268 > > on 1.8.3 and 1.8.4rc1 and this

[OMPI devel] RTLD_GLOBAL question

2014-12-01 Thread Howard Pritchard
Hi ompi developers, If you always configure ompi with --disable-dlopen you can delete this message now. There has been some discussion of end case situations with use of dlopen in the ompi mca framework that can lead to unresolved symbols when subsequent shared libraries are dlopen'd that might

Re: [OMPI devel] RTLD_GLOBAL question

2014-12-03 Thread Howard Pritchard
Hello Artem, No, but I was also told by schedmd that the slurm we have on our systems is ancient. So I'm no longer considering this problem very important. We have a workaround of always configuring with --disable-dlopen. Thanks, Howard 2014-12-02 20:59 GMT-07:00 Artem Polyakov

Re: [OMPI devel] jenkins runtime failures

2014-12-03 Thread Howard Pritchard
some kind > of race -- it only happens in some runs, not all (but still pretty > frequent). > >> > >> Ralph is looking into it. > > > > Well, I would be looking into it if I could reproduce it, but I can’t > seem to do so. Talking to Nathan about it now > >

Re: [OMPI devel] (no subject)

2014-12-08 Thread Howard Pritchard
Hello Kevin, Could you try testing with Open MPI 1.8.3? There was a bug in 1.8.1 that you are likely hitting in your testing. Thanks, Howard 2014-12-07 17:18 GMT-07:00 Kevin Buckley < kevin.buckley.ecs.vuw.ac...@gmail.com>: > Apologies for the lack of a subject line: cut and pasted the body

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-477-g09d03a1

2014-12-09 Thread Howard Pritchard
Well the build is broken again for cray. I'd like to have this stop. 2014-12-09 7:23 GMT-08:00 Ralph Castain : > No problem - just wanted to make sure you were aware of it. > > > > On Dec 9, 2014, at 7:21 AM, Jeff Squyres (jsquyres) > wrote: > > > > Yes,

Re: [OMPI devel] Update to usnic BTL / libfabric

2014-12-09 Thread Howard Pritchard
Hi Ralph, Jeff fixed this in c40fd09. That's the problem I hit, in addition to later not having psm_infinipath. After that commit, and commit cd0a54d, you should be able to config and make again. Howard 2014-12-09 13:45 GMT-08:00 Ralph Castain : > Just as an FYI: we

[OMPI devel] opal_lifo/opal_fifo fail with make distcheck

2014-12-09 Thread Howard Pritchard
Hi Folks, I've tried running make distcheck on master and get failures for opal_fifo/opal_lifo: make[4]: Leaving directory `/global/u2/h/hpp/ompi/openmpi-gitclone/_build/test/class' make check-TESTS make[4]: Entering directory `/global/u2/h/hpp/ompi/openmpi-gitclone/_build/test/class'

[OMPI devel] still supporting pgi?

2014-12-11 Thread Howard Pritchard
Hi Folks, I'm trying to use mtt on a cluster where it looks like the only functional compiler that 1) can build open mpi master 2) can also build the ibm test suite may be pgi. Can't compile right now, so I'm trying to fix it. But I'm now wondering whether we are still supporting building

Re: [OMPI devel] still supporting pgi?

2014-12-11 Thread Howard Pritchard
Okay, I'll try to fix things. There's a problem in opal_datatype_internal.h, then a meltdown with libfabric owing to the fact that it's probably only been used in a gnu env. I'll open an issue on that one and assign it to Jeff. I think we should be turning this libfabric build off unless one asks for it.

Re: [OMPI devel] OpenIB has some borked code

2014-12-12 Thread Howard Pritchard
Nathan, Please make sure the fix for this problem is contained in its own commit. Howard 2014-12-12 9:38 GMT-07:00 Nathan Hjelm : > > > Yeah, that code is completely wrong. I have a fix in my btl > modifications branch. > > >

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-509-g38d6627

2014-12-15 Thread Howard Pritchard
I'd prefer Paul's suggestion to disable xpmem for sgi/uv for 1.8.X Is anyone actually supporting this? Howard 2014-12-15 8:56 GMT-07:00 Nathan Hjelm : > > > Not yet. I am still trying to pinpoint the problem. From what I can tell > the SGI version of XPMEM should be nearly

[OMPI devel] ofi/mtl causing problems

2014-12-17 Thread Howard Pritchard
I noticed my MTT smoke test failed with today's master build: name=PMI_process_mapping, (val=(vector,(0,4,4))) ./c_hello./c_hello: : symbol lookup errorsymbol lookup error: :

Re: [OMPI devel] ofi/mtl causing problems

2014-12-17 Thread Howard Pritchard
10:09 GMT-07:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>: > > Is this on a PSM-enabled cluster? > > Can you send the full output from configure, the config.log, and the > output from "make"? > > Are you building statically (i.e., libmpi.a)? > > > >

Re: [OMPI devel] ofi/mtl causing problems

2014-12-17 Thread Howard Pritchard
u building statically (i.e., libmpi.a)? > > > > On Dec 17, 2014, at 12:04 PM, Howard Pritchard <hpprit...@gmail.com> > wrote: > > > I noticed my MTT smoke test failed with todays master build: > > > > name=PMI_process_mapping, (val=(vector,(0,4,4))) >

Re: [OMPI devel] ofi/mtl causing problems

2014-12-17 Thread Howard Pritchard
until folks can quiet the > noise. If memory serves me, that's the position the community took with > OSHMEM. > > Josh > > On Wed, Dec 17, 2014 at 1:40 PM, Howard Pritchard <hpprit...@gmail.com> > wrote: >> >> Jeff, >> >> I think the problem is t

Re: [OMPI devel] ofi/mtl causing problems

2014-12-17 Thread Howard Pritchard
com>: > > On Dec 17, 2014, at 2:19 PM, Howard Pritchard <hpprit...@gmail.com> wrote: > > > I did another mtt run with --disable-libfabric included on the configure > line and still failed with the same problem, mtl/ofi thinks its okay to > build... > > FWIW: this pro

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-564-g6c468b8

2014-12-17 Thread Howard Pritchard
Hi Jeff, Why did you delete the libmca_common_alps_so_version? That's going to break my stuff. 2014-12-17 14:36 GMT-07:00 : > > This is an automated email from the git hooks/post-receive script. It was > generated because a ref change was pushed to the repository

[OMPI devel] ABI compatibility proposal for 1.9/2.0 release stream

2014-12-18 Thread Howard Pritchard
Hi Folks, Jeff and I have been considering changing the ABI compatibility story for Open MPI for the 1.9/2.0 release stream. Basically no promises for the odd/feature release series, but keep the current ABI promise for the even release series. I've updated the 1.9/2.0 release page

[OMPI devel] simple ./configure doesn't work on master/HEAD

2014-12-18 Thread Howard Pritchard
Hi Folks, I just tried to run ./configure on an updated master and it's failing with hwloc header file issues: checking components to build statically... noos xml synthetic custom xml_nolibxml linux linuxpci x86 checking components to build as plugins... checking whether hwloc configure

[OMPI devel] commit be6d4649

2014-12-18 Thread Howard Pritchard
Hi Folks, commit be6d4649 broke simple ./configure of master. I'd like to revert this commit unless someone can figure out a better solution to Gilles' --without-hwloc issue soon. Howard

Re: [OMPI devel] BUG in ADIOI_NFS_WriteStrided

2014-12-19 Thread Howard Pritchard
HI Eric, Does your app also work with MPICH? The romio in Open MPI is getting a bit old, so it would be useful to know if you see the same valgrind error using a recent MPICH. Howard 2014-12-19 9:50 GMT-07:00 Eric Chamberland : > > Hi, > > I encountered a

Re: [OMPI devel] [Open MPI Announce] Open MPI 1.8.4 released

2014-12-22 Thread Howard Pritchard
I opened issue 322 about this and put it on the 1.8.5 milestone. I'll submit a PR to remove the sn/xpmem.h usage in the vader config file. I think to do justice to SGI UV, someone would have to put in time to figure out how to use their GRU api.

Re: [OMPI devel] RFC: remove --disable-smp-locks

2015-01-06 Thread Howard Pritchard
I agree. Please remove this config option. 2015-01-06 9:44 GMT-07:00 Nathan Hjelm : > > What: Remove the --disable-smp-locks configure option from master. > > Why: Use of this option produces incorrect results/undefined behavior > when any shared memory BTL is in use. Since BTL

Re: [OMPI devel] Changed behaviour with PSM on master

2015-01-09 Thread Howard Pritchard
Hi Adrian and Andrew, I'm able to reproduce your problem on one of our qlogic clusters. We are using PSM 1.14 and slurm. I'm noticing that for some reason in our setup the ORTE_MCA_orte_precondition_transports env. variable is not being set. Could you run your test with --mca odls_base_verbose

Re: [OMPI devel] Changed behaviour with PSM on master

2015-01-09 Thread Howard Pritchard
HI Folks, Sorry for my stupidity. I now see the problem. App is calling pmi_init twice because of the new ofiwg libfabric mtl. You can try mpirun blah blah blah --mca btl and things should work. Howard 2015-01-09 7:52 GMT-07:00 Friedley, Andrew : > No this is

Re: [OMPI devel] Changed behaviour with PSM on master

2015-01-09 Thread Howard Pritchard
Hi Adrian, Andrew, Sorry, let me try again: both the libfabric psm provider and the open mpi psm mtl are trying to use psm_init. So, to avoid this problem, add --mca mtl psm to your mpirun command line. Sorry for the confusion. Howard 2015-01-09 7:52 GMT-07:00 Friedley, Andrew

Re: [OMPI devel] Changed behaviour with PSM on master

2015-01-09 Thread Howard Pritchard
ways have > to provide '--mca mtl psm' in the future? > > On Fri, Jan 09, 2015 at 12:27:59PM -0700, Howard Pritchard wrote: > > HI Adrian, Andrew, > > > > Sorry try again, both the libfabric psm provider and the open mpi psm > > mtl are trying to use psm_in

Re: [OMPI devel] Changed behaviour with PSM on master

2015-01-11 Thread Howard Pritchard
ject: Re: [OMPI devel] Changed behaviour with PSM on master > > +1 -- someone should file a bug. > > I think Intel needs to decide how they want to handle this (e.g., whether > the PSM MTL or OFI MTL should be the default, and how the other can detect > if it's not the default a

Re: [OMPI devel] #327

2015-01-11 Thread Howard Pritchard
HI George, Thanks for the feedback. This PR was only to address one piece ( a first step) for ways to handle thread based progression for RDMA capable nics within OMPI. It by no means represents a complete solution. That more complete solution was what I understood we were interested in in the

Re: [OMPI devel] Coll ML issues

2015-01-25 Thread Howard Pritchard
Hi George, I put this on the agenda for this week's meeting. Howard 2015-01-23 16:43 GMT-07:00 George Bosilca : > During some experiments we have identified several major issues with coll > ML with a very recent version of Open MPI master (22ab638 Jan 20 13:21:44). >

Re: [OMPI devel] When libltdl is not your friend

2015-02-02 Thread Howard Pritchard
Hi Paul, Thanks for checking in depth into this. Just to help in determining how to proceed, which national center is this? Howard 2015-02-02 19:35 GMT-07:00 Paul Hargrove : > Below is one example of what happens when you assume that you can trust > the libltdl installed

Re: [OMPI devel] omni-release Github comment bot

2015-02-04 Thread Howard Pritchard
+1 great stuff 2015-02-04 5:55 GMT-07:00 Jeff Squyres (jsquyres) : > OMPI devs -- > > Per lots of previous discussions, you all know that you can't assign > labels, milestones, or users to issues/pull requests on the ompi-release > repo. > > Gilles has written a Github bot

Re: [OMPI devel] omni-release Github comment bot

2015-02-05 Thread Howard Pritchard
ne pushes new commits to a > PR after it has been rm-approved, do you want the rm-approved label > removed? My gut feeling is "no" -- it stays approved. > > Thoughts? > > > > On Feb 4, 2015, at 2:26 PM, Howard Pritchard <hpprit...@gmail.com> wrote: > >

[OMPI devel] turning the bot on for ompi-release?

2015-02-05 Thread Howard Pritchard
HI Jeff and Gilles Do we have an ETA for enabling the bot on ompi-release? I think it will be a great help. Howard

Re: [OMPI devel] ess:alps build failure with PGI

2015-02-09 Thread Howard Pritchard
HI Paul, I'll fix this. Howard 2015-02-06 17:38 GMT-07:00 Paul Hargrove : > The following in orte/mca/ess/alps/Makefile.am assumes a GNU (or GNU-like) > compiler: > > mca_ess_alps_la_CPPFLAGS = $(ess_alps_CPPFLAGS) -fno-ident > > If building with PGI, the result is >

Re: [OMPI devel] OMPI devel] RoCE plus QDR IB tunable parameters

2015-02-10 Thread Howard Pritchard
HI George, I'd say commit cf377db82 explains the vanishing of the bandwidth metric as well as the mis-labeling of the latency metric. Howard 2015-02-10 18:41 GMT-07:00 George Bosilca : > Somehow one of the most basic information about the capabilities of the > BTLs

[OMPI devel] MTT failures

2015-02-18 Thread Howard Pritchard
Hi Folks, I noticed that the NERSC (carver/edison) MTT smoke tests are failing now. I see a lot of ivy cluster runs are also failing. All the NERSC runs are failing with: [c1479:05071] OPAL ERROR: Bad parameter in file util/attr.c at line 431 [c1479:05071] [[57033,0],0] ORTE_ERROR_LOG: Bad

Re: [OMPI devel] git commit id in coverity

2015-02-19 Thread Howard Pritchard
HI Ralph, How does one get this "MPI Create success" message? Is there a mailing list specifically for the nightly builds? Thanks, Howard 2015-02-16 21:48 GMT-07:00 Ralph Castain : > It's the git id of the nightly tarball - which you should get via the MPI > Create

Re: [OMPI devel] Tues Mar 3rd telecon

2015-02-26 Thread Howard Pritchard
I will also be available but suggest we skip next Tuesday. On Feb 25, 2015 5:04 PM, "Ralph Castain" wrote: > Hey folks > > Given that some number of us will be at the MPI Forum next week, do we > have a quorum available for the weekly telecon? Who would be able to make > it?

[OMPI devel] opal_verbs_want_fork_support question

2015-02-26 Thread Howard Pritchard
Hi Folks, Just tried to build a fresh head of master and am getting opal_verbs_want_fork_support as undefined symbol when trying to build opal lib. Any ideas on where this should go? It would be nice to get jenkins checking everything, or at least a light weight travis check. Howard

[OMPI devel] psm and process affinity in open mpi

2015-03-03 Thread Howard Pritchard
Hi Folks, First, an initial disclaimer - I've looked through the open mpi faq and have been unable so far to find an answer to my question below. I've been having a discussion with one of the other trilab folks about some issues with using PSM within mvapich where the default cpu affinity behavior of PSM

Re: [OMPI devel] psm and process affinity in open mpi

2015-03-03 Thread Howard Pritchard
eady has the fix for you. > > > > Andrew > > > > *From:* devel [mailto:devel-boun...@open-mpi.org] *On Behalf Of *Howard > Pritchard > *Sent:* Tuesday, March 3, 2015 8:21 AM > *To:* Open MPI Developers List > *Subject:* [OMPI devel] psm and process affinity in open m

Re: [OMPI devel] libfabric code does not build with pgi-{10,11}

2015-03-05 Thread Howard Pritchard
Hi Paul, The carver mtt runs use the pgi 14.0. I'll take a look at 10 and 11 for libfabric. I don't know if anybody *cares*, but I find that the common:libfabric code compiles with pgi-{9,12,13,14} but not with pgi-{10,11}. These are versions 10.9 and 11.9 of pgi that built Open MPI 1.8.4 just

Re: [OMPI devel] libfabric code does not build with pgi-{10,11}

2015-03-05 Thread Howard Pritchard
HI Paul, For the 10.9 and 11.9 does the libfabric get configured to build for you on carver? I get a failure at config. I don't think this should be high priority since the libfabric embedding within open mpi should hopefully soon be a thing of the past. Howard 2015-03-04 14:28 GMT-07:00 Paul

Re: [OMPI devel] mpi_test_suite question

2015-03-06 Thread Howard Pritchard
tal mistakes in the test suite, (e.g. writing with a > file view, reading back without the file view and assuming that the byte > pattern is the same) that I can not easily fix. > > Thanks > Edgar > > On 3/6/2015 12:03 PM, Howard Pritchard wrote: > >> Hi Folks, >>

Re: [OMPI devel] jenkins and openmpi

2015-03-09 Thread Howard Pritchard
The Batman lamp, a great idea!! 2015-03-09 10:11 GMT-06:00 Mike Dubman : > > Hello, > > Please check updated OMPI wiki page for detailed information for Jenkins > testing of OMPI repositories. > > https://github.com/open-mpi/ompi/wiki/PRJenkins > > Comments and

[OMPI devel] f08ts

2015-03-10 Thread Howard Pritchard
Hi Folks, If you're one of those souls unfortunate enough to know about f08ts read on, otherwise ignore this email. It looks like Open MPI is missing all of the f08ts fortran interfaces. I notice the MPICH master does have such interfaces defined. Is there a historical reason why we don't

Re: [OMPI devel] BML changes

2015-03-11 Thread Howard Pritchard
My experience with DMA engines located on the other side of a PCI-e 16x gen3 bus from the cpus is that for a couple of ranks doing large transfers between each other on a node, using the DMA engine looks good. But once there are multiple ranks exchanging data (like up to 32 ranks on a dual socket

[OMPI devel] dlclose of libmpi, java gc, and pthread_key destructors

2015-04-06 Thread Howard Pritchard
Hi Folks, There seems to have been a recent outburst of interest in the mpi java bindings, so moving in retrograde fashion back to what I used to be doing, I've started investigating the Ompi JNI code. I'm noticing that at least on sles11sp3, that soon after the java vm invokes the JNI_OnUnload of

Re: [OMPI devel] v1.8.5 NEWS and README

2015-04-17 Thread Howard Pritchard
Hi Jeff Minor cray corrections below On Apr 17, 2015 6:57 AM, "Jeff Squyres (jsquyres)" wrote: > > The v1.8 branch NEWS, README, and VERSION files have been updated in preparation for the v1.8.5 release. Please double check them -- especially NEWS, particularly to ensure

[OMPI devel] mtt failures from last nite

2015-04-17 Thread Howard Pritchard
Hi Folks, I'm seeing build failures on both carver/pgi at nersc and on a cray internal machine with the nightly build of master. From the cray box: common_ugni.c:30:5: error: 'MCA_BASE_VERSION_2_0_0' undeclared here (not in a function) MCA_BASE_VERSION_2_0_0, common_ugni.c:31:5: warning:

Re: [OMPI devel] v1.8.5 NEWS and README

2015-04-17 Thread Howard Pritchard
Right on Paul! I can certainly get 1.8.5 to "work" on cray systems like hopper, but last time I tried out of the box I had to fix up the pmi in ess_pmi_module.c because with recent cray PMI's (like the ones now default on hopper), the configure ends up resulting in the use of PMI_KVS_Put/get,

[OMPI devel] noticing odd message

2015-04-20 Thread Howard Pritchard
Hi Folks, Working on master, I'm getting an odd message: malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c, 170) whenever I launch a job. It looks like this can be traced back to this line in orte_ess_singleton_component_register: mca_base_var_register_synonym(ret "orte",

Re: [OMPI devel] Fwd: OpenIB module initialisation causes segmentation fault when locked memory limit too low

2015-04-22 Thread Howard Pritchard
Hi Raphael, Thanks very much for the patches. Would one of the developers on the list have a system where they can make these kernel limit changes and which have HCAs installed? I don't have access to any system where I have such permissions. Howard 2015-04-22 8:55 GMT-06:00 Raphaël

Re: [OMPI devel] Fwd: OpenIB module initialisation causes segmentation fault when locked memory limit too low

2015-04-22 Thread Howard Pritchard
23e2d] >> [c1436:05774] [14] examples/ring_c[0x42dceb] >> [c1436:05774] [15] examples/ring_c[0x407285] >> [c1436:05774] [16] >> /lib64/libc.so.6(__libc_start_main+0xf4)[0x2b4107cd6994] >> [c1436:05774] [17] examples/ring_c[0x407129] >> [c1436:05774] *** End of error message *** >>

Re: [OMPI devel] Fwd: OpenIB module initialisation causes segmentation fault when locked memory limit too low

2015-04-22 Thread Howard Pritchard
Hi Raphael, I give you an A+ for effort. We always appreciate patches. Howard 2015-04-22 12:43 GMT-06:00 Nathan Hjelm : > > Umm, why are you cleaning up this way. The allocated resources *should* > be freed by the udcm_module_finalize call. If there is a bug in that > path

Re: [OMPI devel] Suggested README changes

2015-04-23 Thread Howard Pritchard
Hi Paul, Portals4 may be able to work on cray XE/XC on top of IAA (ibverbs simulation), but it absolutely is not the support library for Cray interconnects since XE days. Never was on Cray XT either, as you point out that was portals 3.X. Howard 2015-04-23 12:29 GMT-06:00 Paul Hargrove

[OMPI devel] romio refresh on master

2015-05-01 Thread Howard Pritchard
Hi Folks, I merged in the refresh of romio 3.1.4, special thanks to Gilles for doing this! I did some testing, but can't say it was extensive. If others would have time to run some of the MTT setups requesting romio rather than ompio for a bit that would be great. Thanks, Howard

[OMPI devel] is anyone seeing this on their intel/inifinipath cluster?

2015-05-01 Thread Howard Pritchard
Hi Folks, I'm doing some work with master on an intel/infinipath system and there are some odd undefined symbol errors showing up: /users/hpp/ompi_install/lib/libmca_common_libfabric.so.0: undefined symbol: psmx_eq_open anyone else seeing this on their intel/infinipath system? What's bizarre is

[OMPI devel] oops, jenkins mishap

2015-05-11 Thread Howard Pritchard
Hi Folks, Sorry for the comments pushed to PRs just a while ago. I was supposed to be configuring jenkins for a different project, not ompi. Sorry for the confusion. Howard

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-1731-g8e30579

2015-05-14 Thread Howard Pritchard
Is this by any chance associated with issue 579? 2015-05-14 20:49 GMT-06:00 Ralph Castain : > I'll look at the lines you cite, but that clearly isn't the problem we are > seeing here. I can verify that because the test case: > > mpirun -n 1 sleep 1000 > > does not open up any

Re: [OMPI devel] Open MPI collectives algorithm selection

2015-05-20 Thread Howard Pritchard
HI Gilles, First a disclaimer - I do not know what the intended design was nor where the design document for this feature is located. However, I would certainly prefer that if the communicator size wasn't specifically specified in the rule file, a fall back do-no-harm algorithm would be
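The preference stated above, falling back to a do-no-harm algorithm when the rule file has no entry for the communicator size, can be sketched as a simple lookup with a default. This is an illustrative sketch only and not the actual collective selection code: `size_rule_t`, `select_algorithm`, and `FALLBACK_ALGORITHM` are hypothetical names, and the rule table stands in for whatever the rule file parses into.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parsed rule: one algorithm choice per communicator size. */
typedef struct {
    int comm_size;  /* communicator size this rule applies to */
    int algorithm;  /* algorithm id chosen for that size */
} size_rule_t;

/* Hypothetical id of a conservative algorithm that is safe at any size. */
enum { FALLBACK_ALGORITHM = 0 };

/* Return the algorithm listed for comm_size, or the conservative
 * fallback when the rule table has no entry for that size. */
int select_algorithm(const size_rule_t *rules, size_t nrules, int comm_size)
{
    assert(rules != NULL || nrules == 0);
    for (size_t i = 0; i < nrules; i++) {
        if (rules[i].comm_size == comm_size) {
            return rules[i].algorithm;
        }
    }
    return FALLBACK_ALGORITHM;
}
```

An exact-match lookup is only one possible policy; a real implementation might instead pick the nearest listed size, which is exactly the design question the message raises.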

Re: [OMPI devel] Proposal: update Open MPI's version number and release process

2015-05-20 Thread Howard Pritchard
Hi Dave, > The other way to solve this issue would be to stop treating the master as > a general dumping ground for potentially unstable code where anyone can > just push any time they want. If we switched to using PRs for > (essentially) all code that goes into master as well, then we

[OMPI devel] ompi forking tomorrow

2015-06-15 Thread Howard Pritchard
Hi Folks, The plan is to fork ompi master tomorrow to a 2.0 branch. Last chance for any really good case that we should delay this by a day or two. Thanks, Howard

Re: [OMPI devel] ompi forking tomorrow

2015-06-15 Thread Howard Pritchard
Thanks for the heads up. taking a look. 2015-06-15 11:39 GMT-06:00 Ralph Castain <r...@open-mpi.org>: > You might take a gander at the MTT results first - they don’t look very > good on master :-( > > > > On Jun 15, 2015, at 10:14 AM, Howard Pritchard <hpprit...@gma

[OMPI devel] Fwd: MTT test has completed, status: failed

2015-06-24 Thread Howard Pritchard
: Aborted (6) [c1477:19137] Signal code: (-6) [c1476:07375] Signal: Aborted (6) c_ring: pml_ob1_component.c:308: mca_pml_ob1_component_fini: Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) (mca_pml_ob1_recvreq))->obj_magic_id' failed. -- Forwarded message -

Re: [OMPI devel] === CREATE FAILURE (dev-1979-g13425e7) ===

2015-06-26 Thread Howard Pritchard
Hi folks, I'm confused about this build failure. It should have been caught by the make distcheck IU jenkins project I would think. Should the IU jenkins project do something else beside make -j X distcheck to catch this problem? Or, did this problem happen because someone bypassed the PR

Re: [OMPI devel] === CREATE FAILURE (dev-1979-g13425e7) ===

2015-06-26 Thread Howard Pritchard
sorry, not true. look at the logs on IU. runs at 3:07 and 4:08 IU time. 2015-06-25 21:46 GMT-06:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>: > Howard -- > > The LANL distcheck jenkins hasn't been running all day. > > > > On Jun 25, 2015, at 8:33 PM, Howard Pri

[OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-06-29 Thread Howard Pritchard
Hi Folks, I'm seeing an error I've not seen before in the MTT runs on the ibm dataplex at NERSC. The mpirun launched jobs are failing with OMPI_PROC_BIND value is invalid errors. This is for the trivial ring tests. Is anyone else seeing these types of errors? Howard

Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-06-29 Thread Howard Pritchard
> Can you provide an MTT short URL to show the results? > > Or, if the MTT results are not on the community reporter, can you show a > bit more context in the output? > > > > On Jun 29, 2015, at 11:47 AM, Howard Pritchard <hpprit...@gmail.com> > wrote: > > >

Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-06-29 Thread Howard Pritchard
On Mon, Jun 29, 2015 at 1:19 PM, Jeff Squyres (jsquyres) < > jsquy...@cisco.com> wrote: > >> Ahh... it's OMP_PROC_BIND, not OMPI_PROC_BIND. >> >> Yes, Ralph just added this. >> >> I chatted with him about this on the phone moments ago; he's pretty sure >>

Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-06-29 Thread Howard Pritchard
te: >>> >>>> Ahh... it's OMP_PROC_BIND, not OMPI_PROC_BIND. >>>> >>>> Yes, Ralph just added this. >>>> >>>> I chatted with him about this on the phone moments ago; he's pretty >>>> sure he knows where to go look to

Re: [OMPI devel] OMPI_PROC_BIND value is invalid errors

2015-07-01 Thread Howard Pritchard
* <gpaul...@us.ibm.com> > [image: IBM] > > 1177 S Belt Line Rd > Coppell, TX 75019-4642 > United States > > > [image: Inactive hide details for Howard Pritchard ---06/29/2015 09:27:12 > PM---I decided just to disable the carver/pgi mtt runs. 2015-]Howard > Pritchar

[OMPI devel] getting v1.10 and v2.x nightly tarballs where?

2015-07-15 Thread Howard Pritchard
Hi Folks, I'm trying to locate on jaguar/www.open-mpi.org where the nightly tarballs are for v1.10 and v2.x. I'm needing these tarballs for our installs on some new systems at LANL where we want to start out with these versions. Thanks for any help, Howard

[OMPI devel] anyone built master on qlogic system today?

2015-07-22 Thread Howard Pritchard
Hello Folks, I'm investigating a psm/ofi mtl problem on one of our qlogic systems and ended up investigating something else. There seem to be a bunch of missing config.h.in files if I of current master head. If I go back to bd60ce16 things seem to be okay. The upshot is that one doesn't get

Re: [OMPI devel] anyone built master on qlogic system today?

2015-07-22 Thread Howard Pritchard
Hi Folks, Found the problem, had to do a hard reset to origin/master for some reason to get missing files back. Howard 2015-07-22 12:17 GMT-06:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>: > On Jul 22, 2015, at 1:46 PM, Howard Pritchard <hpprit...@gmail.com> wrote: >

Re: [OMPI devel] 1.10.0rc2

2015-07-24 Thread Howard Pritchard
looks like ofi mtl is being naughty. it's the only mtl which registers with opal progress in component init method. -- sent from my smart phonr so no good type. Howard On Jul 23, 2015 7:03 PM, "Ralph Castain" wrote: > It looks like one of the MTL components is

Re: [OMPI devel] 1.10.0rc2

2015-07-24 Thread Howard Pritchard
Paul Could you rerun with --mca mtl_base_verbose 10 added to cmd line and send output? Howard -- sent from my smart phonr so no good type. Howard On Jul 23, 2015 6:06 PM, "Paul Hargrove" wrote: > Yohann, > > With PR409 as it stands right now (commit 6daef310) I

Re: [OMPI devel] 1.10.0rc2

2015-07-24 Thread Howard Pritchard
8:19 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote: > Yohann -- > > Can you have a look? > > > > On Jul 24, 2015, at 10:15 AM, Howard Pritchard <hpprit...@gmail.com> > wrote: > > > > looks like ofi mtl is being naughty. its tje

Re: [OMPI devel] 1.10.0rc2

2015-07-24 Thread Howard Pritchard
Squyres (jsquyres) <jsquy...@cisco.com>: > I think Ralph answered this question: if you register a progress function > but then get your component unloaded without un-registering the progress > function... kaboom. > > > > On Jul 24, 2015, at 10:37 AM, Howard Pritchard <
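The failure mode quoted above (a progress function that stays registered after its component is unloaded) comes down to pairing every registration with an unregistration. The toy registry below is a self-contained sketch of that pairing; `progress_register`, `progress_unregister`, `progress_poll`, and the `component_*` functions are hypothetical stand-ins, not Open MPI's actual progress engine.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*progress_fn_t)(void);

enum { MAX_CALLBACKS = 8 };
static progress_fn_t callbacks[MAX_CALLBACKS];
static size_t ncallbacks = 0;

/* Add fn to the polling list; returns 0 on success, -1 if the list is full. */
int progress_register(progress_fn_t fn)
{
    assert(fn != NULL);
    if (ncallbacks == MAX_CALLBACKS) {
        return -1;
    }
    callbacks[ncallbacks++] = fn;
    return 0;
}

/* Remove fn from the polling list; returns -1 if it was not registered. */
int progress_unregister(progress_fn_t fn)
{
    for (size_t i = 0; i < ncallbacks; i++) {
        if (callbacks[i] == fn) {
            callbacks[i] = callbacks[--ncallbacks];
            return 0;
        }
    }
    return -1;
}

/* Call every registered function once and sum the reported event counts. */
int progress_poll(void)
{
    int events = 0;
    for (size_t i = 0; i < ncallbacks; i++) {
        events += callbacks[i]();
    }
    return events;
}

/* A component's progress function; here it just reports one event. */
static int component_progress(void)
{
    return 1;
}

/* Registration done at component start-up ... */
int component_init(void)
{
    return progress_register(component_progress);
}

/* ... must be undone before the component's code is unloaded, otherwise
 * the next poll would jump through a pointer into unmapped memory. */
int component_close(void)
{
    return progress_unregister(component_progress);
}
```

In this toy the code is never actually unloaded, so skipping `component_close` would not crash; with a dlopen'd plugin, the stale pointer left in the list is what produces the "kaboom".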
