Re: [hwloc-devel] lstopo-nox strikes back

2012-04-25 Thread Jeffrey Squyres
FWIW: Having lstopo plugins for output would obviate the need for having two executable names. On Apr 25, 2012, at 9:42 AM, Jiri Hladky wrote: > Hello, > > I would strongly vote to split the hwloc package to the core (ASCII only, > including ASCII only version of lstopo ) package and GUI

[OMPI devel] Fwd: GNU autoconf 2.69 released [stable]

2012-04-25 Thread Jeffrey Squyres
There are a number of new Autoconf macros that would be useful for OMPI's Fortran configury. Meaning: we have klugearounds in our existing configury, but the new AC 2.69 macros are Better. How would people feel about upgrading the autoconf requirement on the trunk to AC 2.69? (Terry: please

Re: [OMPI devel] How to debug segv

2012-04-25 Thread Jeffrey Squyres
Another thing to try is to load up the core file in gdb and see if that gives you a valid stack trace of where exactly the segv occurred. On Apr 25, 2012, at 9:30 AM, Alex Margolin wrote: > On 04/25/2012 02:57 PM, Ralph Castain wrote: >> Strange that your code didn't generate any symbols - is

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26329

2012-04-24 Thread Jeffrey Squyres
Ok. :-) On Apr 24, 2012, at 4:47 PM, Nathan Hjelm wrote: > This was RFC'd last month. No one objected :) > > -Nathan > > On Tue, 24 Apr 2012, Jeffrey Squyres wrote: > >> There's some pretty extensive ob1 changes in here. >> >> Can we get these reviewed?

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26329

2012-04-24 Thread Jeffrey Squyres
There's some pretty extensive ob1 changes in here. Can we get these reviewed? Brian / George? On Apr 24, 2012, at 4:18 PM, hje...@osl.iu.edu wrote: > Author: hjelmn > Date: 2012-04-24 16:18:56 EDT (Tue, 24 Apr 2012) > New Revision: 26329 > URL:

Re: [OMPI devel] RFC: opal_cache_line_size

2012-04-23 Thread Jeffrey Squyres
On Apr 23, 2012, at 5:53 PM, George Bosilca wrote: > However, I did a quick grep and most of our headers are larger than a single > line of cache (even Itanium L2) so I suppose that making opal_cache_line_size > equal to the L2 cache line size will not be too big a waste of memory overall.
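
The memory-waste question in this message comes down to simple rounding arithmetic. The following is only a rough sketch of that arithmetic, not Open MPI code: the 128-byte line size and the header sizes are made-up values for illustration, and `padded_size` / `padding_waste` are hypothetical helper names.

```python
def padded_size(size: int, line_size: int) -> int:
    """Round size (bytes) up to the next multiple of line_size (bytes)."""
    if size < 0 or line_size <= 0:
        raise ValueError("size must be >= 0 and line_size must be > 0")
    return ((size + line_size - 1) // line_size) * line_size


def padding_waste(size: int, line_size: int) -> int:
    """Bytes of padding added when size is rounded up to line_size."""
    return padded_size(size, line_size) - size


if __name__ == "__main__":
    # Hypothetical header sizes; the real ones would come from the code base.
    assumed_line_size = 128
    for header_bytes in (136, 200, 256):
        print(header_bytes,
              padded_size(header_bytes, assumed_line_size),
              padding_waste(header_bytes, assumed_line_size))
```

Running this over the real structure sizes would show how much padding a given cache line size actually costs.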

Re: [OMPI devel] RFC: opal_cache_line_size

2012-04-23 Thread Jeffrey Squyres
> On Apr 23, 2012, at 16:21 , Jeffrey Squyres wrote: > >> No one replied to this RFC. Does anyone have an opinion about it? >> >> I have attached a patch (including some debugging output) showing my initial >> implementation. If no one objects by the end of this

Re: [OMPI devel] RFC: opal_cache_line_size

2012-04-23 Thread Jeffrey Squyres
, at 1:09 PM, Jeffrey Squyres wrote: > I was just recently reminded of a comment that is near the top of > opal_init_util(): > >/* JMS See note in runtime/opal.h -- this is temporary; to be > replaced with real hwloc information soon (in trunk/v1.5 and > beyond,

Re: [OMPI devel] Fortran linking problem: libraries have changed

2012-04-23 Thread Jeffrey Squyres
On Apr 23, 2012, at 1:40 AM, Eugene Loh wrote: >> [rhc@odin001 ~/svn-trunk]$ mpifort --showme >> gfortran -I/nfs/rinfs/san/homedirs/rhc/openmpi/include >> -I/nfs/rinfs/san/homedirs/rhc/openmpi/lib >> -L/nfs/rinfs/san/homedirs/rhc/openmpi/lib -lmpi_usempi -lmpi_mpifh -lmpi >> -lopen-rte

[OMPI devel] RFC: removing maffinity, paffinity, carto frameworks

2012-04-21 Thread Jeffrey Squyres
WHAT: Remove 3 outdated frameworks: maffinity, paffinity, carto WHY: Their functionality is wholly replaced by hwloc. Removing these frameworks has actually been a to-do item since we made hwloc a 1st-class citizen in OMPI. WHERE: rm -rf opal/mca/[maffinity, paffinity, carto], and update

Re: [OMPI devel] testing if Fortran compiler likes the C++ exception flags

2012-04-21 Thread Jeffrey Squyres
Oops! Sorry about that -- fixed in r26309. On Apr 20, 2012, at 10:01 PM, Eugene Loh wrote: > I think this is related to the "Fortran merge." > > Last night, Oracle MTT tests couldn't build the trunk (r26307) with Intel > compilers. Specifically, configure fails with > >checking to see

Re: [OMPI devel] [PATCH] Open MPI on ARMv5

2012-04-19 Thread Jeffrey Squyres
Thanks Evan! (sorry for the delay in replying -- I was on vacation all last week and I'm *still* catching up...) Lief -- does this look good to you? On Apr 13, 2012, at 11:13 PM, Evan Clinton wrote: > At present Open MPI only supports ARMv7 processors. Attached is a > patch against current

[OMPI devel] After svn up...

2012-04-19 Thread Jeffrey Squyres
You may have missed it at the bottom of my mail last night, but after running "svn up" to get all the new Fortran stuff, you might want to run these commands to clean out cruft that "svn up" may not remove: rm -f ompi/mpiext/affinity/OMPI_Affinity_str.3 rm -rf ompi/mpiext/example/f77 rm -f

[OMPI devel] Fortran merge complete

2012-04-18 Thread Jeffrey Squyres
AFAICT, everything should be working now. It took me longer than expected today to deal with configury for older Fortran compilers. Yuck! It looks like the nightly tarball built ok; we'll get some MTT results to look at for the morning. Please let me know if you have any problems. Some

[OMPI devel] SVN quiet time (Fortran merge)

2012-04-18 Thread Jeffrey Squyres
I am starting the Fortran merge. Please hold off on committing to SVN or updating from SVN until I have completed the merge. Remember that this will almost certainly take multiple SVN commits, and the trunk will be unstable until I have finished. Thanks for your patience. -- Jeff Squyres

Re: [OMPI devel] RFC: New Fortran bindings

2012-04-17 Thread Jeffrey Squyres
eakage after I'm all done, too -- you'll be testing configurations that we have not, so post-commit bug fixing will be inevitable. Please be patient with me while we slog through these issues. Thanks! On Apr 5, 2012, at 11:37 AM, Jeffrey Squyres wrote: > WHAT: Revamp the entire MPI Fortran bindin

[OMPI devel] Fwd: Non-zero exit status

2012-04-16 Thread Jeffrey Squyres
Can we add this to the agenda tomorrow? Begin forwarded message: > From: Ralph Castain > Subject: Re: [OMPI devel] Non-zero exit status > Date: April 13, 2012 6:40:53 PM EDT > To: Open MPI Developers > Reply-To: Open MPI Developers >

Re: [OMPI devel] [EXTERNAL] Re: Developers Meeting

2012-04-16 Thread Jeffrey Squyres
l K wrote: >>>> >>>> My vote is for San Jose. >>>> >>>> Sam >>>> >>>> >>>> From: devel-boun...@open-mpi.org [devel-boun...@open-mpi.org] on behalf of >>>> Josh

Re: [OMPI devel] OpenMPI and R

2012-04-06 Thread Jeffrey Squyres
On Apr 5, 2012, at 9:07 PM, Benedict Holland wrote: > Oh how interesting and I hope this helps someone. Following another link, I > had to use: > > ./configure --prefix /usr --enable-shared --enable-static This makes sense. You were falling victim to the fact that R dlopens libmpi as a

Re: [OMPI devel] [patch] Bugs in mpi-f90-interfaces.h and its bridge implementation

2012-04-06 Thread Jeffrey Squyres
On Apr 6, 2012, at 7:09 AM, Kawashima wrote: > I've checked your code in bitbucket. Two types of error are found. > I've attached the patch. > > First one (ignore-tkr) seems to be an error by manual patching. > Second one (tkr) seems that patch command could not apply my fixes > because

Re: [OMPI devel] RFC: New Fortran bindings

2012-04-05 Thread Jeffrey Squyres
" (or "em pee eye eff h" or "em-piff-dot-h" for short) - "use mpi" (or "mpi module") - "mpi_f08" (no need to say "use" or "module" here) > I'm willing to help test/debug, but I don't know enough about the new > {for

[OMPI devel] RFC: New Fortran bindings

2012-04-05 Thread Jeffrey Squyres
WHAT: Revamp the entire MPI Fortran bindings; new "mpifort" wrapper compiler WHY: Much better mpi module implementation; addition of MPI-3 mpi_f08 module WHERE: Remove ompi/mpi/f77 and ompi/mpi/f90, replace with ompi/mpi/fortran TIMEOUT: Teleconf, Tue Apr 17, 2012

Re: [OMPI devel] Intel test MPI_Keyval3_f now failing

2012-04-05 Thread Jeffrey Squyres
I'm able to duplicate the problem, but I don't know if this is worth digging in to. The entire Fortran bindings will be replaced in about 2 weeks, and the problem doesn't occur on my mpi3-fortran bitbucket. On Apr 5, 2012, at 7:03 AM, TERRY DONTJE wrote: > I noticed both IU and Oracle are

Re: [OMPI devel] [patch] Bugs in mpi-f90-interfaces.h and its bridge implementation

2012-04-04 Thread Jeffrey Squyres
On Apr 3, 2012, at 10:56 PM, Kawashima wrote: > I and my coworkers checked mpi-f90-interfaces.h against MPI 2.2 standard > and found many bugs in it. Attached patches fix them for trunk. > Though some of them are trivial, others are not so trivial. > So I'll explain them below. Excellent -- many

Re: [OMPI devel] mca_btl_tcp_alloc

2012-04-04 Thread Jeffrey Squyres
+1 on Rolf's explanation. Additionally, note that you don't have to do it this way. You can implement yours in whatever style you want; this is just the style we used for the TCP BTL. On Apr 4, 2012, at 10:18 AM, Rolf vandeVaart wrote: > Here is my explanation. The call to

Re: [OMPI devel] Set alignment for openib internal buffers

2012-04-04 Thread Jeffrey Squyres
Mellanox is re-working the patch; the original commit violated several abstractions. I hope they'll have a new patch soon, but I don't know the exact timeframe. On Apr 4, 2012, at 4:19 AM, ludovic.hab...@ext.bull.net wrote: > Hi everybody, > > I've seen that some changes have been committed

Re: [OMPI devel] Developers Meeting

2012-04-03 Thread Jeffrey Squyres
On Apr 3, 2012, at 11:44 AM, Barrett, Brian W wrote: > There is discussion of attempting to have a developers meeting this > summer. We haven't had one in a while and people thought it would be good > to work through some of the ideas on how to implement features for 1.7. > We don't have a

Re: [OMPI devel] Regarding the Installation of Open MPI in Amazon EC2 cloud by using UNIVA cluster

2012-04-02 Thread Jeffrey Squyres
I can't quite parse your question. In general, if libmpi.so is not in your default linker search path, you probably need to add it to your LD_LIBRARY_PATH (e.g., in your shell startup file, such as .bashrc). This will be true regardless of whether you are using real machines or virtual
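
As a rough illustration of the LD_LIBRARY_PATH advice above, here is a minimal sketch that prepends a library directory to the search path of an environment mapping (for example, before launching a child process). The directory `/opt/openmpi/lib` is a made-up install location, and `with_library_dir` is a hypothetical helper, not part of Open MPI.

```python
import os


def with_library_dir(env, libdir):
    """Return a copy of env with libdir prepended to LD_LIBRARY_PATH."""
    new_env = dict(env)
    current = new_env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries so we never emit a stray leading or trailing colon.
    parts = [p for p in current.split(":") if p]
    if libdir not in parts:
        parts.insert(0, libdir)
    new_env["LD_LIBRARY_PATH"] = ":".join(parts)
    return new_env


if __name__ == "__main__":
    # Assumed install prefix; substitute wherever libmpi.so actually lives.
    env = with_library_dir(os.environ, "/opt/openmpi/lib")
    print(env["LD_LIBRARY_PATH"])
```

The same effect in a shell startup file is a single export line; the helper just makes the prepend-if-missing logic explicit.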

[OMPI devel] RFC: opal_cache_line_size

2012-03-30 Thread Jeffrey Squyres
I was just recently reminded of a comment that is near the top of opal_init_util(): /* JMS See note in runtime/opal.h -- this is temporary; to be replaced with real hwloc information soon (in trunk/v1.5 and beyond, only). This *used* to be a #define, so it's important

Re: [OMPI devel] Openmpi-1.5.3 issue " initialization failure on /dev/ipath (err=23)"

2012-03-29 Thread Jeffrey Squyres
or any other ways... > > Regards > Raju... > > On Thu, Mar 29, 2012 at 8:58 PM, Jeffrey Squyres <jsquy...@cisco.com> wrote: > This looks like a PSM problem (PSM is the layer that runs below Open MPI on > QLogic NICs). You might need to contact QLogic tech support to f

Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4417

2012-03-29 Thread Jeffrey Squyres
On Mar 27, 2012, at 4:55 PM, Brice Goglin wrote: > Did you already test this within OMPI/hwloc1.3 ? > I am running some tests here, no problem with different kernels, mlx4_0 > and qib0 hardware, 1 or 2 ports so far. FWIW: I have tested it via lstopo -v on a bunch of my nodes with IB, and the

Re: [OMPI devel] Openmpi-1.5.3 issue " initialization failure on /dev/ipath (err=23)"

2012-03-29 Thread Jeffrey Squyres
This looks like a PSM problem (PSM is the layer that runs below Open MPI on QLogic NICs). You might need to contact QLogic tech support to find out how to solve it. On Mar 29, 2012, at 11:26 AM, Raju wrote: > Hi Ralph, > > I recompiled OMPI with --with-tm option, but still same issue... I

[OMPI devel] opal/mca/common: you can remove this directory

2012-03-29 Thread Jeffrey Squyres
FYI: The opal/mca/common directory had been functionally empty for a while, so I "svn rm"'ed it last week or so. However, if you svn up, SVN will likely still leave that directory around because it probably contains a Makefile and Makefile.in. It is safe to rm -rf this entire tree and

Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4417

2012-03-27 Thread Jeffrey Squyres
uff to stable branches before I am > sure they work fine on trunk :) We haven't even tested this on many > Linux kernels and OFED hardware yet. > > The risk is very low here. So if you need it, I can certainly backport it. > > Brice > > > > Le 27/03/2012 22:27, J

Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4417

2012-03-27 Thread Jeffrey Squyres
Brice -- Is there a reason to not bring this to v1.4? On Mar 21, 2012, at 3:29 AM, bgog...@osl.iu.edu wrote: > Author: bgoglin > Date: 2012-03-21 03:29:17 EDT (Wed, 21 Mar 2012) > New Revision: 4417 > URL: https://svn.open-mpi.org/trac/hwloc/changeset/4417 > > Log: > Add Port info attribute

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26180

2012-03-22 Thread Jeffrey Squyres
From the context of the code, I'm assuming it's supposed to be MPI_SOURCE. I'll commit shortly. On Mar 22, 2012, at 7:54 PM, Ralph Castain wrote: > Yo Brian > > I believe you have an error in this commit: > > pml_ob1_iprobe.c:113: error: 'ompi_status_public_t' has no member named >

Re: [OMPI devel] MPI_Init_thread problem on ubuntu ARM (open-mpi 1.4.3)

2012-03-22 Thread Jeffrey Squyres
We did not support ARM until Open MPI 1.5.x. On Mar 21, 2012, at 7:07 AM, Juan Solano wrote: > > Hello, > > I have a problem using Open MPI on my linux system (pandaboard running > Ubuntu precise). A call to MPI_Init_thread with the following parameters > hangs: > > MPI_Init_thread(0, 0,

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 5:30 PM, Brice Goglin wrote: >> Does the new patch add port numbers at all if /device/infiniband >> doesn't exist? > > No. For each ethX, the hwloc "ethX" object will only get a Port number > if the corresponding sysfs device has some infiniband "child". > Otherwise, no Port

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 5:07 PM, Brice Goglin wrote: > New patch attached, it doesn't add port numbers for non-IB devices. Does the new patch add port numbers at all if /device/infiniband doesn't exist? I.e., is the dev_id/port number irrelevant if it's not an OpenFabrics device? -- Jeff

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 3:45 PM, Brice Goglin wrote: > That looks good to me, as long as starting port numbers at 1 for > non-IB/OFED is OK. Hmm. Not sure about that. I always thought it was strange that IB devices started with port 1. Are *we* (hwloc) supplying the port number, or are you

[OMPI devel] 1.5.5rc5 is released

2012-03-20 Thread Jeffrey Squyres
Only one change since yesterday: http://www.open-mpi.org/software/ompi/v1.5/ - Fix a Mellanox/FCA issue -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 3:29 PM, Brice Goglin wrote: > By the way, do you want Port numbers to start at 0 or 1? IIRC, IB (and probably RoCE) port numbers start with 1. Shrug. So let's report whatever they report. The sample output you showed looks perfect to me. Is your patch small enough to

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 3:09 PM, Brice Goglin wrote: > Looks like we just need to read /sys/class/net/ib*/dev_id, make that > decimal, add one, and we get the port number. > > How would you like this to appear in the topology? Is an object info such > as "Port=%d" in each network interface in an OFED
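
The dev_id recipe quoted in this message (read the file, convert to decimal, add one) can be sketched as follows. This is a minimal illustration rather than the hwloc patch itself: it assumes dev_id holds a hexadecimal string such as `0x0`, and `port_from_dev_id` and `ib_ports` are hypothetical names.

```python
from pathlib import Path


def port_from_dev_id(dev_id_text: str) -> int:
    """Convert the text of a dev_id file to a 1-based port number.

    Assumes the file holds a hexadecimal value such as "0x0".
    """
    return int(dev_id_text.strip(), 16) + 1


def ib_ports(sysfs_net: str = "/sys/class/net") -> dict:
    """Map each ib* network interface name to its derived port number."""
    ports = {}
    for dev_id_file in sorted(Path(sysfs_net).glob("ib*/dev_id")):
        ports[dev_id_file.parent.name] = port_from_dev_id(dev_id_file.read_text())
    return ports


if __name__ == "__main__":
    # Prints nothing on machines without ib* interfaces.
    for name, port in ib_ports().items():
        print(f"{name}: Port={port}")
```

The output format mirrors the "Port=%d" info attribute proposed in the thread; whether other device classes follow the same dev_id convention is not settled by this message.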

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 12:02 PM, Brice Goglin wrote: > Actually, what we don't know is how to map that to port 1/2 (we have > ib0/ib1 mac addresses, those are = GUID+1/2 on my machine) Yes, that is more correctly stated. A Mellanox guy pointed me to the ibdev2netdev script in current OFED

Re: [hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On Mar 20, 2012, at 10:46 AM, Brice Goglin wrote: >> Is there a way in the hwloc topology data to tell which port eth0 and eth1 >> correspond to? > > You should have a "Address" info attribute in each eth object containing > something like

Re: [hwloc-devel] [hwloc-svn] svn:hwloc r4409

2012-03-20 Thread Jeffrey Squyres
Samuel: What do you think of this patch? It separates out the individual version checking to make the #define logic a little easier to read. Index: include/hwloc/autogen/config.h.in === --- include/hwloc/autogen/config.h.in

[hwloc-devel] PCI device name question

2012-03-20 Thread Jeffrey Squyres
On a machine I have, I'm getting output like this with hwloc trunk: PCIBridge PCI 15b3:6750 Net L#11 "eth0" Net L#12 "eth1" OpenFabrics L#13 "mlx4_0" which is all well and good (mlx4_0 is a RoCE card). Is there a way in the hwloc topology data to tell which

Re: [hwloc-devel] trunk build problem

2012-03-20 Thread Jeffrey Squyres
m'. make[2]: *** No rule to make target `doxygen-doc/man/man3/hwloc_obj_cache_type_e.3', needed by `all-am'. make[2]: *** No rule to make target `doxygen-doc/man/man3/hwloc_obj_cache_type_t.3', needed by `all-am'. On Mar 20, 2012, at 7:50 AM, Jeffrey Squyres wrote: > FYI: > > m

[hwloc-devel] trunk build problem

2012-03-20 Thread Jeffrey Squyres
FYI: make[2]: *** No rule to make target `doxygen-doc/man/man3/HWLOC_TOPOLOGY_FLAG_ICACHES.3', needed by `all-am'. Stop. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] Replacing poll()

2012-03-19 Thread Jeffrey Squyres
On Mar 19, 2012, at 3:35 PM, Alex Margolin wrote: > I've removed put and get from mosix (feels good to cut down on code > lines...), but now the question has to be asked: > There are send and sendi (is sendi sufficient, or must I include send as > well?) send is sufficient. sendi is an

[OMPI devel] 1.5.5rc4 posted

2012-03-19 Thread Jeffrey Squyres
We're getting close. I swear! 1.5.5rc4 is posted: http://www.open-mpi.org/software/ompi/v1.5/ Fixes since rc3: - MXM fixes - Coll tuned fixes for large data (> 2GB) - Fix "external" hwloc component build - Fix pmi modex so local/node ranks are correctly assigned - Print error when mpool

Re: [OMPI devel] RFC: ob1: fallback on put/send on rget failure

2012-03-19 Thread Jeffrey Squyres
George / Brian -- Can you guys comment on this patch? On Mar 15, 2012, at 5:07 PM, Nathan Hjelm wrote: > What: Update ob1 to do the following: > - fallback on send after rdma_put_retries_limit failures of prepare_dst > - fallback on put (single non-pipelined) if the btl returns >

[OMPI devel] Fwd: [hwloc-devel] possible membind changes coming in the Linux kernel

2012-03-16 Thread Jeffrey Squyres
This isn't strictly related to Open MPI, but all of us here care about NUMA, locality, and performance, so I thought I'd pass along something that Brice forwarded to the hwloc-devel list. See Brice's note below, and the original mail to the LKML below that. Begin forwarded message: > From:

[hwloc-devel] topology-x86.c warning

2012-03-15 Thread Jeffrey Squyres
I found this warning in OMPI 1.5: CC topology-x86.lo topology-x86.c: In function 'look_proc': topology-x86.c:189: warning: 'ways' may be used uninitialized in this function On the hwloc trunk, the ways variable is not initialized, and there's an "if" block where one of the branches

Re: [OMPI devel] poor btl sm latency

2012-03-15 Thread Jeffrey Squyres
On Mar 15, 2012, at 8:06 AM, Matthias Jurenz wrote: > We made a big step forward today! > > The kernel in use has a bug regarding the shared L1 instruction cache in AMD > Bulldozer processors: > See >

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-13 Thread Jeffrey Squyres
I would like to understand this more. Let's talk about it tomorrow on the weekly teleconf. On Mar 9, 2012, at 5:55 PM, Nathan Hjelm wrote: > I tested my grdma mpool with the openib btl and IMB Alltoall/Alltoallv on a > system that consistently hangs. If I give the connection module the

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Jeffrey Squyres
On Mar 9, 2012, at 1:32 PM, Nathan Hjelm wrote: > An mpool that is aware of local processes lru's will solve the problem in > most cases (all that I have seen) I agree -- don't let words in my emails make you think otherwise. I think this will fix "most" problems, but undoubtedly, some will

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Jeffrey Squyres
On Mar 9, 2012, at 1:14 PM, George Bosilca wrote: >> The hang occurs because there is nothing on the lru to deregister and >> ibv_reg_mr (or GNI_MemRegister in the uGNI case) fails. The PML then puts >> the request on its rdma pending list and continues. If any message comes in >> the rdma

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Jeffrey Squyres
vel, the PML (in the mca_pml_ob1_send_request_start function) > intercept it and insert the request into a pending list. Later on this > pending list will be examined and the request for resource re-issued. > > Why do we need to trigger a BTL_ERROR for OUT_OF_RESOURCES? > > george. > > On Mar 6, 20

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26119

2012-03-08 Thread Jeffrey Squyres
Can you add this to NEWS, please? Thanks. On Mar 8, 2012, at 5:02 PM, jjhur...@osl.iu.edu wrote: > Author: jjhursey > Date: 2012-03-08 17:02:28 EST (Thu, 08 Mar 2012) > New Revision: 26119 > URL: https://svn.open-mpi.org/trac/ompi/changeset/26119 > > Log: > Create an MCA parameter

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-06 Thread Jeffrey Squyres
Mike -- I would make this error message a bit better. I.e., use orte_show_help(), so you can explain the issue more, and also remove all duplicates (i.e., if it fails to register multiple times). On Mar 6, 2012, at 8:25 AM, mi...@osl.iu.edu wrote: > Author: miked > Date: 2012-03-06 09:25:56

Re: [OMPI devel] Replacing poll()

2012-03-02 Thread Jeffrey Squyres
ge passing. On Mar 2, 2012, at 2:22 PM, Alex Margolin wrote: > > On 03/02/2012 04:33 PM, Jeffrey Squyres wrote: >> Note that the OMPI 1.4.x series is about to be retired. If you're doing new >> stuff, I'd advise you to be working with the Open MPI SVN trunk. In the >>

Re: [OMPI devel] Replacing poll()

2012-03-02 Thread Jeffrey Squyres
Note that the OMPI 1.4.x series is about to be retired. If you're doing new stuff, I'd advise you to be working with the Open MPI SVN trunk. In the trunk, we've changed how we build libevent, so if you're adding to it, you probably want to be working there for max forward-compatibility. That

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Jeffrey Squyres
Hah! I just saw your ticket about how --with-hwloc=/path/to/install is broken in 1.5.5. So -- let me go look in to that... On Mar 2, 2012, at 8:58 AM, Jeffrey Squyres wrote: > Ok. Good that there's no oversubscription bug, at least. :-) > > Did you see my off-list mail to you

Re: [OMPI devel] poor btl sm latency

2012-03-02 Thread Jeffrey Squyres
Ok. Good that there's no oversubscription bug, at least. :-) Did you see my off-list mail to you yesterday about building with an external copy of hwloc 1.4 to see if that helps? On Mar 2, 2012, at 8:26 AM, Matthias Jurenz wrote: > To exclude a possible bug within the LSF component, I

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Jeffrey Squyres
...or in 1.5.5. How soon will you be able to tell if it fixes some hangs? On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote: > Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below). > If this fix addresses some hangs we are seeing on infiniband LANL might want > a 1.4.6

Re: [OMPI devel] Open MPI nightly tarballs suspended / 1.5.5rc3

2012-02-29 Thread Jeffrey Squyres
On Feb 28, 2012, at 6:40 PM, Paul H. Hargrove wrote: > Testing 1.5.5rc3 on a "representative sampling" of my many platforms looks > good. > In particular, I've retested various platforms that showed any significant > problems previously and found them to be fixed. > > Though minor, I do see

Re: [OMPI devel] 1.5.5rc2 missing a Mellanox PCI vendor ID

2012-02-29 Thread Jeffrey Squyres
bel and Sinai HCA > entries as well. > It is already listed for Hermon. > > -Paul > > On 2/23/2012 5:17 AM, Jeffrey Squyres wrote: >> We finally have 1.5.5rc2: >> >> http://www.open-mpi.org/software/ompi/v1.5/ >> >> Given the amount of testing we

[OMPI devel] OMPI tool CLI improvements

2012-02-29 Thread Jeffrey Squyres
I mentioned this on the call yesterday; here are some more details. There have been two improvements to OMPI's tools' CLI behavior recently. These are targeted at 1.7 and beyond, not 1.5/1.6. 1. Ralph committed a change to the general CLI parser such that if a CLI option is expecting an

Re: [OMPI devel] typo in a copyright message

2012-02-28 Thread Jeffrey Squyres
Fixed -- thanks! On Feb 28, 2012, at 6:54 PM, Paul H. Hargrove wrote: > By chance I noticed the following in the trunk: > > Index: ompi-trunk/orte/mca/rml/oob/rml_oob_component.c > === > ---

[OMPI devel] Open MPI nightly tarballs suspended / 1.5.5rc3

2012-02-28 Thread Jeffrey Squyres
There is a serious chilled water issue at IU right now; all non-essential servers (including Open MPI's nightly build server) have been turned off. So we have no new "official" 1.5.5 RC, and no new nightlies will be produced until further notice. However, to keep the 1.5.5 release train

[hwloc-devel] Nightly tarballs temporarily suspended

2012-02-28 Thread Jeffrey Squyres
The CS department at IU, our hosting provider for www.open-mpi.org, is having a serious chilled water issue right now -- all non-essential servers have been powered off to reduce heat in their machine room. This includes hwloc's nightly tarball build server. As such, until the chilled water

Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu arm6l

2012-02-28 Thread Jeffrey Squyres
ebian > "armel") will still miss out. > >> -Original Message- >> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On >> Behalf Of Jeffrey Squyres >> Sent: 28 February 2012 14:10 >> To: Leif Lindholm >> Cc: Open MPI Dev

Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu arm6l

2012-02-28 Thread Jeffrey Squyres
eep the 64-bit atomics in. > > Best Regards, > > Leif > > References: > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/Babfdddg.html > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/Babhejba.html > >> -Original Message- >>

Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu arm6l

2012-02-28 Thread Jeffrey Squyres
Ron -- Many thanks! Leif -- can you comment on this? (yes, I'm passing the buck to our ARM Open MPI representative :-) ) On Feb 26, 2012, at 1:22 PM, Ron Broberg wrote: > I would like to report the following information regarding compiling OpenMPI > on Debian ARMv6. I won't submit this as a

Re: [OMPI devel] Odd build breakage seen with 1.5.5rc2

2012-02-27 Thread Jeffrey Squyres
Sorry folks -- I hadn't noticed that several pending 1.5 CMRs hadn't been rolled in yet. I'll ping George. On Feb 26, 2012, at 5:42 PM, Paul Hargrove wrote: > > > On Sun, Feb 26, 2012 at 6:37 AM, Ralph Castain wrote: > [snip] > In the example you gave, the library you

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26039

2012-02-24 Thread Jeffrey Squyres
It is set in opal/config/opal_configure_options.m4 > > > >> -Original Message- >> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] >> On Behalf Of Jeffrey Squyres >> Sent: Friday, February 24, 2012 6:07 AM >> To: de...@open-mpi

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26039

2012-02-24 Thread Jeffrey Squyres
Rolf -- In looking at configure.m4, where does $CUDA_SUPPORT_41 get set? AS_IF([test "x$CUDA_SUPPORT_41" = "x1"] On Feb 23, 2012, at 9:13 PM, ro...@osl.iu.edu wrote: > Author: rolfv > Date: 2012-02-23 21:13:33 EST (Thu, 23 Feb 2012) > New Revision: 26039 > URL:

Re: [hwloc-devel] ship valgrind suppressions?

2012-02-23 Thread Jeffrey Squyres
Sure, let's ship it in the tarball. On Feb 23, 2012, at 1:28 PM, Brice Goglin wrote: > Hello, > > "make check" doesn't report any single memory leak under valgrind > anymore in trunk... except those from external libs such as libpci and > libxml. I created the attached suppressions file to hide

Re: [OMPI devel] v1.5 build failure w/ Solaris Studio 12.2 on Linux

2012-02-23 Thread Jeffrey Squyres
4 when using > the SS12 compilers. > - Addition of that flag leads to the reported error when compiling > ompi/mpi/cxx/file.cc (NOT in VT) > > -Paul > > On 2/23/2012 7:23 AM, Jeffrey Squyres wrote: >> Terry and I talked about this on the phone. Supporting

Re: [OMPI devel] v1.5 build failure w/ Solaris Studio 12.2 on Linux

2012-02-23 Thread Jeffrey Squyres
Terry and I talked about this on the phone. Supporting facts (some of these are repeated from Paul's prior posts): - This happens with the C++ SS 12.2 compiler on supported Linux platforms - The C++ part of the build (VT) is deep within the OMPI build; it works fine with the C compiler all the

[OMPI devel] 1.5.5rc2

2012-02-23 Thread Jeffrey Squyres
We finally have 1.5.5rc2: http://www.open-mpi.org/software/ompi/v1.5/ Given the amount of testing we've had, this rc might actually be pretty close. Lots and lots of changes since rc1; I'm not even going to bother to list them all. Please test! -- Jeff Squyres jsquy...@cisco.com For

Re: [OMPI devel] 1.5 supported systems

2012-02-23 Thread Jeffrey Squyres
-gcc-4.2 >>> /Developer/usr/llvm-gcc-4.2/bin/powerpc-apple-darwin9-llvm-gcc-4.2 >> >> Larry Baker >> US Geological Survey >> 650-329-5608 >> ba...@usgs.gov >> >> On 22 Feb 2012, at 5:55 PM, Paul H. Hargrove wrote: >> >>> Folks at Oracle

Re: [OMPI devel] 1.5 supported systems

2012-02-23 Thread Jeffrey Squyres
On Feb 23, 2012, at 6:05 AM, TERRY DONTJE wrote: > I actually think the systems tested line for Solaris should read: > - Oracle Solaris 10 and 11, 32 and 64 bit (SPARC, i386, x86_64), with > Oracle Solaris Studio 12.2 and 12.3 Done. -- Jeff Squyres jsquy...@cisco.com For corporate legal

Re: [OMPI devel] v1.5 build failure w/ Solaris Studio 12.2 on Linux

2012-02-22 Thread Jeffrey Squyres
Terry / Eugene -- Can you comment? On Feb 22, 2012, at 3:16 PM, Paul H. Hargrove wrote: > I think I have the beginning of a fix for this issue. > > I had not even noticed earlier that the error in event.h is from the C++ > compiler, when compiling file.cxx in the c++ bindings. That makes

[OMPI devel] 1.5 supported systems

2012-02-22 Thread Jeffrey Squyres
Please verify this list of supported systems for the v1.5.5 release: - The run-time systems that are currently supported are: - rsh / ssh - LoadLeveler - PBS Pro, Open PBS, Torque - Platform LSF (v7.0.2 and later) - SLURM - Cray XT-3, XT-4, and XT-5 - Oracle Grid Engine (OGE) 6.1,

Re: [OMPI devel] v1.5 r25914 DOA

2012-02-21 Thread Jeffrey Squyres
What's the output of running lstopo from hwloc 1.3.2? (this is the version that's in the OMPI trunk and v1.5 branches) http://www.open-mpi.org/software/hwloc/v1.3/ Is there any difference from v1.4 hwloc? http://www.open-mpi.org/software/hwloc/v1.4/ On Feb 21, 2012, at 7:20 PM,

Re: [OMPI devel] [EXTERNAL] Re: trunk build failure on Altix [w/WORK AROUND]

2012-02-21 Thread Jeffrey Squyres
CMR filed; custom v1.5 patch attached: https://svn.open-mpi.org/trac/ompi/ticket/3024 On Feb 20, 2012, at 4:52 PM, Jeff Squyres (jsquyres) wrote: > Yo Brian -- > > Do we need to bring this to v1.5, too? > > > On Feb 20, 2012, at 11:49 AM, Barrett, Brian W wrote: > > > Hi Paul - > > > >

Re: [OMPI devel] excessive warnings on some BSDs [w/ PATCH]

2012-02-21 Thread Jeffrey Squyres
Committed and CMR'ed. Thanks! On Feb 17, 2012, at 3:22 PM, Paul H. Hargrove wrote: > When building trunk or 1.5.x on OpenBSD-5.0 (and maybe others), I get *LOTS* > of the following: >> /usr/include/arpa/inet.h:74: warning: 'struct in_addr' declared inside >> parameter list >>

Re: [OMPI devel] Solaris/SOS build failure in trunk

2012-02-21 Thread Jeffrey Squyres
Should be fixed on the trunk in r25982. On Feb 18, 2012, at 7:39 AM, Paul Hargrove wrote: > Same has been seen on Solaris11/x86-64 w/ the Studio 12.3 compiler. > However, a gcc build on the same system was fine. > > -Paul > > On Fri, Feb 17, 2012 at 10:49 AM, Paul H. Hargrove

Re: [OMPI devel] non-portable code in examples/Makefile

2012-02-21 Thread Jeffrey Squyres
Right. The revamped fortran stuff is *coming* -- it's off in a Mercurial bitbucket right now. It's not on the trunk yet. It's here, if you care: https://bitbucket.org/jsquyres/mpi3-fortran On Feb 21, 2012, at 7:57 AM, Paul H. Hargrove wrote: > > > On 2/21/2012 2:55 AM, Jeff Squyres

Re: [OMPI devel] non-portable code in examples/Makefile

2012-02-21 Thread Jeffrey Squyres
On Feb 21, 2012, at 6:39 AM, TERRY DONTJE wrote: >> Heads up that in the upcoming fortran revamp, we *only* use FC. I.E., >> there's only mpifort wrapper compiler (mpif77 and mpif90 still exist, but >> only as sym links to mpifort, signifying that mpifort is the way of the >> future). >> >>

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn] svn:open-mpi r25966

2012-02-20 Thread Jeffrey Squyres
012, at 4:03 PM, Jeffrey Squyres wrote: > >> FWIW, I think we're still going to have another problem with "make dist" -- >> some of the Java header files are generated. I'm not sure we have that >> right in the Makefile.am. >> >> I committed

Re: [OMPI devel] Invalid format strings in ROMIO

2012-02-20 Thread Jeffrey Squyres
We've been forwarding all of these kinds of fixes upstream. On Feb 20, 2012, at 7:23 PM, Paul H. Hargrove wrote: > Both the v1.5 branch and trunk are getting lots of warnings from Clang like > the following: >> CC ad_coll_exch_new.lo >>

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn] svn:open-mpi r25966

2012-02-20 Thread Jeffrey Squyres
ay not get this done by the tarball generation tonight. On Feb 20, 2012, at 5:59 PM, Jeffrey Squyres wrote: > Added… thanks. > > > On Feb 20, 2012, at 5:41 PM, Barrett, Brian W wrote: > >> That's because Jeff forgot to copy the line: >> >> AC_CONFIG_FILES(

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn] svn:open-mpi r25966

2012-02-20 Thread Jeffrey Squyres
Added… thanks. On Feb 20, 2012, at 5:41 PM, Barrett, Brian W wrote: > That's because Jeff forgot to copy the line: > > AC_CONFIG_FILES([ompi/mca/fbtl/posix/Makefile]) > >> From whatever configure.m4 script he used as the base for his new macro :). > > Brian > > On 2/20/12 3:36 PM, "Ralph

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25966

2012-02-20 Thread Jeffrey Squyres
I'll dig… On Feb 20, 2012, at 5:36 PM, Ralph Castain wrote: > I'm afraid this commit breaks the ability to build from a tarball. I created > a tarball from the trunk and then did a configure followed by "make clean". > The make command failed to execute because it could not "make clean" in the

Re: [OMPI devel] [EXTERNAL] Re: trunk build failure on Altix [w/ WORK AROUND]

2012-02-20 Thread Jeffrey Squyres
Yo Brian -- Do we need to bring this to v1.5, too? On Feb 20, 2012, at 11:49 AM, Barrett, Brian W wrote: > Hi Paul - > > Thanks for noticing this. I guess we don't have many Altix developers. I > think I've fixed it on the trunk with r25968, plus r25967 to make sure the > Altix component

Re: [OMPI devel] non-portable test operator in configure

2012-02-20 Thread Jeffrey Squyres
>"$orte_mca_ess_alps_have_cnos" = 1], > [$1], > [$2]) > > > That is sufficient to let "dash" on an Ubuntu system make it through > configure. > I'll report back ASAP on my slowlaris10 results. > > NOTE: this is NOT prese

Re: [OMPI devel] non-portable test operator in configure

2012-02-20 Thread Jeffrey Squyres
and "-o" together with variables > that may expand to the empty string, and I suspect that is the new problem I > am hitting. I hope to know soon. > > -Paul > > > On 2/20/2012 12:41 PM, Jeffrey Squyres wrote: >> grep == configure | grep test >>
