[OMPI devel] rfc: backport of orte debugger framework to 1.5 branch

2010-12-07 Thread Nathan Hjelm
I backported the debugger framework to 1.5. The repository can be found here: http://bitbucket.org/hjelmn/ompi-1.5-fifo . The backport has been tested with launchmon 0.7.2 (using -mca debugger mpirx) and totalview 8.7.0. Please test and let me know if there are any problems. Thanks, -Nathan

[OMPI devel] RFC: use ISO C99 style struct initialization

2011-01-19 Thread Nathan Hjelm
I don't know if this has been discussed before or if this will break Windows (or some obscure platform) support but I would like to start using the ISO C99 style for struct initialization (see section 6.7.8, example 10 in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf). Using this st
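For readers unfamiliar with the proposal, here is a minimal sketch of the two initialization styles. The `struct component` type and its fields are hypothetical and are not taken from the Open MPI tree; only the `.field = value` syntax comes from C99 section 6.7.8.

```c
/* Hypothetical component descriptor, for illustration only. */
struct component {
    const char *name;
    int major;
    int minor;
    int (*open)(void);
};

static int dummy_open(void) { return 0; }

/* C90 positional style: values must follow the declaration order. */
static const struct component c90_style = { "dummy", 1, 0, dummy_open };

/* C99 designated style: fields are named, may appear in any order,
 * and any field left out (here: minor) is zero-initialized. */
static const struct component c99_style = {
    .name  = "dummy",
    .open  = dummy_open,
    .major = 1,
};
```

With gcc, compiling the second form under `-std=c90 -pedantic` is what produces the "ISO C90 forbids specifying subobject to initialize" warning quoted later in this thread.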

Re: [OMPI devel] RFC: use ISO C99 style struct initialization

2011-01-19 Thread Nathan Hjelm
d leave it to MTT to figure out where it breaks! george. On Jan 19, 2011, at 14:23 , Nathan Hjelm wrote: I don't know if this has been discussed before or if this will break Windows (or some obscure platform) support but I would like to start using the ISO C99 style for struct initialization (s

Re: [OMPI devel] RFC: use ISO C99 style struct initialization

2011-01-20 Thread Nathan Hjelm
warning: ISO C90 forbids specifying subobject to initialize george. On Jan 19, 2011, at 20:36 , Terry Dontje wrote: Hopefully we'll find out tomorrow but I think I vaguely remember an issue with the Studio compilers and this type of initialization style. --td On 01/19/2011 05:22 PM

Re: [OMPI devel] RFC: use ISO C99 style struct initialization

2011-01-20 Thread Nathan Hjelm
e Studio compilers and this type of initialization style. --td On 01/19/2011 05:22 PM, Nathan Hjelm wrote: Done. I added the module orte/mca/debugger/dummy and I will remove it tomorrow. -Nathan HPC-3, LANL On Wed, 19 Jan 2011, Jeff Squyres wrote: +1 on Ralph and George's comments. Want

Re: [OMPI devel] dummy component warnings

2011-01-24 Thread Nathan Hjelm
No, they didn't get added (adding them now). I didn't get a chance to add them over the weekend. -Nathan On Mon, 24 Jan 2011, Jeff Squyres wrote: I'm getting these: CC dummy_component.lo dummy_component.c:25: warning: ISO C90 forbids specifying subobject to initialize dummy_component.c

Re: [OMPI devel] dummy component warnings

2011-01-25 Thread Nathan Hjelm
auses configure to abort because all the assembly tests looking for the global symbols error out due to the # token. So I think we either need to find a workaround for this assembly test or whack the idea of the C99 stuff. :-( On Jan 24, 2011, at 10:29 AM, Nathan Hjelm wrote: No, they didn&

Re: [OMPI devel] dummy component warnings

2011-01-25 Thread Nathan Hjelm
0 configure:27455: gcc -std=gnu99 -O3 -DNDEBUG -finline-functions -fno-strict-aliasing conftest_c.o conftest.o -o conftest    > conftest.link 2>&1 configure:27458: $? = 0 configure:27496: result: On 1/25/2011 2:19 PM, Paul H. Hargrove wrote: I have gcc-4.0.0 on Linux built from

Re: [OMPI devel] dummy component warnings

2011-01-25 Thread Nathan Hjelm
:5:3: error: invalid preprocessing directive #_gsym_test_func $ gcc -std=c99 -c -xassembler foo.s [no output] -Paul On 1/25/2011 2:48 PM, Nathan Hjelm wrote: Ok, then there are two possible simple fixes:  - Strip -std from CCASFLAGS if Apple's gcc 4.0 is encountered, or  - Always strip

Re: [OMPI devel] dummy component warnings

2011-01-25 Thread Nathan Hjelm
c99 -c foo.s foo.s:2:3: error: invalid preprocessing directive #_gsym_test_func foo.s:5:3: error: invalid preprocessing directive #_gsym_test_func $ gcc -std=c99 -c -xassembler foo.s [no output] -Paul On 1/25/2011 2:48 PM, Nathan Hjelm wrote: Ok, then there are two possible simple fixes:

[OMPI devel] hwloc causes compilation to fail

2011-02-09 Thread Nathan Hjelm
/include/pthread.h:580: multiple definition of `__pthread_cleanup_routine' mca/paffinity/hwloc/.libs/libmca_paffinity_hwloc.a(paffinity_hwloc_component.o):/usr/include/pthread.h:580: first defined here -Nathan Hjelm HPC-3, LANL

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25234

2011-10-05 Thread Nathan Hjelm
On Wed, 5 Oct 2011, Barrett, Brian W wrote: On 10/5/11 2:22 PM, "Ralph Castain" wrote: I thought I already had a check pmi m4 somewhere? Should have been in that pmi component I committed a few months ago. I can check next week. You did :). LANL's moving some code around so that we can e

[OMPI devel] RFC: upgrade to libevent 2.0.13 (removing 2.0.7)

2011-10-19 Thread Nathan Hjelm
WHAT: upgrade to libevent 2.0.13 WHY: libevent bug fixes WHEN: Nov 2, 2011 TIMEOUT: 2 weeks *** Jeff, Ralph, and I have been using the libevent2013 component for the last month without issue. In 2 weeks I will: - remove opal/mca/event/libev

Re: [OMPI devel] [OMPI svn] svn:open-mpi r25350

2011-10-21 Thread Nathan Hjelm
Does this failure path exist in 1.5? -Nathan On Fri, 21 Oct 2011, r...@osl.iu.edu wrote: Author: rhc Date: 2011-10-21 10:44:48 EDT (Fri, 21 Oct 2011) New Revision: 25350 URL: https://svn.open-mpi.org/trac/ompi/changeset/25350 Log: Fix a minor issue seen by Jeff in specific failure pathway Te

[OMPI devel] RFC: new btl descriptor flags

2011-11-29 Thread Nathan Hjelm
We need an accurate way to detect if prepare_src/prepare_dst are being called for a get or a put operation. I propose adding two new flags to the btl descriptor (and passing them from ob1/csum/etc): #define MCA_BTL_DES_PUT 0x0010 #define MCA_BTL_DES_GET 0x0020 Comments? Suggestions? Objections
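A rough sketch of how a BTL might consume the proposed flags. The two flag values are the ones given in the RFC; the `rdma_op` enum and the `rdma_op_from_flags` helper are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Flag values as proposed in the RFC. */
#define MCA_BTL_DES_PUT 0x0010
#define MCA_BTL_DES_GET 0x0020

/* Hypothetical classification of the operation a descriptor is prepared for. */
enum rdma_op { RDMA_OP_NONE, RDMA_OP_PUT, RDMA_OP_GET };

/* Hypothetical helper: inspect the descriptor flags passed to
 * prepare_src/prepare_dst and report which one-sided operation is intended. */
static enum rdma_op rdma_op_from_flags(uint32_t flags)
{
    if (flags & MCA_BTL_DES_PUT) return RDMA_OP_PUT;
    if (flags & MCA_BTL_DES_GET) return RDMA_OP_GET;
    return RDMA_OP_NONE;
}
```

Because each flag occupies its own bit, it can be OR-ed with whatever other descriptor flags are already in use without disturbing them.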

Re: [OMPI devel] RFC: new btl descriptor flags

2011-11-29 Thread Nathan Hjelm
On Tue, 29 Nov 2011, George Bosilca wrote: These two functions target at defining a memory layout (contiguous or not) that can be target for a one-sided communication. I don't see why there is a need to know what type of communication that will be … What is so different in the xpmem that re

[OMPI devel] RFC: Fix for 2157 (mpool/rdma change)

2011-11-30 Thread Nathan Hjelm
Attached is a fix for ticket 2157. Changes: - Rename the mru_list to lru_list. lru_list makes more sense as it is a list of the Least Recently Used cached registrations. - If a memory registration fails because we are out of resources, deregister the least recently used cached registration and t
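The evict-and-retry behaviour described in the second change can be sketched roughly as follows. This is a toy model under stated assumptions, not the mpool/rdma code: `struct reg_cache`, `backend_register`, `evict_lru`, and `register_with_retry` are all hypothetical names, and the "resource" is just a counter.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_MAX 8

/* Toy registration cache: lru[0] is the least recently used entry. */
struct reg_cache {
    int    lru[CACHE_MAX]; /* ids of idle, cached registrations */
    size_t cached;         /* number of entries in lru[] */
    size_t in_use;         /* live registrations, cached or active */
    size_t limit;          /* resource limit imposed by the backend */
};

/* Stand-in for the real registration call; fails when out of resources. */
static bool backend_register(struct reg_cache *c)
{
    if (c->in_use >= c->limit) return false;
    c->in_use++;
    return true;
}

/* Deregister the least recently used cached registration, if there is one. */
static bool evict_lru(struct reg_cache *c)
{
    if (c->cached == 0) return false;
    for (size_t i = 1; i < c->cached; i++) c->lru[i - 1] = c->lru[i];
    c->cached--;
    c->in_use--;
    return true;
}

/* On failure, evict from the LRU and try again; give up only when
 * there is nothing left to evict. */
static bool register_with_retry(struct reg_cache *c)
{
    while (!backend_register(c)) {
        if (!evict_lru(c)) return false;
    }
    return true;
}
```

The `return false` path, where the LRU is empty and registration still fails, corresponds to the starvation case discussed in the later r26106 thread.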

[OMPI devel] Totalview broken with 1.5/trunk

2011-12-14 Thread Nathan Hjelm
There still seems to be an issue with using mpirun --debug with totalview. For some reason totalview is not breaking on MPIR_Breakpoint. Removing the foo = MPIR_Breakpoint line from orterun.c fixes this issue. Is there any reason I shouldn't remove that line? Any other debuggers that might bre

Re: [OMPI devel] Totalview broken with 1.5/trunk

2011-12-15 Thread Nathan Hjelm
On Wed, 14 Dec 2011, Ralph Castain wrote: Yes - we were having problems making symbols in orterun visible for the "stat" debugger when built dynamically. The symbols are actually instantiated in the debugger base, but they need to be "seen" in orterun prior to us calling orte_init. So, we ha

Re: [OMPI devel] Totalview broken with 1.5/trunk

2011-12-15 Thread Nathan Hjelm
Your changes don't break anything but they also don't cause MPIR_Breakpoint to appear in orterun: ct-login1:/scratch2/hjelmn hjelmn$ nm `type -p orterun` | grep MPIR 0060b0e0 B MPIR_attach_fifo 0060b2e0 B MPIR_being_debugged 0060b7b0 B MPIR_debug_state 0060ada0 B M

Re: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)

2011-12-15 Thread Nathan Hjelm
That appears to be a similar problem to the MPIR_Breakpoint bug. Let me play around and see if I can find a fix. -Nathan Hjelm HPC-3, LANL On Thu, 15 Dec 2011, Ashley Pittman wrote: There is a problem with 1.5.5rc1 that prevents padb from loading the process table start from the orterun

Re: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)

2011-12-15 Thread Nathan Hjelm
orte/tools/orterun/debuggers.c does not exist anymore (it's not in the 1.5.5rc1 tarball). I don't know why the symbols are showing up in section B of orterun. Investigating now. -Nathan Hjelm HPC-3, LANL On Thu, 15 Dec 2011, George Bosilca wrote: On Dec 15, 2011, at 16:55 , Ashley Pi

Re: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)

2011-12-15 Thread Nathan Hjelm
What's odd is that totalview, STAT, and GDB see the correct values despite them being in the B section. What does padb do differently? This is a dynamic, optimized build of 1.5.5rc1. -Nathan Hjelm HPC-3, LANL On Thu, 15 Dec 2011, Ashley Pittman wrote: If I add a new symbol to orte/mca/debugger

Re: [OMPI devel] MPIR attach from padb broken (1.5.5rc1)

2011-12-15 Thread Nathan Hjelm
using gdb alone using just the trace I sent when I started this thread. Perhaps the difference is in versions of gdb, I could give you a login to my test machine if you need? Ashley. On 15 Dec 2011, at 22:49, Nathan Hjelm wrote: What's odd is that totalview, STAT, and GDB see the correct values

[OMPI devel] RFC: Allocate free list payload if free list isn't specified

2012-02-21 Thread Nathan Hjelm
case. Thoughts? Patch is attached. -Nathan Hjelm HPC-3, LANL diff --git a/ompi/class/ompi_free_list.c b/ompi/class/ompi_free_list.c index d468a70..e3c0988 100644 --- a/ompi/class/ompi_free_list.c +++ b/ompi/class/ompi_free_list.c @@ -1,4 +1,4 @@ -/* -*- Mode: C; c-basic-offset:4 ; -*- */ +/* -*- Mo

Re: [OMPI devel] RFC: Allocate free list payload if free list isn't specified

2012-02-21 Thread Nathan Hjelm
Oops, screwed up the title. Should be: RFC: Allocate requested free list payload even if an mpool isn't specified. -Nathan On Tue, 21 Feb 2012, Nathan Hjelm wrote: What: Allocate the free list payload when a payload size is specified, even if no mpool is specified. When: Thursday, F

Re: [OMPI devel] RFC: Allocate free list payload if free list isn't specified

2012-02-21 Thread Nathan Hjelm
On Tue, 21 Feb 2012, Rolf vandeVaart wrote: I think I am OK with this. Alternatively, you could have done something like is done in the TCP BTL where the payload and header are added together for the frag size? To state more clearly, I was trying to say you could do something similar to what

[OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Nathan Hjelm
Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below). If this fix addresses some hangs we are seeing on InfiniBand, LANL might want a 1.4.6 rolled (or a faster rollout for 1.6.0). -Nathan -- Forwarded message -- List-Post: devel@lists.open-mpi.org Date: Thu

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Nathan Hjelm
On Thu, 1 Mar 2012, Jeffrey Squyres wrote: ...or in 1.5.5. Well, we want a "stable" release to deploy on the affected cluster. How soon will you be able to tell if it fixes some hangs? I will know in a couple of hours. Tested the fix in 1.4.5 and it appears to eliminate my IMB hang! I stil

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)

2012-03-01 Thread Nathan Hjelm
limit doesn't eliminate the hang. I will continue to investigate next week. -Nathan On Thu, 1 Mar 2012, Jeffrey Squyres wrote: ...or in 1.5.5. How soon will you be able to tell if it fixes some hangs? On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote: Found a pretty nasty frag leak (and a min

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Nathan Hjelm
Not exactly, the PML invokes the mpool which invokes the registration function. If registration fails the mpool will deregister from its lru (if possible) and try again. So, it is not an error if ibv_reg_mr fails unless it fails because the process is starved of registered memory (or truly run

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Nathan Hjelm
On Fri, 9 Mar 2012, Jeffrey Squyres wrote: On Mar 9, 2012, at 1:14 PM, George Bosilca wrote: The hang occurs because there is nothing on the lru to deregister and ibv_reg_mr (or GNI_MemRegister in the uGNI case) fails. The PML then puts the request on its rdma pending list and continues. I

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Nathan Hjelm
On Fri, 9 Mar 2012, Jeffrey Squyres wrote: On Mar 9, 2012, at 1:32 PM, Nathan Hjelm wrote: An mpool that is aware of local processes lru's will solve the problem in most cases (all that I have seen) I agree -- don't let words in my emails make you think otherwise. I think thi

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Nathan Hjelm
On Fri, 9 Mar 2012, George Bosilca wrote: On Mar 9, 2012, at 14:23 , Nathan Hjelm wrote: BTW, can anyone tell me why each mpool defines mca_mpool_base_resources_t instead of defining mca_mpool_blah_resources_t. The current design makes it impossible to support more than one mpool in a

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Nathan Hjelm
I tested my grdma mpool with the openib btl and IMB Alltoall/Alltoallv on a system that consistently hangs. If I give the connection module the ability to evict from the lru grdma prevents both the out of registered memory hang AND problems creating QPs (due to exhaustion of registered memory).

[OMPI devel] RFC: ob1: fallback on put/send on rget failure

2012-03-15 Thread Nathan Hjelm
. Please take a look at the attached patch. Feedback and constructive criticism is needed! -Nathan Hjelm HPC-3, LANL ompi_trunk_ob1_get_fallback.patch.gz Description: GNU Zip compressed data

[OMPI devel] RFC: change default for tuned alltoallv to pairwise

2012-03-21 Thread Nathan Hjelm
What: Change coll tuned default to pairwise exchange Why: The linear algorithm does not scale to any reasonable number of PEs When: Timeout in 2 days (Fri) Is there any reason the default should not be changed? -Nathan HPC-3, LANL
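To illustrate why pairwise exchange scales better than the linear algorithm, here is a sketch of one common pairwise schedule. The exact peer formula is an assumption for illustration (the RFC does not spell it out), and `pairwise_peers` is a hypothetical helper, not a function from coll/tuned.

```c
/* Peers for one step of a pairwise exchange among `size` ranks.
 * Assumed schedule: at step s, a rank sends to (rank + s) and
 * receives from (rank - s), both modulo size. Each rank therefore
 * has exactly one send and one receive in flight per step, instead
 * of posting all `size` transfers at once as a linear algorithm does. */
static void pairwise_peers(int rank, int size, int step,
                           int *sendto, int *recvfrom)
{
    *sendto   = (rank + step) % size;
    *recvfrom = (rank + size - step) % size;
}
```

Looping `step` from 1 to `size - 1` (plus a local copy for the rank's own block) covers every peer exactly once.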

Re: [OMPI devel] RFC: change default for tuned alltoallv to pairwise

2012-03-22 Thread Nathan Hjelm
On Thu, 22 Mar 2012, Shamis, Pavel wrote: What: Change coll tuned default to pairwise exchange Why: The linear algorithm does not scale to any reasonable number of PEs When: Timeout in 2 days (Fri) Is there any reason the default should not be changed? Nathan, I can see why people thin

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26329

2012-04-24 Thread Nathan Hjelm
This was RFC'd last month. No one objected :) -Nathan On Tue, 24 Apr 2012, Jeffrey Squyres wrote: There's some pretty extensive ob1 changes in here. Can we get these reviewed? Brian / George? On Apr 24, 2012, at 4:18 PM, hje...@osl.iu.edu wrote: Author: hjelmn Date: 2012-04-24 16:18:56 E

[OMPI devel] libevent socket code

2012-04-25 Thread Nathan Hjelm
Does anyone object if I #if 0 out all the socket code in libevent? We see lots of static compilation warnings because of that code and nothing in Open MPI uses it. -Nathan

Re: [OMPI devel] libevent socket code

2012-04-25 Thread Nathan Hjelm
Let me take a look. The code in question is in evutil.c and bufferevent_sock.c . If there is no option we might be able to get away with just removing these files from the Makefile.am. -Nathan On Wed, 25 Apr 2012, Jeff Squyres wrote: On Apr 25, 2012, at 12:50 PM, Ralph Castain wrote: Can't

Re: [OMPI devel] Potential ob1 bug

2012-05-07 Thread Nathan Hjelm
George, thanks for taking a look. Your patch looks good and I can confirm it fixes the hang I am seeing on our XE6. -Nathan On Thu, 3 May 2012, George Bosilca wrote: Nathan, You're right, when we loop trying to restart a failed request we must reset the convertor. However: 1. the position i

[OMPI devel] RFC: hide btl segment keys within btl

2012-06-13 Thread Nathan Hjelm
components: - ob1 - csum - bfo - ugni (now works with MPI one-sides) - sm - vader - openib (in progress) Brian and Rolf, please take a look at your components and let me know if I screwed anything up. -Nathan Hjelm HPC-3, LANL

Re: [OMPI devel] Barrier/coll_tuned/pml_ob1 segfault for derived data types

2012-06-15 Thread Nathan Hjelm
Seems like either a bug in the converter code or in setting up the send request. r26597 ensures correctness in the case the btl's sendi does all three of the following: returns an error, changes the converter, and returns a descriptor. Until we can find the root cause I pushed a change that pro

Re: [OMPI devel] RFC: hide btl segment keys within btl

2012-06-18 Thread Nathan Hjelm
ents > >compared with the available space. And add a huge comment in the btl.h > >about the fact that mca_btl_base_segment_t should be used with extreme > >care. > > > > george. > > > >On Jun 14, 2012, at 18:42 , Jeff Squyres wrote: > > > >> This sounds like

Re: [OMPI devel] RFC: add asynchronous copies for large GPU buffers

2012-06-27 Thread Nathan Hjelm
Can you make your repository public or add me to the access list? -Nathan On Wed, Jun 27, 2012 at 03:12:34PM -0700, Rolf vandeVaart wrote: > WHAT: Add support for doing asynchronous copies of GPU memory with larger > messages. > WHY: Improve performance for sending/receiving of larger GPU messag

[OMPI devel] Trunk compilation broken

2012-07-02 Thread Nathan Hjelm
With platform contrib/platform/lanl/tlss/debug-panasus I get an error: make[2]: Entering directory `/panfs/scratch/vol7/hjelmn/turing/ompi-trunk-git/ompi/tools/ompi_info' CCLD ompi_info ../../../ompi/.libs/libmpi.so: undefined reference to `NBC_Operation' Brian, can you take a look? -Nathan

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread Nathan Hjelm
Nice! Are we moving this to 1.7 as well? -Nathan On Mon, Jul 02, 2012 at 11:20:12AM -0400, svn-commit-mai...@open-mpi.org wrote: > Author: pasha (Pavel Shamis) > Date: 2012-07-02 11:20:12 EDT (Mon, 02 Jul 2012) > New Revision: 26707 > URL: https://svn.open-mpi.org/trac/ompi/changeset/26707 > > L

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread Nathan Hjelm
plication Performance Tools Group > Computer Science and Math Division > Oak Ridge National Laboratory > > > > > > > On Jul 2, 2012, at 11:20 AM, Nathan Hjelm wrote: > > > Nice! Are we moving this to 1.7 as well? > > > > -Nathan > >

[OMPI devel] Opal wrappers question

2012-07-09 Thread Nathan Hjelm
I am playing with the Open MPI wrappers and had a couple of questions: - If an argument is used to select a section (compiler_args) is that option supposed to be passed to the compiler (because it is)? - If the above is the intended behavior is there any objection to an additional wrapper optio

[OMPI devel] RFC: enable the use of source in platform files

2012-07-09 Thread Nathan Hjelm
When: Thurs, Jul 12, 5 PM MDT Why: Useful feature. Enabling source may cut down on the maintenance required to keep platform files up to date. How: Change directories to the platform file's directory before sourcing it (not after). diff --git a/config/ompi_load_platform.m4 b/config/ompi_load_p

Re: [OMPI devel] RFC: enable the use of source in platform files

2012-07-09 Thread Nathan Hjelm
09, 2012 at 02:21:28PM -0700, Ralph Castain wrote: > I'm confused - how does this help maintain a platform file??? > > > On Jul 9, 2012, at 2:09 PM, Nathan Hjelm wrote: > > > When: Thurs, Jul 12, 5 PM MDT > > > > Why: Useful feature. Enabling source may cut d

Re: [OMPI devel] RFC: enable the use of source in platform files

2012-07-09 Thread Nathan Hjelm
On Mon, Jul 09, 2012 at 03:31:33PM -0700, Ralph Castain wrote: > So if I understand this right, you would have multiple platform files, each > "sourcing" a common one that contains the base directives? It sounds to me > like you need more than the change below to make that work - you would need

[OMPI devel] Summary of the problem with r26626

2012-07-12 Thread Nathan Hjelm
After some digging Terry and I discovered the problem with r26626. To perform an rdma transaction pmls used to explicitly promote the seg_addr from prepare_src/dst to 64-bits before sending it over the wire. The other end would then (inconsistently) use the lval to perform the get/put. Segments

Re: [OMPI devel] RFC: enable the use of source in platform files

2012-07-12 Thread Nathan Hjelm
. > > > > Am I missing something here? It doesn't sound like you've really even tried > > this yet - sure, chaining "source" commands will work, but do you actually > > get the desired configuration?? > > > > Hence my comment about needing to mod

Re: [OMPI devel] RFC: OMPI git mirror on github.com

2012-08-20 Thread Nathan Hjelm
On Sat, Aug 18, 2012 at 07:16:00AM -0700, Ralph Castain wrote: > Yeah, even if someone volunteered to do the conversion work, we wouldn't get > agreement on making such a change. Some of us hate git (myself included), > some feel similarly about mercurial, etc. > > Unfortunately, we've seen enou

[OMPI devel] RFC: Remove deprecated functions from mca_base_param

2012-10-11 Thread Nathan Hjelm
agree that is long enough to warn developers. This will be CMRd to 1.7. -Nathan Hjelm HPC-3, LANL

Re: [OMPI devel] RFC: Remove deprecated functions from mca_base_param (w patch)

2012-10-11 Thread Nathan Hjelm
Patch attached this time. -Nathan Hjelm HPC-3, LANL On Thu, Oct 11, 2012 at 10:59:38AM -0600, Nathan Hjelm wrote: > What: Remove deprecated functions. This includes removing ocl_mca_type_name > and ocl_mca_component_name from opal_cmd_line_init_t and removing the > following funct

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27451 - in trunk: ompi/mca/allocator/bucket ompi/mca/bcol/basesmuma ompi/mca/bml/base ompi/mca/btl ompi/mca/btl/base ompi/mca/btl/openib ompi/mca/btl/sm ompi/

2012-10-30 Thread Nathan Hjelm
; ompi/mca/btl ompi/mca/btl/base ompi/mca/btl/openib ompi/mca/btl/sm > >> ompi/mca/btl/smcuda ompi/mca/btl/template ompi/mca/btl/va... > >> > >> Hmmm...this didn't just remove deprecated functions. It actually changed > >> the way the cmd line

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r27526 - trunk/orte/mca/plm/rsh

2012-10-30 Thread Nathan Hjelm
On Tue, Oct 30, 2012 at 01:16:17PM -0700, Ralph Castain wrote: > Actually, now that I look at it, I'm not sure what Jeff is talking about here > is correct. I think Nathan's patch is in fact right. > > Nathan's change doesn't in any way impact what gets passed to remote procs. > All it does is m

[OMPI devel] RFC: fix frameworks usage of opal_output

2012-11-01 Thread Nathan Hjelm
t timeout: tomorrow (Nov 2), 12:00pm MDT. Questions? Comments? -Nathan Hjelm HPC-3, LANL diff --git a/ompi/mca/btl/base/btl_base_close.c b/ompi/mca/btl/base/btl_base_close.c index b3632ab..21ffb90 100644 --- a/ompi/mca/btl/base/btl_base_close.c +++ b/ompi/mca/btl/base/btl_base_close.c @@ -58,11 +5

Re: [OMPI devel] RFC: fix frameworks usage of opal_output

2012-11-01 Thread Nathan Hjelm
f mca_topo_base_components_opened_valid is false it isn't safe to call mca_base_components_close. It is a little confusing and I don't know why the author of topo decided to do it that way. -Nathan Hjelm HPC-3, LANL

Re: [OMPI devel] RFC: fix frameworks usage of opal_output

2012-11-01 Thread Nathan Hjelm
On Thu, Nov 01, 2012 at 06:50:30PM -0400, George Bosilca wrote: > > On Nov 1, 2012, at 16:18 , Nathan Hjelm wrote: > > > On Thu, Nov 01, 2012 at 04:07:32PM -0400, George Bosilca wrote: > >> Nathan, > >> > >> Here is a quick question regar

[OMPI devel] RFC: fix frameworks usage of opal_output (updated)

2012-11-05 Thread Nathan Hjelm
On Thu, Nov 01, 2012 at 07:22:42PM -0400, George Bosilca wrote: > > On Nov 1, 2012, at 19:07 , Nathan Hjelm wrote: > > > I was going to address this second inconsistency with another patch but now > > seems like a good time to see if anyone has an opinion about ho

[OMPI devel] RFC: fix various leaks in trunk (touches coll/ml, vprotocol, pml/v, btl/openib, and mca/base)

2012-11-05 Thread Nathan Hjelm
Why: Always a good idea to clean up all allocated memory. With this patch and some others I have in the pipeline valgrind no longer reports any "possibly leaked" or "definitely leaked" blocks in ompi_info. -Nathan Hjelm HPC-3, LANL Index: ompi/mca/

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27574 - trunk/orte/mca/rmaps/rank_file

2012-11-07 Thread Nathan Hjelm
Hmm, not sure why I didn't see an error when I tested the change. It looks like in this case yyterminate should have been defined as orte_rmaps_rank_file_lex_destroy(). Looked a little deeper and it looks like the default action for yyterminate is to call the *lex_destroy function so we don't n

Re: [OMPI devel] [OMPI svn] svn:open-mpi r27574 - trunk/orte/mca/rmaps/rank_file

2012-11-07 Thread Nathan Hjelm
Ok, looks like the default yyterminate does not clean up the lex state. The definition in rmaps/rankfile/rmaps_rank_file_lex.l should be #define yyterminate() return orte_rmaps_rank_file_lex_destroy() I can fix it if you want. -Nathan On Wed, Nov 07, 2012 at 08:34:59AM -0700, Nathan Hjelm

[OMPI devel] RFC: make mca_base_param_deregister actually work

2012-11-19 Thread Nathan Hjelm
MPI_T standard. Please review the patch for correctness, bugs, etc. -Nathan Hjelm HPC-3, LANL Index: opal/mca/base/mca_base_param.c === --- opal/mca/base/mca_base_param.c (revision 27624) +++ opal/mca/base/mca_base_param.c

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Nathan Hjelm
Something is wrong with the wrappers. A number of libraries (-lxpmem, -lugni, etc) are missing from libs_static. Might be a similar issue with the missing -llustreapi. Going to create a critical bug to track this issue. Works in 1.7 :-/ ... If you add -lnuma to libs_static in mpicc-wrapper-data.t

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-25 Thread Nathan Hjelm
i_gnu_mp -lstdc++ -lgfortran -lmpich_gnu_47 -lmpl -lrt -lsma -lxpmem > > -ldmapp -lugni -lpmi -lalpslli -lalpsutil -lalps -ludreg -lpthread -lm > > --end-group -lgomp -lpthread --start-group -lgcc -lgcc_eh -lc --end-group > > /opt/gcc/4.7.2/snos/lib/gcc/x86_64-suse-linux/4.7.2/crtend.o &

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-28 Thread Nathan Hjelm
system. > > > > Getting OMPI running on our XC30 is of exactly ZERO importance beyond my > > own edification. > > So, I am likely to stop fighting this battle soon. > > > > -Paul > > > > > > On Fri, Jan 25, 2013 at 3:21 PM, Nathan Hjelm wrot

Re: [OMPI devel] Open MPI on Cray XC30 - suspicous configury

2013-01-29 Thread Nathan Hjelm
Oops, that was my mistake. I wrote a fix for the CLE5 and --with-alps= code but I never pushed it. r27962 should fix the issue. -Nathan On Mon, Jan 28, 2013 at 09:05:32PM -0800, Ralph Castain wrote: > Thanks Paul - appreciate the help! I chatted with Nathan this evening and now > have a much be

[OMPI devel] RFC: opal_list iteration macros

2013-01-29 Thread Nathan Hjelm
row (Wed 01/29/13) around 12:00 PM MST. Thoughts? Comments? -Nathan Hjelm HPC-3, LANL

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-30 Thread Nathan Hjelm
Try configuring with --with-platform=contrib/platform/lanl/cray_xe6/optimized-nopanasas . That might work. If it doesn't, try the optimized-lustre file from the same directory in trunk. -Nathan On Wed, Jan 30, 2013 at 04:28:14PM +0100, Jure Pečar wrote: > On Mon, 28 Jan 2013 12:28:34 -0800 > P

Re: [OMPI devel] RFC: opal_list iteration macros

2013-01-30 Thread Nathan Hjelm
mple addition but I wanted to give a heads up on the > > devel list because these macros are different from what we usually provide > > (though they should look familiar to those familiar with the Linux kernel). > > I intend to commit these macros to the trunk (and CMR for 1.7.1) tomorrow > > (Wed 01/29/13) around 12:00 PM MST. > > > > Thoughts? Comments? > > > > -Nathan Hjelm > > HPC-3, LANL
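For context, a Linux-kernel-style iteration macro over an intrusive list looks roughly like the sketch below. Everything here (`struct list`, `struct list_item`, `LIST_FOREACH`, `struct int_node`) is a hypothetical stand-in written for illustration; it is not the opal_list API and makes no claim about the macro names or signatures in the RFC.

```c
#include <stddef.h>

/* Minimal intrusive, circular, doubly linked list with a sentinel head. */
struct list_item { struct list_item *next, *prev; };
struct list      { struct list_item head; };

static void list_init(struct list *l)
{
    l->head.next = l->head.prev = &l->head;
}

static void list_append(struct list *l, struct list_item *it)
{
    it->prev = l->head.prev;
    it->next = &l->head;
    l->head.prev->next = it;
    l->head.prev = it;
}

/* Iterate over every element; `type` must embed struct list_item
 * as its first member so the pointer casts are valid. */
#define LIST_FOREACH(item, list, type)                          \
    for (item = (type *)(list)->head.next;                      \
         item != (type *)&(list)->head;                         \
         item = (type *)((struct list_item *)(item))->next)

/* Example element type and a loop written with the macro. */
struct int_node { struct list_item super; int value; };

static int sum_list(struct list *l)
{
    struct int_node *n;
    int total = 0;
    LIST_FOREACH(n, l, struct int_node) {
        total += n->value;
    }
    return total;
}
```

The appeal is that the loop body reads like an ordinary `for` statement, with the first/end/next bookkeeping hidden in one place.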

Re: [OMPI devel] Open MPI (not quite) on Cray XC30

2013-01-30 Thread Nathan Hjelm
Use optimized-lustre. optimized-common is not intended to be used on its own. -Nathan On Wed, Jan 30, 2013 at 06:31:09PM +0100, Jure Pečar wrote: > On Wed, 30 Jan 2013 08:48:02 -0700 > Nathan Hjelm wrote: > > > Try configuring with > > --with-platform=contrib/p

[OMPI devel] RFC: shiny new variable subsystem

2013-01-31 Thread Nathan Hjelm
ave me a lot of time later. -Nathan Hjelm HPC-3, LANL Index: opal/mca/base/Makefile.am === --- opal/mca/base/Makefile.am (revision 28004) +++ opal/mca/base/Makefile.am (working copy) @@ -25,13 +25,15 @@ noinst_LTLIBRARIES =

[OMPI devel] MCA variable system slides and notes

2013-02-05 Thread Nathan Hjelm
Notes: Variable system currently takes ownership of string values. This is done so strings can be freed when overwritten (by mca_base_var_set_value) or when the variable is deregistered. This requires that initial string values be allocated on the heap (not .DATA, stack, etc.). Brian raised a goo
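The ownership rule in the note can be illustrated with a small sketch. The names `struct str_var`, `dup_string`, `str_var_set`, and `str_var_release` are hypothetical and only model the behaviour described above; they are not the mca_base_var interface.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string variable whose storage is owned by the "system". */
struct str_var { char *value; };

/* Portable heap copy of a string (returns NULL on allocation failure). */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL) memcpy(copy, s, len);
    return copy;
}

/* Overwrite the value. The old value is freed, which is only safe
 * because every value, including the initial one, lives on the heap. */
static int str_var_set(struct str_var *v, const char *new_value)
{
    char *copy = dup_string(new_value);
    if (copy == NULL) return -1;
    free(v->value);
    v->value = copy;
    return 0;
}

/* Deregistration path: release the owned string. */
static void str_var_release(struct str_var *v)
{
    free(v->value);
    v->value = NULL;
}
```

Initializing `value` with a string literal or a stack buffer would make the `free` in `str_var_set` undefined behaviour, which is why the note requires heap-allocated initial values.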

[OMPI devel] RFC: update lustre check to use only -llustreapi

2013-02-05 Thread Nathan Hjelm
What: Update OMPI_CHECK_LUSTRE to link against only -llustreapi. From what I can tell we do not need to link against -llustre. Also remove --with-lustre-libs option. Why: Some platforms do not have liblustre and we shouldn't need to link against liblustre anyway. When: I need some feedback fro

Re: [OMPI devel] 1.6.4rc5: final rc

2013-02-20 Thread Nathan Hjelm
On Wed, Feb 20, 2013 at 10:28:56AM -0800, Eugene Loh wrote: > On 02/20/13 07:54, Jeff Squyres (jsquyres) wrote: > >All MTT testing looks good for 1.6.4. There seems to be an MPI dynamics > >problem when --enable-spare-groups is used, but this does not look like a > >regression to me. > > > >I pu

Re: [OMPI devel] openib fragment alignment

2013-02-20 Thread Nathan Hjelm
I talked to Pasha about the change. He suggests fragments are 2-byte aligned to save space. I suspect that on 64-bit platforms the fragment size is already a multiple of 8 bytes so this change will likely only affect 32-bit systems (which is where the bus error is occurring). -Nathan On Wed, F

[OMPI devel] RFC: Remove pml/csum

2013-02-27 Thread Nathan Hjelm
I am finishing the update to move from mca_base_param_* to mca_base_var_* and the less I have to do the better. -Nathan Hjelm HPC-3, LANL

[OMPI devel] RFC: MCA system revamp phase 1

2013-03-20 Thread Nathan Hjelm
ation. I will use git svn dcommit (the master repository is git-svn) to push the 8 commits found on the svn-commit branch. I can break it into more commits if there are any objections. I will then remove the .gitignore file (and any other files not relevant to svn). Questions? Comments? Hate mail? -Na

[OMPI devel] SVN quiet time starting now

2013-03-27 Thread Nathan Hjelm
Will send out an email when I am done. -Nathan

[OMPI devel] MCA revamp phase 1 in place. SVN quiet time over now

2013-03-27 Thread Nathan Hjelm
suffix for integers. Ex. 1k = 1024. - New framework system. All frameworks have been updated to use the new system. - Cleaner ompi_info implementation. - Etc. See individual commit messages for more details. MPI_T_* support will follow in a week or so. -Nathan Hjelm HPC-3, LANL
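A sketch of how an integer with a binary suffix might be parsed. Only the `1k = 1024` case comes from the announcement; the `m` and `g` suffixes, the case-insensitivity, and the `parse_scaled` function itself are assumptions made for illustration.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical parser: "1k" -> 1024. Returns 0 on success, -1 on
 * malformed input. Overflow checking is omitted to keep the sketch short. */
static int parse_scaled(const char *s, unsigned long long *out)
{
    char *end;
    unsigned long long v = strtoull(s, &end, 10);

    if (end == s) return -1;            /* no digits at all */

    switch (tolower((unsigned char)*end)) {
    case '\0':                    break; /* plain integer */
    case 'k': v <<= 10; end++;    break; /* from the announcement */
    case 'm': v <<= 20; end++;    break; /* assumed */
    case 'g': v <<= 30; end++;    break; /* assumed */
    default:  return -1;                 /* unknown suffix */
    }

    if (*end != '\0') return -1;         /* trailing garbage */
    *out = v;
    return 0;
}
```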

[OMPI devel] RFC: initial MPI_T support

2013-04-05 Thread Nathan Hjelm
What: Add initial support for the MPI 3.0 tools interface (MPI_T). Initial support includes full support for the MPI_T_cvar and MPI_T_category interfaces. No pvars are available at this time. Support for pvars will be added at a later time. Why: To be MPI 3.0 compliant the MPI_T interface must b

Re: [OMPI devel] RFC: initial MPI_T support

2013-04-05 Thread Nathan Hjelm
rrata what is in the current implementation should be fine for now. -Nathan On Fri, Apr 05, 2013 at 12:52:12PM -0600, Nathan Hjelm wrote: > What: Add initial support for the MPI 3.0 tools interface (MPI_T). Initial > support includes full support for the MPI_T_cvar and MPI_T_category >

Re: [OMPI devel] RFC: initial MPI_T support

2013-04-05 Thread Nathan Hjelm
.did you really mean a timeout of Mon?? Could you delay that a bit so > we can actually have time to look at it? > > > On Apr 5, 2013, at 11:56 AM, Nathan Hjelm wrote: > > > Also, please look at the thread level support. We had some discussion at > > the forum

Re: [OMPI devel] RFC: initial MPI_T support

2013-04-05 Thread Nathan Hjelm
y of this code? > > > On Apr 5, 2013, at 12:44 PM, Ralph Castain wrote: > > > Let's bump it to Thurs - I trust what you say, but haven't had a chance to > > glance at the changes. Likewise, it would be nice to let Jeff return from > > vacation and look a

Re: [OMPI devel] RFC: initial MPI_T support

2013-04-05 Thread Nathan Hjelm
his > > checking for pci/pci.h... no > configure: WARNING: Specified --enable-pci switch, but could not > configure: WARNING: find appropriate support > configure: error: Cannot continue > > > I assure you I didn't specify that switch - it looks old to me. > > >

Re: [OMPI devel] RFC: initial MPI_T support

2013-04-05 Thread Nathan Hjelm
in wrote: > > > This is what I cloned: > > > > git clone git://github.com/hjelmn/ompi-mca-var.git mpit-commit > > > > > > On Apr 5, 2013, at 1:08 PM, Nathan Hjelm wrote: > > > >> Should be. I rebased from the trunk this morning. Are you sure

Re: [OMPI devel] [EXTERNAL] Developer meeting: mid/late summer?

2013-04-29 Thread Nathan Hjelm
I will likely be at the forum so count me in. -Nathan On Fri, Apr 26, 2013 at 10:21:29PM +, Jeff Squyres (jsquyres) wrote: > Ok, we can probably do this. > > Is anyone else interested? > > > On Apr 24, 2013, at 1:25 PM, "Barrett, Brian W" wrote: > > > I could probably do Monday afternoon

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28435 - in trunk: . conf db db/revprops db/revprops/0 db/revs db/revs/0 db/transactions db/txn-protorevs hooks locks

2013-05-01 Thread Nathan Hjelm
*&&*$# . Can someone undo this. -Nathan On Wed, May 01, 2013 at 12:01:48PM -0400, svn-commit-mai...@open-mpi.org wrote: > Author: hjelmn (Nathan Hjelm) > Date: 2013-05-01 12:01:48 EDT (Wed, 01 May 2013) > New Revision: 28435 > URL: https://svn.open-mpi.org/trac/ompi/chan

Re: [OMPI devel] [OMPI svn] svn:open-mpi r28435 - in trunk: . conf db db/revprops db/revprops/0 db/revs db/revs/0 db/transactions db/txn-protorevs hooks locks

2013-05-01 Thread Nathan Hjelm
Never mind. Figured it out. -Nathan On Wed, May 01, 2013 at 10:06:08AM -0600, Nathan Hjelm wrote: > *&&*$# . Can someone undo this. > > -Nathan > > On Wed, May 01, 2013 at 12:01:48PM -0400, svn-commit-mai...@open-mpi.org > wrote: > > Author: hjelmn (Nathan Hjel

Re: [OMPI devel] Build warnings in trunk

2013-05-14 Thread Nathan Hjelm
On Tue, May 14, 2013 at 02:30:17PM -0700, Rolf vandeVaart wrote: > I have noticed several warnings while building the trunk. Feel free to fix > anything that you are familiar with. > > CC mca_base_param.lo > ../../../../opal/mca/base/mca_base_param.c: In function 'register_param': > ../

[OMPI devel] RFC: Remove old MCA parameter system from trunk

2013-05-15 Thread Nathan Hjelm
/1.8 release series. When: This RFC is a heads up. I will remove the old API on Monday, May 20, 2013. -Nathan Hjelm HPC-3, LANL

Re: [OMPI devel] RFC: Add static initializer for opal_mutex_t

2013-06-10 Thread Nathan Hjelm
On Sat, Jun 08, 2013 at 12:28:02PM +0200, George Bosilca wrote: > All Windows objects that are managed as HANDLES can easily be modified to > have static initializer. A clean solution is attached to the question at > stackoverflow: > http://stackoverflow.com/questions/3555859/is-it-possible-to-do

Re: [OMPI devel] RFC: Add static initializer for opal_mutex_t

2013-06-11 Thread Nathan Hjelm
On Mon, Jun 10, 2013 at 06:53:36PM +0200, George Bosilca wrote: > > On Jun 10, 2013, at 17:18 , Nathan Hjelm wrote: > > > On Sat, Jun 08, 2013 at 12:28:02PM +0200, George Bosilca wrote: > >> All Windows objects that are managed as HANDLES can easily be modified to >

Re: [OMPI devel] RFC: Add static initializer for opal_mutex_t

2013-06-11 Thread Nathan Hjelm
On Tue, Jun 11, 2013 at 09:13:01AM -0700, Ralph Castain wrote: > > On Jun 11, 2013, at 9:09 AM, Nathan Hjelm wrote: > > > On Mon, Jun 10, 2013 at 06:53:36PM +0200, George Bosilca wrote: > >> > >> On Jun 10, 2013, at 17:18 , Nathan Hjelm wrote: > >>

[OMPI devel] RFC: improve the hash function used by opal_hash_table_t

2013-06-11 Thread Nathan Hjelm
What: Implement a better hash function in opal_hash_table_t. The function is a simple one-at-a-time Jenkins hash (see http://en.wikipedia.org/wiki/Jenkins_hash_function); it has a low collision rate and isn't overly complex or slow. Why: I am preparing an update to the MCA variable system (add
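For reference, the one-at-a-time hash named in this RFC is short enough to sketch in full. The code below follows the published algorithm, not the actual opal_hash_table_t patch; the names one_at_a_time_hash and bucket_index are hypothetical, and the power-of-two table size is an assumption made only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Bob Jenkins' one-at-a-time hash over an arbitrary byte string.
 * Illustrative sketch only; this is not the Open MPI implementation. */
static uint32_t one_at_a_time_hash(const void *key, size_t length)
{
    const unsigned char *bytes = (const unsigned char *) key;
    uint32_t hash = 0;

    /* Mix each input byte into the running state. */
    for (size_t i = 0; i < length; ++i) {
        hash += bytes[i];
        hash += hash << 10;
        hash ^= hash >> 6;
    }

    /* Final avalanche so late bytes affect all output bits. */
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;

    return hash;
}

/* Hypothetical helper: map a hash to a bucket.
 * Assumes table_size is a non-zero power of two. */
static size_t bucket_index(uint32_t hash, size_t table_size)
{
    return (size_t) hash & (table_size - 1);
}
```

Hashing the single byte "a" with this sketch yields 0xca2e9442, which agrees with the value given for the same input on the Wikipedia page linked in the RFC.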
