Re: [OMPI devel] RFC: Deprecate rankfile?

2010-04-16 Thread Terry Dontje
Jeff Squyres wrote: On Apr 16, 2010, at 6:43 AM, Terry Dontje wrote: If you are suggesting that you will make code that breaks a current rankfile feature, note I am not talking about adding a new feature that isn't supported by rankfile but something that used to work, then I think you

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-12 Thread Terry Dontje
logically understandable test would help others reading the code. But first we need to resolve the question: should this scenario return an error or not? On Apr 12, 2010, at 1:43 AM, Nadia Derbey wrote: On Fri, 2010-04-09 at 14:23 -0400, Terry Dontje wrote: Ralph Castain wrote: Okay, j

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-09 Thread Terry Dontje
the usage of orte_odls_globals.bound in your patch. It would seem to me that the insertion of that conditional would prevent the check it surrounds being done when the process has not been bound prior to startup, which is a common case. --td On Apr 9, 2010, at 9:33 AM, Terry Dontje wrote

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-09 Thread Terry Dontje
Nadia Derbey wrote: On Fri, 2010-04-09 at 08:41 -0600, Ralph Castain wrote: Just to check: is this with the latest trunk? Brad and Terry have been making changes to this section of code, including modifying the PROCESS_IS_BOUND test... Well, it was on the v1.5. But I just checked:

Re: [OMPI devel] inquiry about mpirun

2010-04-06 Thread Terry Dontje
N.M. Maclaren wrote: On Apr 6 2010, luyang dong wrote: Regardless of any MPI implementation, there is always a command named mpirun. And correspondingly there is a source file called mpirun.c (at least in lam/mpi), but I cannot find this file in openmpi. Can you tell me how to produce

Re: [OMPI devel] RFC: increase default AC/AM/LT requirements

2010-03-25 Thread Terry Dontje
Will these versions of Auto tools work with the 1.4 branch? --td Jeff Squyres wrote: *** LAST CHANCE *** I'm asking yet one more time because it ***WILL HAVE A DIRECT IMPACT ON DEVELOPERS!*** We're past the RFC timeout and no one has objected, and I have a patch ready to commit, but be

[OMPI devel] vt compilation problem

2010-03-19 Thread Terry Dontje
I was trying to compile the trunk head using Linux and Sun Studio compilers and saw the following error. I am not sure that the compiler really is the smoking gun. I see that State.cpp was last modified in r22820 and I wonder if the modification added the usage of "__FUNCTION__" outside an

Re: [OMPI devel] Signals

2010-03-17 Thread Terry Dontje
On 03/17/2010 10:10 AM, Leonardo Fialho wrote: Wow... orte_plm.signal_job points to zero. Is it correct from the PML point of view? It might be because plm's are really only used at launch time not in MPI processes. Note plm != pml. --td Leonardo On Mar 17, 2010, at 2:52 PM, Leonardo

Re: [OMPI devel] Signals

2010-03-17 Thread Terry Dontje
Can you print out what the orte_plm.signal_job value is? I bet it is pointing to address 0. So the question is: is orte_plm actually initialized in an MPI process? My guess would be no but I am sure Ralph will be able to answer more definitively. --td On 03/17/2010 09:52 AM, Leonardo Fialho

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
ise); #else component_handle = lt_dlopenext(target_file->filename); #endif to use lt_dladvise_global instead of lt_dladvise_local? Leonardo On Mar 5, 2010, at 7:47 PM, Terry Dontje wrote: I would also start nm'ing the .so's you think the U symbols are resolved in to make sure they are e

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
0916 t mca_vprotocol_receiver_wait_any 098a t mca_vprotocol_receiver_wait_some U ompi_request_null U opal_output 00201440 d p.6113 [lfialho@aoclsb-clus openmpi]$ On Mar 5, 2010, at 7:00 PM, Terry Dontje wrote: Sorry meant to add this, but you

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
mca_vprotocol_receiver_wait_any 098a t mca_vprotocol_receiver_wait_some U ompi_request_null U opal_output 00201440 d p.6113 [lfialho@aoclsb-clus openmpi]$ On Mar 5, 2010, at 7:00 PM, Terry Dontje wrote: Sorry meant to add this, but you might be able to try

Re: [OMPI devel] RFC: Rename --enable-*-threads and ENABLE*THREAD* (take 2)

2010-03-05 Thread Terry Dontje
A couple comments: 1. I really assume the timeout is March 5th not February. 2. As to keeping the deprecated variables I think you really need to ditch the --enable-mpi-threads because if you synonym it with --enable-mpi-thread-multiple you are not mimicking what it did before but redefining

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r22762

2010-03-03 Thread Terry Dontje
Iain Bason wrote: On Mar 3, 2010, at 3:04 PM, Jeff Squyres wrote: Mmmm... good point. I was thinking specifically of the if_in|exclude behavior in the openib BTL. That uses strcmp, not strncmp. Here's a complete list: ompi_info --param all all --parsable | grep include | grep :value:

Re: [OMPI devel] RFC: Suspend/resume enhancements

2010-01-05 Thread Terry Dontje
This only happens when the orte_forward_job_control MCA flag is set to 1 and the default is that it is set to 0. Which I believe meets Ralph's criteria below. --td Ralph Castain wrote: I don't have any issue with this so long as (a) it is -only- active when someone sets a specific MCA

Re: [OMPI devel] carto vs. hwloc

2009-12-15 Thread Terry Dontje
Kenneth Lloyd wrote: My 2 cents: Carto is a weighted graph structure that describes the topology of the compute cluster, not just locations of nodes. Many view topologies (trees, meshes, torii) to be static - but I've found this an unnecessary and undesirable constraint. The compute fabric may

Re: [OMPI devel] SEGFAULT in mpi_init from paffinity with intel 11.1.059 compiler

2009-12-14 Thread Terry Dontje
I don't really want to throw fud on this list but we've seen all sorts of oddities with OMPI 1.3.4 being built with Intel's 11.1 compiler versus their 11.0 or other compilers (gcc, Sun Studio, pgi, and pathscale). I have not tested your specific failing case but considering your issue doesn't

Re: [OMPI devel] [patch] Verifying the message queue DLL build

2009-12-08 Thread Terry Dontje
Ashley Pittman wrote: On Tue, 2009-12-08 at 07:39 -0500, Terry Dontje wrote: Ashley Pittman wrote: I've seen several cases now where people have functional, installed MPI libraries yet when they've come to use padb they have discovered a build problem with the Message Queue DLL which

Re: [OMPI devel] [patch] Verifying the message queue DLL build

2009-12-08 Thread Terry Dontje
Ashley Pittman wrote: I've seen several cases now where people have functional, installed MPI libraries yet when they've come to use padb they have discovered a build problem with the Message Queue DLL which prevents it from working. The cases I've seen this happen is with the Sun Studio

Re: [OMPI devel] Finalize without Detach???

2009-11-19 Thread Terry Dontje
So is there any reason OMPI should not auto-detach buffers at Finalize? I understand technically we don't have to but there are false performance degradations incurred by us not detaching thus making OMPI look significantly slower compared to other MPIs for no real reason. So unless there is

Re: [OMPI devel] [Fwd: strong ordering for data registered memory]

2009-11-13 Thread Terry Dontje
Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: On Nov 11, 2009, at 8:13 AM, Terry Dontje wrote: Sun's IB group has asked me to forward the following email to see if anyone has any comments on this email. Tastes great / less filling. :-) I think (assume) we'll be happy to implement

Re: [OMPI devel] [Fwd: strong ordering for data registered memory]

2009-11-11 Thread Terry Dontje
Jeff Squyres wrote: On Nov 11, 2009, at 8:13 AM, Terry Dontje wrote: Sun's IB group has asked me to forward the following email to see if anyone has any comments on this email. Tastes great / less filling. :-) I think (assume) we'll be happy to implement changes like this that come from

[OMPI devel] [Fwd: strong ordering for data registered memory]

2009-11-11 Thread Terry Dontje
Sun's IB group has asked me to forward the following email to see if anyone has any comments on this email. thanks, --td Subject: strong ordering for data registered memory From: David Brean Date: Tue, 10 Nov 2009 15:19:26 -0500 To: linux-rdma

Re: [OMPI devel] A minor nit in the mpirun manpage

2009-10-22 Thread Terry Dontje
Paul H. Hargrove wrote: Sorry if this has been fixed for 1.3.4, but in the manpage for mpirun in 1.3.3 I noticed the following in the "MCA" section of the manpage: Note: The -mca switch is simply a shortcut for setting environment variables. The same effect may be

Re: [OMPI devel] [OMPI users] cartofile

2009-10-13 Thread Terry Dontje
em, has any idea how to do so, or why they should. Looking at the code, it wouldn't appear to have any value on any of the machines at LANL, but I may be missing something - not a lot of help around to understand it. On Oct 13, 2009, at 7:08 AM, Terry Dontje wrote: After rereading the m

Re: [OMPI devel] [OMPI users] cartofile

2009-10-13 Thread Terry Dontje
After rereading the manpage for the umpteenth time I agree with Eugene that the information provided on cartofile is next to useless. Ok, so you describe what your node looks like but what does mpirun or libmpi do with that information? Other than the option to provide the cartofile it

Re: [OMPI devel] binding with MCA parameters: broken or user error?

2009-10-12 Thread Terry Dontje
lied the fix to stop calling register_params twice to 1.3 already, but I can check. No I was asking whether that fix might be causing the orte_process_binding mca param to not be interpreted. But I think from what you say in the first paragraph I guess I probably was wrong. --td On Oct 12, 200

Re: [OMPI devel] segv in coll tuned

2009-10-12 Thread Terry Dontje
Does that test also pass sometimes? I am seeing some random set of tests segv'ing in the SM btl, using a v1.3 derivative. --td Lenny Verkhovsky wrote: Hi, I experience the following error with current trunk r22090. It also occurs in 1.3 branch.

Re: [OMPI devel] binding with MCA parameters: broken or user error?

2009-10-12 Thread Terry Dontje
In regards to the "-mca XXX" option not overriding the file setting I thought I saw this working for v1.3. However, I just retested this and I am seeing the same issue of the "-mca" option not affecting orte_process_binding or rmaps_base_schedule_policy. This seems to work under the trunk.

Re: [OMPI devel] OMPI 1.3.4 ETA ? (TLAs FTW)

2009-09-28 Thread Terry Dontje
Ralph Castain wrote: I am not one of the 1.3 release managers, but do serve as gatekeeper. From what I see in the automated nightly tests, we are certainly no earlier than 3-4 weeks from release. Lots of errors in the nightly tests, and no visible high-priority effort under way to identify

Re: [OMPI devel] application hangs with multiple dup

2009-09-23 Thread Terry Dontje
Chris Samuel wrote: Hi Edgar, - "Edgar Gabriel" wrote: it will be available in 1.3.4... That's great, thanks so much! cheers, Chris It actually is in the 1.3 branch now and has been verified to solve the hanging issues of several members. --td

Re: [OMPI devel] MPIR_Breakpoint visibility

2009-09-21 Thread Terry Dontje
ot relying on the MPIR_Breakpoint to actually stop execution. --td On Sep 21, 2009, at 7:03 AM, Terry Dontje wrote: I was kind of amazed no one else managed to run into this, but it was brought to my attention that when compiling OMPI with Intel compilers and visibility on, the MPIR_Breakpoint symbol was

[OMPI devel] MPIR_Breakpoint visibility

2009-09-21 Thread Terry Dontje
I was kind of amazed no one else managed to run into this, but it was brought to my attention that when compiling OMPI with Intel compilers and visibility on, the MPIR_Breakpoint symbol was not being exposed. I am assuming this is due to MPIR_Breakpoint not being ORTE or OMPI_DECLSPEC'd. Do

[OMPI devel] Heads up on new feature to 1.3.4

2009-08-13 Thread Terry Dontje
I just wanted to give everyone a heads up if they do not get bugs email. I just submitted a CMR to move over some new paffinity options from the trunk to the v1.3 branch. You can read the gory details in https://svn.open-mpi.org/trac/ompi/ticket/1997 --td

Re: [OMPI devel] [OT] Who's going to Helsinki?

2009-08-04 Thread Terry Dontje
Jeff Squyres wrote: Who's going to Helsinki? Does anyone want to meet up for some sight-seeing and/or have a devel meeting? I know that some of our European developers are not attending, but if we have a day-long devel meeting, perhaps they might be motivated...? I will be attending.

Re: [OMPI devel] default btl eager_limit

2009-07-22 Thread Terry Dontje
this to cover dr and csum. I've received the change from Brian and working on porting it across all the other PMLs. --td On Jul 16, 2009, at 10:10 AM, Terry Dontje wrote: Another way to do this which I am not sure makes sense is to just add sizeof(mca_pml_ob1_hdr_t) to the btl_eager_limit

[OMPI devel] default btl eager_limit

2009-07-16 Thread Terry Dontje
I was playing around with some really silly fragment sizes (sub 72 bytes) when I ran into some asserts in the btl_openib_sendi. I traced the assert to be caused by mca_pml_ob1_send_request_start_btl() calculating the true eager_limit with the following line: size_t eager_limit =

Re: [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support

2009-07-16 Thread Terry Dontje
There are some mailing lists for PLPA at: http://www.open-mpi.org/community/lists/plpa.php --td Ralph Castain wrote: Sounds like a problem in PLPA - I'll have to defer to them. Our primary PLPA person is on vacation this week, so you might not hear back from him until later next week when he

Re: [OMPI devel] sm BTL flow management

2009-06-26 Thread Terry Dontje
Eugene Loh wrote: Brian W. Barrett wrote: All - Jeff, Eugene, and I had a long discussion this morning on the sm BTL flow management issues and came to a couple of conclusions. * Jeff, Eugene, and I are all convinced that Eugene's addition of polling the receive queue to drain acks when

Re: [OMPI devel] why does --rankfile need hostlist?

2009-06-23 Thread Terry Dontje
do you think? Mike On Mon, Jun 22, 2009 at 1:30 PM, Terry Dontje <terry.don...@sun.com> wrote: Let us think about this some more. We'll try and reply later today. --td Ralph Castain wrote: Had a chance to think about how this m

Re: [OMPI devel] why does --rankfile need hostlist?

2009-06-22 Thread Terry Dontje
problem. I'm willing to give it a try - just trying to make clear why my response was negative. It isn't as simple as it sounds...which is why Len and I didn't pursue it when this was originally developed. Ralph On Sun, Jun 21, 2009 at 5:28 AM, Terry Dontje <terry.don...

Re: [OMPI devel] why does --rankfile need hostlist?

2009-06-21 Thread Terry Dontje
Being a part of these discussions I can understand your reticence to reopen this discussion. However, I think this is a major usability issue with this feature which actually is fairly important in order to get things to run performant. Which IMO is important. That being said I think there

Re: [OMPI devel] Pathscale C++

2009-06-11 Thread Terry Dontje
Terry Dontje wrote: Has anyone successfully run C++ tests on OMPI built with Pathscale compilers? I am seeing aborts on calls to Get_size. --td Nevermind found ticket #1326. --td

[OMPI devel] Pathscale C++

2009-06-11 Thread Terry Dontje
Has anyone successfully run C++ tests on OMPI built with Pathscale compilers? I am seeing aborts on calls to Get_size. --td

Re: [OMPI devel] [RFC] Low pressure OPAL progress

2009-06-09 Thread Terry Dontje
Sylvain Jeaugey wrote: Hi Ralph, I'm entirely convinced that MPI doesn't have to save power in a normal scenario. The idea is just that if an MPI process is blocked (i.e. has not performed progress for -say- 5 minutes (default in my implementation), we stop busy polling and have the process

Re: [OMPI devel] problem in the ORTE notifier framework

2009-05-28 Thread Terry Dontje
Nadia Derbey wrote: On Wed, 2009-05-27 at 14:25 -0400, Jeff Squyres wrote: Excellent points; Ralph and I chatted about this on the phone today -- we concur with George. Bull -- would peruse work for you? I think you mentioned before that it didn't seem attractive to you. Well,

Re: [OMPI devel] totalview with OMPI 1.3 on rh5 linux

2009-05-20 Thread Terry Dontje
branches. doh, --td Ashley Pittman wrote: On Tue, 2009-05-19 at 13:21 -0400, Terry Dontje wrote: Actually playing with Ashley's program has shown that RTLD_NOW does error out in the exact same way. So could the problem be that totalview dlopen's the plugin with RTLD_NOW passed into dlopen

Re: [OMPI devel] totalview with OMPI 1.3 on rh5 linux

2009-05-19 Thread Terry Dontje
Actually playing with Ashley's program has shown that RTLD_NOW does error out in the exact same way. So could the problem be that totalview dlopen's the plugin with RTLD_NOW passed into dlopen? I would have thought if this was the case we would have seen this problem sooner. --td George

[OMPI devel] totalview with OMPI 1.3 on rh5 linux

2009-05-15 Thread Terry Dontje
Has anyone tried to run totalview with OMPI 1.3 on a RH5 linux system? I am seeing totalview unable to load libompi_dbg_msgq.so because ompi_free_list_grow is not found. What's interesting is this same symbol is undefined for Solaris but things work. Is ompi_free_list_grow actually used by

Re: [OMPI devel] OMPI 1.3 branch

2009-05-14 Thread Terry Dontje
Ralph Castain wrote: Hi folks I encourage people to please look at your MTT outputs. As we are preparing to roll the 1.3.3 release, I am seeing a lot of problems on the branch: 1. timeouts, coming in two forms: (a) MPI_Abort hanging, and (b) collectives hanging (this is mostly on Solaris)

Re: [OMPI devel] RFC: MPI Interface Extensions Infrastructure

2009-05-12 Thread Terry Dontje
I like this, however wouldn't it possibly be nice to have the mpi-ext.h pulled in by mpi.h when the -enable-ext configure option is used? That way one would be able to compile and run current tests for regressions without having to change the code. --td Jeff Squyres wrote: I'm

Re: [OMPI devel] Revise paffinity method?

2009-05-08 Thread Terry Dontje
Ralph Castain wrote: I think that's the way to go then - it also follows our "the user is always right - even when they are wrong" philosophy. I'll probably have to draw on others to help ensure that the paffinity modules all report appropriately. Yeah, that sounds like the right way to do

Re: [OMPI devel] Revise paffinity method?

2009-05-08 Thread Terry Dontje
modules between the fork and exec, as Brian suggested. On May 7, 2009, at 12:43 PM, Terry Dontje wrote: Brian W. Barrett wrote: On Wed, 6 May 2009, Ralph Castain wrote: Any thoughts on this? Should we change it? Yes, we should change this (IMHO) :). Me too. If so, who wants to be involved

Re: [OMPI devel] Revise paffinity method?

2009-05-07 Thread Terry Dontje
Brian W. Barrett wrote: On Wed, 6 May 2009, Ralph Castain wrote: Any thoughts on this? Should we change it? Yes, we should change this (IMHO) :). Me too. If so, who wants to be involved in the re-design? I'm pretty sure it would require some modification of the paffinity framework, plus

[OMPI devel] [Fwd: Re: Fwd: Purify found bugs inside open-mpi library]

2009-05-02 Thread Terry Dontje
mory that leaked despite me calling MPI_Finalize(). Let me know if you need me to try something else or to produce any additional output. Thanks again, Brian On Thu, Apr 30, 2009 at 10:11 AM, Terry Dontje <terry.don...@sun.com> wrote: > So I've been kibitzing with Jeff on the below. If yo

Re: [OMPI devel] vampirtrace on v1.3 branch

2009-04-30 Thread Terry Dontje
Andreas Knüpfer wrote: On Tuesday 28 April 2009, Terry Dontje wrote: Has anyone tested running a simple program compiled with mpicc-vt that was built on RHEL 5.1 or SLES-10 with gcc under 32 bits? I am seeing the following errors when running compiled code: VampirTrace: BFD

Re: [OMPI devel] Fwd: Purify found bugs inside open-mpi library

2009-04-30 Thread Terry Dontje
Jeff Squyres wrote: On Apr 29, 2009, at 5:03 PM, Brian Blank wrote: Purify did find some other UMR (uninitialized memory read) errors though, but they don't seem to be negatively impacting my application right now. Nonetheless, I'll post them later today in case anyone is interested in them.

[OMPI devel] vampirtrace on v1.3 branch

2009-04-28 Thread Terry Dontje
Has anyone tested running a simple program compiled with mpicc-vt that was built on RHEL 5.1 or SLES-10 with gcc under 32 bits? I am seeing the following errors when running compiled code: VampirTrace: BFD: bfd_get_file_flags(): failed Note the trunk seems to be working fine and I have

Re: [OMPI devel] predefined ompi_t types break strict-aliasing rules

2009-04-27 Thread Terry Dontje
Jeff Squyres wrote: On Apr 24, 2009, at 1:24 PM, Number Cruncher wrote: The goal is admirable and a stalwart of good open source practice which avoids "DLL-Hell". However, I simply don't understand how my compiler can *ever* know the size of your opaque ompi_communicator_t? I'm not enough

Re: [OMPI devel] predefined ompi_t types break strict-aliasing rules

2009-04-24 Thread Terry Dontje
Number Cruncher wrote: Many thanks for the informative explanation, Jeff. I appreciate this issue has been the cause of some deliberation! This was the changeset where we did the ABI fixes -- ensuring that if you compile/link against Open MPI vA.B.C, you will be able to just change your

Re: [OMPI devel] access to tests

2009-04-06 Thread Terry Dontje
Eugene Loh wrote: Do I need to buy someone a beer to get access to the test suites? [eloh@milliways]$ svn co https://svn.open-mpi.org/svn/ompi/trunk [... successful ...] [eloh@milliways]$ svn co https://svn.open-mpi.org/svn/ompi-tests/trunk/intel_tests svn: PROPFIND request failed on

Re: [OMPI devel] OMPI vs Scali performance comparisons

2009-03-18 Thread Terry Dontje
copy was not initiated until a large (64K) memcpy. --td On Mar 18, 2009, at 06:43 , Terry Dontje wrote: George Bosilca wrote: The default values for the large message fragments are not optimized for the new generation processors. This might be something to investigate, in order to see if we

Re: [OMPI devel] OMPI vs Scali performance comparisons

2009-03-18 Thread Terry Dontje
George Bosilca wrote: The default values for the large message fragments are not optimized for the new generation processors. This might be something to investigate, in order to see if we can have the same bandwidth as they do or not. Are you suggesting bumping up the btl_sm_max_send_size

Re: [OMPI devel] Meta Question -- Open MPI: Is it a dessert topping or is it a floor wax?

2009-03-12 Thread Terry Dontje
Sun's participation in this community was to obtain a stable and performant MPI implementation that had some research work done on the side to improve those goals and the introduction of new features. We don't have problems with others using and improving on the OMPI code base but we need to

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Terry Dontje
with hanging collectives, frankly - and we don't know how the sm changes will affect this problem, if at all. On Mar 11, 2009, at 7:50 AM, Terry Dontje wrote: > Jeff Squyres wrote: >> So -- Brad/George -- this technically isn't a regression against >> v1.3.0 (do we know if this can ha

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Terry Dontje
have -fixed- the problem. :-) On Mar 11, 2009, at 4:34 AM, Terry Dontje wrote: > I forgot to mention that since I ran into this issue so long ago I > really doubt that Eugene's SM changes have caused this issue. > > --td > > Terry Dontje wrote: >> Hey!!! I ran into this

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Terry Dontje
I forgot to mention that since I ran into this issue so long ago I really doubt that Eugene's SM changes have caused this issue. --td Terry Dontje wrote: Hey!!! I ran into this problem many months ago but it's been so elusive that I haven't nailed it down. First time we saw this was last

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Terry Dontje
Hey!!! I ran into this problem many months ago but it's been so elusive that I haven't nailed it down. First time we saw this was last October. I did some MTT gleaning and could not find anyone but Solaris having this issue under MTT. What's interesting is I gleaned Sun's MTT results and

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-04 Thread Terry Dontje
I didn't see the exchange between you and Jeff at the end of this email. It basically nullifies my half-baked concern. thanks, --td Eugene Loh wrote: Terry Dontje wrote: Eugene Loh wrote: I'm on the verge of giving up moving the sendi call in the PML. I will try one or two last things

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-03 Thread Terry Dontje
Eugene Loh wrote: I'm on the verge of giving up moving the sendi call in the PML. I will try one or two last things, including this e-mail asking for feedback. The idea is that when a BTL goes over a very low-latency interconnect (like sm), we really want to shave off whatever we can from

Re: [OMPI devel] workspace management question

2009-02-19 Thread Terry Dontje
Eugene Loh wrote: Okay, thanks for all the feedback. New version is at: https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial#Developmentcycle If everyone is happy with that, I'll remove the old version, along with the diagram. So I like the new text much better than the old, but I think

Re: [OMPI devel] workspace management question

2009-02-19 Thread Terry Dontje
Eugene Loh wrote: Jeff Squyres wrote: Here's what I typically run to bring down changes from SVN to HG: # Ensure all the latest hg repo changes are in the working dir hg up # Bring in all the SVN changes svn up # Refresh the .hgignore file (may change due to the svn up)

[OMPI devel] OMPI Developer meeting on 02/12/09

2009-02-05 Thread Terry Dontje
I've started a wiki page to keep track of the agenda for the OMPI Developer meeting that will occur after the MPI Forum meeting. The page is at: https://svn.open-mpi.org/trac/ompi/wiki/Feb09Meetingsjc Feel free to add other topics. --td

Re: [OMPI devel] RFC: Move of ompi_bitmap_t

2009-01-30 Thread Terry Dontje
I second Brian's concern. So unless this is just an announcement that this is being done on a tmp branch only until everything is in order I think we need further discussions. --td Brian Barrett wrote: So once again, I bring up my objection of this entire line of moving until such time as

Re: [OMPI devel] RFC: make predefined handles extern to pointers

2009-01-30 Thread Terry Dontje
nothing really has changed). So with this latest information, I am going to start coding the other predefines with padding. I'll post the hg workspace before putting back to the trunk. --td Terry Dontje wrote: Per yesterday's concall I did some experiments with the padding changes

Re: [OMPI devel] RFC: make predefined handles extern to pointers

2009-01-28 Thread Terry Dontje
that can be externed in mpi.h. To me this seems gross; however, I wonder, does it actually make sense to print out an MPI communicator before MPI_Init is called? The values of the field should be either 0 or garbage. So I am really curious if the above is a problem anyways. --td Terry Dontje

Re: [OMPI devel] RFC: make predefined handles extern to pointers

2009-01-26 Thread Terry Dontje
to not overrun the bss section. I would like to discuss any objections to this solution on tomorrow's OMPI concall. thanks, --td Terry Dontje wrote: Just wanted to give an update. On a workspace with just the predefined communicators converted to opaque pointers I've run netpipe and hpcc performance

Re: [OMPI devel] RFC: sm Latency

2009-01-20 Thread Terry Dontje
Richard Graham wrote: > First, the performance improvements look really nice. > A few questions: > - How much of an abstraction violation does this introduce ? This > looks like the btl needs to start “knowing” about MPI level semantics. > Currently, the btl purposefully is ulp agnostic. I ask for

Re: [OMPI devel] RFC: make predefined handles extern to pointers

2009-01-16 Thread Terry Dontje
Just wanted to give an update. On a workspace with just the predefined communicators converted to opaque pointers I've run netpipe and hpcc performance tests and compared the results before and after the changes. The differences in performance with 10 sample runs were undetectable. I've also

Re: [OMPI devel] LOCK_SHARED?

2009-01-05 Thread Terry Dontje
Jim Langston wrote: Hi Rolf, Thanks for the pointers, they are very clear and concise. I followed the general flow of what was done to fix the issue in 1.3 and did something similar for 1.2.9. In mpicxx.cc, I did this change: #include #ifdef LOCK_SHARED static const int

Re: [OMPI devel] RFC: make predefined handles extern to pointers

2008-12-18 Thread Terry Dontje
Terry Dontje wrote: Richard Graham wrote: Terry, Is there any way you can quantify the cost ? This seems reasonable, but would be nice to get an idea what the performance cost is (and not within a tight loop where everything stays in cache). Rich Ok, I guess that would eliminate any

Re: [OMPI devel] RFC: make predefined handles extern to pointers

2008-12-18 Thread Terry Dontje
Richard Graham wrote: Terry, Is there any way you can quantify the cost ? This seems reasonable, but would be nice to get an idea what the performance cost is (and not within a tight loop where everything stays in cache). Rich Ok, I guess that would eliminate any of the simple perf

Re: [OMPI devel] Forwarding SIGTSTP and SIGCONT

2008-12-11 Thread Terry Dontje
Jeff Squyres wrote: On Dec 8, 2008, at 11:11 AM, Ralph Castain wrote: It sounds reasonable to me. I agree with Ralf W about having mpirun send a STOP to itself - that would seem to solve the problem about stopping everything. It would seem, however, that you cannot similarly STOP the

Re: [OMPI devel] BTL move - the notion

2008-12-05 Thread Terry Dontje
Richard Graham wrote: Let me start the e-mail conversation, and see how far we get. Goal: The goal several of us have is to be able to use the btl’s outside of the MPI layer in Open MPI. The layer itself is generic, w/o specific knowledge of Upper Level Protocols, so is well suited for this

Re: [OMPI devel] Preparations for moving the btl's

2008-12-03 Thread Terry Dontje
Ahh, and then I woke up. This might not be an issue (or a big one), but I have some code I am working on that replaces memcpy with an opal memcpy routine. Does your change below remove the ability of the BTLs to call opal routines? --td Richard Graham wrote: Now that 1.3 will be

Re: [OMPI devel] Preparations for moving the btl's

2008-12-03 Thread Terry Dontje
I don't have any *strong* objections. However, I know that Eugene and George B have been working on some Fastpath code changes that we should make sure neither project obliterates the other. --td Richard Graham wrote: Now that 1.3 will be released, we would like to go ahead with the plan to

Re: [OMPI devel] RFC: Add SunStudio/Libtool helper script for post-configure

2008-11-21 Thread Terry Dontje
Ethan Mallove wrote: I'm still running into the Cstd/stlport4 issue with 2.2.6. That is, this line appears in the libtool script: postdeps="-library=Cstd -library=Crun" Do you have the string " -library=stlport4 " in $CXX $CXXFLAGS? If not, then how can Libtool detect that you use

Re: [OMPI devel] Dropped message for the non-existing communicator

2008-11-08 Thread Terry Dontje
for the error messages even before that, it just exposes it more frequently... Thanks Edgar Terry Dontje wrote: I am seeing the message "Dropped message for the non-existing communicator" when running hpcc with np=124 against r19845. This seems to be pretty reproducible at np=124. Wh

Re: [OMPI devel] RFC: libopen-rte --> libompi-rte

2008-11-07 Thread Terry Dontje
I do not see the real value in doing this name change. The name "OMPI Run Time Environment" and libopen_rte.so are not that far from each other. Changing a bunch of Makefile.am's at this point in the game for what I consider a minor cosmetic difference just makes little sense to me. On the

Re: [OMPI devel] OFED release schedule

2008-10-08 Thread Terry Dontje
Jeff Squyres wrote: Per the call today, I was supposed to find out the release schedule for OFED v1.4 (to know what the gate deadline is for v1.2.8). OFED v1.4 RC3 is due out early next week. So getting OMPI 1.2.8 out *this week* would be best. Thanks! I would like to get an RC for 1.2.8

Re: [OMPI devel] [OMPI svn] svn:open-mpi r19600

2008-09-23 Thread Terry Dontje
Jeff Squyres wrote: I think the point is that as a group, we consciously, deliberately, and painfully decided not to support multi-cluster. And as a result, we ripped out a lot of supporting code. Starting down this path again will likely result in a) re-opening all the discussions, b)

Re: [OMPI devel] SM initialization race condition

2008-08-21 Thread Terry Dontje
rge. On Aug 21, 2008, at 1:22 PM, Terry Dontje wrote: I've been seeing an intermittent (once every 4 hours looping on a quick initialization program) segv with the following stack trace. =>[1] mca_btl_sm_add_procs(btl = 0xfd7ffdb67ef0, nprocs = 2U, procs = 0x591560, peers

[OMPI devel] SM initialization race condition

2008-08-21 Thread Terry Dontje
I've been seeing an intermittent (once every 4 hours looping on a quick initialization program) segv with the following stack trace. =>[1] mca_btl_sm_add_procs(btl = 0xfd7ffdb67ef0, nprocs = 2U, procs = 0x591560, peers = 0x591580, reachability = 0xfd7fffdff000), line 519 in "btl_sm.c"

Re: [OMPI devel] memcpy MCA framework

2008-08-16 Thread Terry Dontje
to this point before anyone else I probably will volunteer. --td george. On Aug 16, 2008, at 3:19 PM, Terry Dontje wrote: Hi Tim, Thanks for bringing the below up and asking for a redirection to the devel list. I think looking/using the MCA memcpy framework would be a good thing to do and maybe

[OMPI devel] memcpy MCA framework

2008-08-16 Thread Terry Dontje
odify OMPI to actually use opal_memcpy where it makes sense. Terry, I presume what you suggest could be dealt with similarly when we are running/building on SPARC. Any followup discussion on this should probably happen on the developer mailing list. On Thu, Aug 14, 2008 at 12:19 PM, Terr

Re: [OMPI devel] [OMPI bugs] [Open MPI] #1435: Crash on PPC (with SMT off) when using mpi_paffinity alone

2008-08-07 Thread Terry Dontje
No problem Lenny, I am looking at this now. --td Lenny Verkhovsky wrote: I really would like to help, but I am not sure how much time I will have in the very near future (we are expecting a baby girl delivery). On 8/6/08, *Open MPI* wrote:

Re: [OMPI devel] if btl->add_procs() fails...?

2008-08-02 Thread Terry Dontje
Jeff Squyres wrote: On Aug 1, 2008, at 11:39 PM, Brian Barrett wrote: My thought is that if add_procs fails, then that BTL should be removed (as if init failed) and things should continue on. If that BTL was the only way to reach another process, we'll catch that later and abort. There

Re: [OMPI devel] trunk hangs since r19010

2008-07-28 Thread Terry Dontje
Jeff Squyres wrote: On Jul 28, 2008, at 12:03 PM, George Bosilca wrote: Interesting. The self is only used for local communications. I don't expect that any benchmark execute such communications, but apparently I was wrong. Please let me know the failing test, I will take a look this

Re: [OMPI devel] 1.3 branch

2008-07-24 Thread Terry Dontje
es) is a good decision metric. george. On Jul 24, 2008, at 3:55 PM, Terry Dontje wrote: It might be worthwhile to spell out the conditions of when someone should let changes soak or not. Considering your changeset 19011 was putback without much soak time. I am not saying 19011 needed more soak

Re: [OMPI devel] 1.3 branch

2008-07-24 Thread Terry Dontje
It might be worthwhile to spell out the conditions of when someone should let changes soak or not. Considering your changeset 19011 was putback without much soak time. I am not saying 19011 needed more soak time just that I think it adds potential confusion as to what one really needs to do
