Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Terry Dontje
Jeff Squyres wrote: On Jul 23, 2008, at 10:37 AM, Terry Dontje wrote: This seems to work for me too. What is interesting is my experiments have shown that if you run on RH5.1 you don't need to set mpi_yield_when_idle to 0. Yes, this makes sense -- on RHEL5.1, it's a much newer Linux kernel

Re: [OMPI devel] SM latency regression

2008-07-14 Thread Terry Dontje
is the architecture where you noticed the performance degradation ? This is on a 2 year old system with 2 chips single AMD cores each chip running both Solaris and Linux under 64 bit addressing. --td Thanks, george. On Jul 11, 2008, at 1:32 PM, Terry Dontje wrote: Has anyone else seen the trunk incur

[OMPI devel] SM latency regression

2008-07-11 Thread Terry Dontje
Has anyone else seen the trunk incur approximately a 10% increase in latency? I think this has happened in the last couple weeks. I have verified that it isn't due to the recheck put into the sm_component_progress. I am about ready to try and track this down but wanted to throw this out

Re: [OMPI devel] v1.3 RM: need a ruling

2008-07-11 Thread Terry Dontje
Ralph H Castain wrote: On 7/11/08 7:48 AM, "Terry Dontje" <terry.don...@sun.com> wrote: Jeff Squyres wrote: Check that -- Ralph and I talked more about #1383 and have come up with a decent/better solution that a) is not wonky and b) does not involve MCA parameter

Re: [OMPI devel] v1.3 RM: need a ruling

2008-07-11 Thread Terry Dontje
Jeff Squyres wrote: Check that -- Ralph and I talked more about #1383 and have come up with a decent/better solution that a) is not wonky and b) does not involve MCA parameter synonyms. We're working on it in an hg and will put it back when done (probably within a business day or three). So

Re: [OMPI devel] IOF repair

2008-07-10 Thread Terry Dontje
to repair this new hole will quite likely open another one somewhere else. So even if we can "fix" the duplicate stdin problem...did the kid really improve the situation? On 7/10/08 7:05 AM, "Terry Dontje" <terry.don...@sun.com> wrote: I see that Jeff has updated

Re: [OMPI devel] IOF repair

2008-07-10 Thread Terry Dontje
I see that Jeff has updated the ticket saying that he is looking at the code to see if he can generate a fix so the below may be superfluous. Anyways, what were the issues fixed in 1.3? It really comes down to how much more pain are we giving our users by rolling back to 1.2 or not. Note, I

Re: [OMPI devel] latency and increasing number of processes

2008-07-07 Thread Terry Dontje
Brian W. Barrett wrote: On Mon, 7 Jul 2008, Terry Dontje wrote: Just curious has anyone done comparisons of latency measurements as one changes the size of a job. That is changing the size of the job (and number of nodes used) and just taking the half roundtrip latency of two

[OMPI devel] latency and increasing number of processes

2008-07-07 Thread Terry Dontje
Just curious has anyone done comparisons of latency measurements as one changes the size of a job. That is changing the size of the job (and number of nodes used) and just taking the half roundtrip latency of two of the processes in the job. I am roughly seeing an addition of 5% to the

Re: [OMPI devel] MPI_Iprobe and mca_btl_sm_component_progress

2008-06-19 Thread Terry Dontje
ng the ack packets. --td george. On Jun 19, 2008, at 2:16 PM, Terry Dontje wrote: Galen, George and others that might have SM BTL interest. In my quest of looking at MPI_Iprobe performance I found what I think is an issue. If you have an application that is using the SM BTL and does a sma

Re: [OMPI devel] iprobe and opal_progress

2008-06-18 Thread Terry Dontje
ifs from the critical path for receives ... george. On Jun 18, 2008, at 3:57 PM, Brian W. Barrett wrote: On Wed, 18 Jun 2008, Terry Dontje wrote: Jeff Squyres wrote: Perhaps we did that as a latency optimization...? George / Brian / Galen -- do you guys know/remember why this was done

Re: [OMPI devel] iprobe and opal_progress

2008-06-18 Thread Terry Dontje
on the unexpected queue or do I need to FINI the request and regenerate it again? --td On Jun 17, 2008, at 11:43 AM, Terry Dontje wrote: I've run into an issue while running hpl where a message has been sent (in shared memory in this case) and the receiver calls iprobe but doesn't see said

[OMPI devel] iprobe and opal_progress

2008-06-17 Thread Terry Dontje
I've run into an issue while running hpl where a message has been sent (in shared memory in this case) and the receiver calls iprobe but doesn't see said message the first call to iprobe (even though it is there) but does see it the second call to iprobe. Looking at mca_pml_ob1_iprobe

Re: [OMPI devel] SLES 9 compilation error

2008-06-16 Thread Terry Dontje
, Terry Dontje wrote: When compiling the latest trunk under SLES 9 I am seeing the following error: ../../../../../opal/mca/maffinity/libnuma/maffinity_libnuma_module.c:118: error: `MPOL_MF_MOVE' undeclared (first use in this function) Looks like SLES 9 numaif.h does not support MPOL_MF_MOVE

[OMPI devel] SLES 9 compilation error

2008-06-16 Thread Terry Dontje
When compiling the latest trunk under SLES 9 I am seeing the following error: ../../../../../opal/mca/maffinity/libnuma/maffinity_libnuma_module.c:118: error: `MPOL_MF_MOVE' undeclared (first use in this function) Looks like SLES 9 numaif.h does not support MPOL_MF_MOVE. Can we somehow

Re: [OMPI devel] Memory hooks stuff

2008-05-23 Thread Terry Dontje
Jeff Squyres wrote: Brian and I were chatting the other day about random OMPI stuff and the topic of the memory hooks came up again. Brian was wondering if we should [finally] revisit this topic -- there's a few things that could be done to make life "better". Two things jump to mind: -

Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-22 Thread Terry Dontje
Jeff Squyres wrote: On May 22, 2008, at 6:50 AM, Terry Dontje wrote: Brian and I chatted a bit about this off-list, and I think we're in agreement now: - do not change the default value or meaning of btl_base_warn_component_unused. - major point of confusion: the openib BTL is actually

Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-22 Thread Terry Dontje
Jeff Squyres wrote: Brian and I chatted a bit about this off-list, and I think we're in agreement now: - do not change the default value or meaning of btl_base_warn_component_unused. - major point of confusion: the openib BTL is actually fairly unique in that it can (and does) tell the

Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-21 Thread Terry Dontje
So are you proposing to set btl_base_warn_component_unused to 0 or something more BTL specific? --td Jeff Squyres wrote: What: Change default in openib BTL to not complain if no OpenFabrics devices are found Why: Many linuxes are shipping libibverbs these days, but most users still don't

Re: [OMPI devel] v1.3 Feature Freeze in effect

2008-05-14 Thread Terry Dontje
Am I right to assume that bug fixes are allowed? --td Brad Benton wrote: All: As of today (May 13, 2008), the trunk is under v1.3 feature freeze until it is stabilized and branched (targeted for May 23, 2008). Here are the guidelines for activity in the trunk while we are under the v1.3

Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-05 Thread Terry Dontje
Roland Dreier wrote: > > > Can I make a /tmp branch from the hg read-only branch that is not tied > > > to the svn /tmp branches. > > Why do you want to do that? > > > > Mercurial is a fully distributed system, so you could just start > > committing to one of your local copies of the

Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-03 Thread Terry Dontje
Ralph Castain wrote: Sure: hg clone http://www.open-mpi.org/hg/hgwebdir.cgi/ompi-svn-mirror my-tmp I want the tmp to reside on www.open-mpi.org not in my own directory. --td On 5/2/08 9:57 AM, "Terry Dontje" <terry.don...@sun.com> wrote: Jeff Squyres wrote:

Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-03 Thread Terry Dontje
Roland Dreier wrote: > Can I make a /tmp branch from the hg read-only branch that is not tied > to the svn /tmp branches. Why do you want to do that? Mercurial is a fully distributed system, so you could just start committing to one of your local copies of the repository, and I can't see

Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-02 Thread Terry Dontje
Jeff Squyres wrote: On May 2, 2008, at 11:04 AM, Terry Dontje wrote: Is there a way to make a hg specific /tmp branch? I'm not sure what you're asking...? Can I make a /tmp branch from the hg read-only branch that is not tied to the svn /tmp branches. --td

Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-02 Thread Terry Dontje
Jeff Squyres wrote: Taking steps towards Mercurial, we have setup a read-only Mercurial mirror of the official OMPI SVN repository: http://www.open-mpi.org/hg/hgwebdir.cgi/ompi-svn-mirror/ Anything you commit to SVN should show up on the HG mirror within 30 minutes. This mirror is

Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-31 Thread Terry Dontje
Jeff Squyres wrote: On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote: - I don't think we can delete the MCA param ompi_paffinity_alone; it exists in the v1.2 series and has historical precedent. It will not be deleted, It will just use the same infrastructure ( slot_list parameter

Re: [OMPI devel] vt configuration issues

2008-02-28 Thread Terry Dontje
Jeff Squyres wrote: I can't remember if I posted about this before or not -- should we disable trunk/VT building by default while the configuration issues are being worked out? If we want to guarantee things to build by default I would think the answer to the above would be yes. --td

Re: [OMPI devel] [RFC] Non-blocking collectives (LibNBC) merge to trunk

2008-02-07 Thread Terry Dontje
increasing number of such "included" packages, how complex is -that- release discussion going to get?!? On 2/7/08 4:48 AM, "Terry Dontje" <terry.don...@sun.com> wrote: Torsten Hoefler wrote: Hi Brian, Let me start by reminding everyone that I have

Re: [OMPI devel] Making an embeddable libev

2007-12-20 Thread Terry Dontje
an impasse. I think both sides need to go to their respective corners, count to ten and then maybe the respective communities should consider trying to have two other people come together to discuss the issues and concerns. --td Terry Dontje - Senior Staff Engineer Sun Microsystems, Inc. Marc

Re: [OMPI devel] GROUP_EMPTY fixes break intel tests :-(

2007-12-06 Thread Terry Dontje
Jeff Squyres wrote: I should also note the following: - LAM/MPI does the same thing (increments refcount when GROUP_EMPTY is returned to the user, and allows GROUP_EMPTY in GROUP_FREE) - MPICH2 has the following comment in GROUP_FREE (and code to match): /* Cannot free the

Re: [OMPI devel] THREAD_MULTIPLE

2007-11-28 Thread Terry Dontje
If the guidelines are made for the BTLs Sun will handle the udapl btl. We can also help in testing too. --td George Bosilca wrote: Yes, "us" means UTK. Our math folks are pushing hard for this. I'll gladly accept any help, even if it's only for testing. For development, I dispose of some of

Re: [OMPI devel] THREAD_MULTIPLE

2007-11-28 Thread Terry Dontje
Jeff Squyres wrote: The MPICH guys presented TCP results with THREAD_MULTIPLE at Euro PVM/MPI and frankly, I was amazed that it worked at all. I seriously doubt that we're going to advance the state of threading on the 1.2 series (which is nowhere as close as it is on the 1.3 series).

[OMPI devel] vt integration

2007-11-28 Thread Terry Dontje
I haven't tried to debug the following but I am getting the following errors when building the vt-integration tmp branch on Solaris. So I don't think the branch is ready for putback yet. --td cc -DHAVE_CONFIG_H -I. -I..

[OMPI devel] OMPI Bug Status

2007-11-20 Thread Terry Dontje
At today's OMPI concall meeting it was decided some bug management was appropriate. In order to do this we would like the community to do the following two tasks: 1. Review all 1.2.X bugs under trac and send email to the devel alias as to the bugs you think should be fixed for 1.2.5 and beyond.

Re: [OMPI devel] initial SCTP BTL commit comments?

2007-11-14 Thread Terry Dontje
Brad Penoff wrote: On Nov 12, 2007 3:26 AM, Jeff Squyres wrote: I have no objections to bringing this into the trunk, but I agree that an .ompi_ignore is probably a good idea at first. I'll try to cook up a commit soon then! One question that I'd like to have

Re: [OMPI devel] accessors to context id and message id's

2007-11-08 Thread Terry Dontje
George Bosilca wrote: On Nov 6, 2007, at 8:38 AM, Terry Dontje wrote: George Bosilca wrote: If I understand correctly your question, then we don't need any extension. Each request has a unique ID (from PERUSE perspective). However, if I remember well this is only half implemented in our

[OMPI devel] accessors to context id and message id's

2007-11-05 Thread Terry Dontje
Currently in order to do message tracing one either has to rely on some error prone postprocessing of data or replicating some MPI internals up in the PMPI layer. It would help Sun's tools group (and I believe U Dresden also) if Open MPI would create a couple APIs that exposed the following:

Re: [OMPI devel] RFC: versioning OMPI libraries

2007-10-15 Thread Terry Dontje
Christian Bell wrote: On Mon, 15 Oct 2007, Brian Barrett wrote: No! :) It would be good for everyone to read the Libtool documentation to see why versioning on the release number would be a really bad idea. Then comment. But my opinion would be that you should change based on

Re: [OMPI devel] [RFC] change wrapper compilers from binaries to shell scripts

2007-10-12 Thread Terry Dontje
Will these new scripts be using the same wrapper config files as the binaries were? --td Richard Graham wrote: What: Change the mpicc/mpicxx/mpif77/mpif90 from being binaries to being shell scripts Why: Our build environment assumes that wrapper compilers will use the same binary format

Re: [OMPI devel] DDT for v1.2 branch

2007-10-10 Thread Terry Dontje
Jeff Squyres wrote: George has proposed to bring the DDT over from the trunk to the v1.2 branch before v1.2.5 in order to fix some pending bugs. What does this entail (ie does this affect the pml interface at all)? Also by saying "before v1.2.5" I am assuming you mean this fix is to be

Re: [OMPI devel] Message Queue debugging support for 1.2.4

2007-09-20 Thread Terry Dontje
a new 1.2 RC. thanks, --td Terry Dontje wrote: Nikolay and Community, Sorry to be so late in responding to your email but I've been working with Pak to determine whether my hasty decision as RM yesterday was hasty or not. To answer your question, we are still trying to determine

Re: [OMPI devel] Message Queue debugging support for 1.2.4

2007-09-19 Thread Terry Dontje
Nikolay and Community, Sorry to be so late in responding to your email but I've been working with Pak to determine whether my hasty decision as RM yesterday was hasty or not. To answer your question, we are still trying to determine if the message queue support can go in or not and the below

Re: [OMPI devel] Prep for 1.2.4 release

2007-09-19 Thread Terry Dontje
Jeff Squyres wrote: All organizations should review the README file (particularly the list of supported systems, etc.) to ensure that it is good-to-go and accurate for the 1.2.4. release. Tim posted 1.2.4rc1 yesterday: http://www.open-mpi.org/software/ompi/v1.2/ I would like to

Re: [OMPI devel] Which tests for larger cluster testing

2007-09-17 Thread Terry Dontje
less there's something specific I'm after, ie benchmarks or apps I'm using as a benchmark, rather than test suites. You might look at some of the purple benchmarks: http://www.llnl.gov/asci/platforms/purple/rfp/benchmarks/limited/code_list.html Andrew Terry Dontje wrote: What about Sandi

Re: [OMPI devel] Which tests for larger cluster testing

2007-09-17 Thread Terry Dontje
What about Sandia and LANL? Is there anything that is run on their large clusters to confirm things seem to work at high np's? --td Jeff Squyres wrote: Cisco is not yet testing that large, but we plan to shortly start testing at np>=128 (I'm waiting for an internal cluster within Cisco to

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Terry . Dontje
Li-Ta Lo wrote: On Thu, 2007-08-30 at 12:25 -0400, terry.don...@sun.com wrote: Li-Ta Lo wrote: On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote: hmmm, interesting since my version doesn't abort at all. Some problem with fortran compiler/language

Re: [OMPI devel] SM BTL hang issue

2007-08-30 Thread Terry . Dontje
Li-Ta Lo wrote: On Wed, 2007-08-29 at 14:06 -0400, Terry D. Dontje wrote: hmmm, interesting since my version doesn't abort at all. Some problem with fortran compiler/language binding? My C translation doesn't have any problem. [ollie@exponential ~]$ mpirun -np 4 a.out 10 Target
