Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-06 Thread Eugene Loh
George Bosilca wrote: I got some free time (yeh haw) and took a look at the OB1 PML in order to fix the issue. I think I found the problem, as I'm unable to reproduce this error. Sorry, this sentence has me baffled. Are you unable to reproduce the problem before the fixes or afterwards? T

Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-06 Thread Eugene Loh
ilure.   PML fix can be done later (IMHO) On Sat, Apr 4, 2009 at 1:46 AM, Eugene Loh <eugene@sun.com> wrote: What's next on this ticket?  It's supposed to be a blocker.  Again, the issue is that osu_bw deluges a receiver with rendezvous messages, but the receiver does not ha

Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-03 Thread Eugene Loh
re's less headroom to grow the free lists. Possible fixes are: A) Just make the mmap file default size larger (though less overkill than we used to have). B) Fix the PML code that is supposed to deal with cases like this. (At least I think the PML has code that's intended for this pu

[OMPI devel] access to tests

2009-04-03 Thread Eugene Loh
Do I need to buy someone a beer to get access to the test suites? [eloh@milliways]$ svn co https://svn.open-mpi.org/svn/ompi/trunk [... successful ...] [eloh@milliways]$ svn co https://svn.open-mpi.org/svn/ompi-tests/trunk/intel_tests svn: PROPFIND request failed on '/svn/ompi-tests/trunk/intel

[OMPI devel] event library

2009-04-03 Thread Eugene Loh
What is the purpose of the event library? I'd happily RTFM if someone could point me in that direction! :^) I'm guessing it's to check occasionally for "unexpected events", but if someone could confirm/deny and flesh that picture out a little, I'd appreciate it.

Re: [OMPI devel] [OMPI users] Open MPI 2009 released

2009-04-02 Thread Eugene Loh
Ah. George, you should have thought about that. I understand your eagerness to share this exciting news, but perhaps an April-1st announcement detracted from the seriousness of this grand development. Here's another desirable MPI feature. People talk about "error detection/correction". We

[OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-01 Thread Eugene Loh
In osu_bw, process 0 pumps lots of Isend's to process 1, and process 1 in turn sets up lots of matching Irecvs. Many messages are in flight. The question is what happens when resources are exhausted and OMPI cannot handle so much in-flight traffic. Let's specifically consider the case of lon

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: On Mar 31, 2009, at 3:06 PM, Eugene Loh wrote: The thing I was wondering about was memory barriers. E.g., you initialize stuff and then post the FIFO pointer. The other guy sees the FIFO pointer before the initialized memory. We do do memory barriers during that SM
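
The ordering concern raised here (initialize the structure, then post the pointer, with the reader never seeing the pointer first) can be illustrated with a minimal sketch. It uses C11 atomics as a stand-in, not the project's own barrier primitives, and every name in it (sketch_fifo_t, sketch_post_fifo, sketch_lookup_fifo) is hypothetical rather than taken from the sm BTL.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a FIFO control structure. */
typedef struct {
    int capacity;
    int head;
    int tail;
} sketch_fifo_t;

static sketch_fifo_t sketch_storage;
/* The "posted" pointer other parties poll; static storage starts as NULL. */
static _Atomic(sketch_fifo_t *) sketch_posted;

/* Writer: fill in the structure first, then publish the pointer.
 * The release store keeps the initializing writes from being
 * reordered after the pointer becomes visible. */
static void sketch_post_fifo(int capacity)
{
    sketch_storage.capacity = capacity;
    sketch_storage.head = 0;
    sketch_storage.tail = 0;
    atomic_store_explicit(&sketch_posted, &sketch_storage,
                          memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
 * non-NULL result implies the initialized fields are visible too. */
static sketch_fifo_t *sketch_lookup_fifo(void)
{
    return atomic_load_explicit(&sketch_posted, memory_order_acquire);
}
```

A reader would poll sketch_lookup_fifo() until it returns non-NULL and only then dereference the result. The sketch runs within one process; applying the same pattern across processes over a shared mapping is an assumption that would need checking on each platform.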

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: On Mar 31, 2009, at 1:46 AM, Eugene Loh wrote: > FWIW, George found what looks like a race condition in the sm init > code today -- it looks like we don't call maffinity anywhere in the > sm btl startup, so we're not actually guaranteed that the memory

Re: [OMPI devel] SM init failures

2009-03-31 Thread Eugene Loh
Jeff Squyres wrote: FWIW, George found what looks like a race condition in the sm init code today -- it looks like we don't call maffinity anywhere in the sm btl startup, so we're not actually guaranteed that the memory is local to any particular process(or) (!). This race shouldn't cause

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Jeff Squyres wrote: On Mar 30, 2009, at 1:40 PM, Patrick Geoffray wrote: > we will have to find a > pretty smart way to do this or we will completely break the memory > affinity stuff. I didn't look at the code, but I sure hope that the SM init code does touch each page to force allocation,

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Tim Mattox wrote: I think I remember setting up the MTT tests on Sif so that tests are run both with and without the coll_hierarch component selected. The coll_hierarch component stresses code paths and potential race conditions in its own way. So, if the problems are showing up more frequently

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Patrick Geoffray wrote: Jeff Squyres wrote: Why not? The "owning" process can do the touch; then it'll be affinity'ed properly. Right? Yes, that's what I meant by forcing allocation. From the thread, it looked like nobody touched the pages of the mapped file. If it's already done, no nee

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
Jeff Squyres wrote: It's half done, actually. But it was still going to be an option, not necessarily the only way to do it: http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/shm-sysv/ On Mar 30, 2009, at 1:40 PM, Tim Mattox wrote: I've been lurking on this conversation, and I am again

Re: [OMPI devel] SM init failures

2009-03-30 Thread Eugene Loh
George Bosilca wrote: Then it looks like the safest solution is to use either ftruncate or the lseek method and then touch the first byte of all memory pages. Unfortunately, I see two problems with this. First, there is a clear performance hit on the startup time. And second, we will have to
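
The "size the file, then touch the first byte of every page" approach described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the sm BTL or mpool code: the helper name create_and_touch and the temporary file path are hypothetical, and only the ftruncate variant is shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a backing file of `len` bytes, map it
 * shared, and write the first byte of every page so that the pages are
 * allocated now rather than at first use.
 * Returns the number of pages touched, or -1 on any failure. */
static long create_and_touch(size_t len)
{
    char path[] = "/tmp/sm_sketch_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                 /* file persists until fd and mapping go away */

    if (ftruncate(fd, (off_t)len) != 0) {    /* extend the file to `len` */
        close(fd);
        return -1;
    }

    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED) {
        return -1;
    }

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        munmap(map, len);
        return -1;
    }

    volatile unsigned char *base = map;
    long touched = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        base[off] = 0;            /* first byte of each page */
        touched++;
    }

    munmap(map, len);
    return touched;
}
```

The loop is where the startup cost mentioned in the message comes from: it is linear in the size of the mapping. Which process performs the touch also matters for memory affinity, as other messages in this thread point out; the sketch does not address that.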

Re: [OMPI devel] SM init failures

2009-03-27 Thread Eugene Loh
Paul H. Hargrove wrote: Quoting from a different manpage for ftruncate: [T]he POSIX standard allows two behaviours for ftruncate when length exceeds the file length [...]: either returning an error, or extending the file. So, if that is to be trusted, it is not legal by PO

Re: [OMPI devel] SM init failures

2009-03-27 Thread Eugene Loh
Josh Hursey wrote: Sif is also running the coll_hierarch component on some of those tests which has caused some additional problems. I don't know if that is related or not. Indeed. Many of the MTT stack traces (for both 1.3.1 and 1.3.2 and that have seg faults and call out mca_btl_sm.so)

Re: [OMPI devel] SM init failures

2009-03-27 Thread Eugene Loh
Ralph Castain wrote: You are correct - the Sun errors are in a version prior to the insertion of the SM changes. We didn't relabel the version to 1.3.2 until -after- those changes went in, so you have to look for anything with an r number >= 20839. The sif errors are all in that group - I

Re: [OMPI devel] SM init failures

2009-03-26 Thread Eugene Loh
Ralph Castain wrote: It looks like the SM revisions we inserted into 1.3.2 are a great detector for shared memory init failures - it segfaulted 143 times last night on IU's sif computer, 34 times on Sun/Linux, and 3 times on Sun/SunOS...almost every single time due to "Address not mapped"

Re: [OMPI devel] SM init failures

2009-03-26 Thread Eugene Loh
Ralph Castain wrote: Hi folks Er, perhaps pronounced "Eugene". :^( It looks like the SM revisions we inserted into 1.3.2 are a great detector for shared memory init failures How delicately put! I appreciate the gentleness. - it segfaulted 143 times last night on IU's sif computer, 34

Re: [OMPI devel] 1.3.1rc5

2009-03-23 Thread Eugene Loh
Jeff Squyres wrote: Looks good to cisco. Ship it. I'm still seeing a very low incidence of the sm segv during startup (.01% -- 23 tests out of ~160k), so let's ship 1.3.1 and roll in Eugene's new sm code for 1.3.2. For what it's worth, I just ran a start-up test... "main() {MPI_Init();M

Re: [OMPI devel] 1.3.1rc5

2009-03-20 Thread Eugene Loh
Jeff Squyres wrote: Looks good to cisco. Ship it. I'm still seeing a very low incidence of the sm segv during startup (.01% -- 23 tests out of ~160k), so let's ship 1.3.1 and roll in Eugene's new sm code for 1.3.2. I wanted to join in the fun, but... no go. I'm running an "MPI_Init()"

Re: [OMPI devel] OMPI vs Scali performance comparisons

2009-03-18 Thread Eugene Loh
tigate, in order to see if we can have the same bandwidth as they do or not. Are you suggesting bumping up the btl_sm_max_send_size value from 32K to something greater? On Mar 17, 2009, at 18:23 , Eugene Loh wrote: A colleague of mine ran some microkernels on an 8-way Barcelona box (Sun

Re: [OMPI devel] OMPI vs Scali performance comparisons

2009-03-17 Thread Eugene Loh
Jeff Squyres (jsquyres) wrote: I still think that the pml fast path fixes would be good. As do I. Again, I think one needs to go to the BTL sendi as soon as possible after entering the PML, which raised those thorny discuss

[OMPI devel] OMPI vs Scali performance comparisons

2009-03-17 Thread Eugene Loh
A colleague of mine ran some microkernels on an 8-way Barcelona box (Sun x2200M2 at 2.3 GHz). Here are some performance comparisons with Scali. The performance tests are modified versions of the HPCC pingpong tests. The OMPI version is the trunk with my "single-queue" fixes... otherwise, OMP

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Eugene Loh
Ethan Mallove wrote: Can this error happen on any test? Presumably yes if two or more processes are on the same node. What do these tests have in common? They all try to start. :^) The problem is in MPI_Init. It almost looks like the problem is more likely to occur if MPI_UB or MPI_L

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Eugene Loh
Ralph Castain wrote: Could be nobody is saying anything...but I would be surprised if -nobody- barked at a segfault during startup. Well, if it segfaulted during startup, someone's first reaction would probably be, "Oh really?" They would try again, have success, attribute to cosmic rays,

Re: [OMPI devel] 1.3.1 -- bad MTT from Cisco

2009-03-11 Thread Eugene Loh
Ralph Castain wrote: Hey Jeff I seem to recall seeing the identical problem reported on the user list not long ago...or may have been the devel list. Anyway, it was during btl_sm_add_procs, and the code was segv'ing. I don't have the archives handy here, but perhaps you might search the

Re: [OMPI devel] trunk problem for large-SMP startup?

2009-03-05 Thread Eugene Loh
Ralph Castain wrote: I just ran a 64ppn job without problem. Couple of possibilities come to mind: 1. you might have some stale lib around - try blowing things away and rebuilding 2. there may be a problem in your specific situation. Can you provide some info on what you are doing (e.g.

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-04 Thread Eugene Loh
Brian W. Barrett wrote: How about removing the MCA parameter from my earlier proposal and just having r2 filter out the sendi calls if there are multiple BTLs with heterogeneous BTLs (ie, some with sendi and some without) to the same peer. That way, the early sendi will be bypassed in that ca

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-04 Thread Eugene Loh
George Bosilca wrote: On Mar 4, 2009, at 14:44 , Eugene Loh wrote: Let me try another thought here. Why do we have BTL sendi functions at all? I'll make an assertion and would appreciate feedback: a BTL sendi function contributes nothing to optimizing send latency. To optimize

[OMPI devel] trunk problem for large-SMP startup?

2009-03-04 Thread Eugene Loh
I have a problem starting large SMP jobs (e.g., 64 processes on a single SMP) that might be related to a recent trunk change. (Guessing.) Does the following ring any bells? ... ... ... [burl-t5440-0:06798] [[57827,1],42] ORTE_ERROR_LOG: Not found in file ess_env_module.c at line 299 [burl-t5

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-04 Thread Eugene Loh
Jeff Squyres wrote: On Mar 3, 2009, at 4:04 PM, Eugene Loh wrote: How about an MCA parameter to switch between this mechanism (early sendi) and the original behavior (late sendi)? This is the usual way that we resolve "I want to do X / I want to do Y" disputes. :-) I see

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-03 Thread Eugene Loh
Jeff Squyres wrote: How about an MCA parameter to switch between this mechanism (early sendi) and the original behavior (late sendi)? This is the usual way that we resolve "I want to do X / I want to do Y" disputes. :-) I see the smiley face, but am unsure how much of the message to appl

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-03 Thread Eugene Loh
Brian W. Barrett wrote: On Tue, 3 Mar 2009, Eugene Loh wrote: First, this behavior is basically what I was proposing and what George didn't feel comfortable with. It is arguably no compromise at all. (Uggh, why must I be so honest?) For eager messages, it favors BTLs with sendi func

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-03 Thread Eugene Loh
Terry Dontje wrote: Eugene Loh wrote: I'm on the verge of giving up moving the sendi call in the PML. I will try one or two last things, including this e-mail asking for feedback. The idea is that when a BTL goes over a very low-latency interconnect (like sm), we really want to shav

Re: [OMPI devel] calling sendi earlier in the PML

2009-03-03 Thread Eugene Loh
Jeff Squyres wrote: How about a compromise... Keep a separate list somewhere of the sendi-enabled BTLs (this avoids looping over all the btl's and testing -- you can just loop over the btl's that you *know* have a sendi). Put that at the top of the PML and avoid the costly overhead, yadd

[OMPI devel] calling sendi earlier in the PML

2009-03-02 Thread Eugene Loh
I'm on the verge of giving up moving the sendi call in the PML. I will try one or two last things, including this e-mail asking for feedback. The idea is that when a BTL goes over a very low-latency interconnect (like sm), we really want to shave off whatever we can from the software stack.

[OMPI devel] PML Start error?

2009-02-27 Thread Eugene Loh
I'm looking at pml_ob1_start.c. It loops over requests and starts them. It makes some decision about whether an old request can be reused or if a new one must be allocated/initialized. So, there is a variable named reuse_old_request. It's initialized to "true", but if a new request must be

Re: [OMPI devel] mca_btl_sm_sendi question

2009-02-25 Thread Eugene Loh
George Bosilca wrote: On Feb 24, 2009, at 18:08 , Eugene Loh wrote: (Probably this message only for George, but I'll toss it out to the alias/archive.) Actually, maybe Rich should weigh in here, too. This relates to the overflow mechanism in MCA_BTL_SM_FIFO_WRITE. I have a que

[OMPI devel] mca_btl_sm_sendi question

2009-02-24 Thread Eugene Loh
(Probably this message only for George, but I'll toss it out to the alias/archive.) I have a question about the sm sendi() function. What should happen if the sendi() function attempts to write to the FIFO, but the FIFO is full? Currently, it appears that the sendi() function returns an erro

Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function

2009-02-24 Thread Eugene Loh
George Bosilca wrote: Here is another way to write the code without having to pay the expensive initialization of sendreq. first_time = 0; for ( btl = ... ) { if ( SUCCESS == sendi() ) return SUCCESS; if( 0 == first_time++) set_up_expensive_send_request(&sendreq); i
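
The idea in the quoted fragment, paying for the send request only after the first sendi attempt has failed, can be sketched in self-contained form. The quoted code is cut off, so what follows the set-up step (falling back to a regular send on the same BTL) is an assumption, and every type and function name below is a hypothetical stand-in rather than the PML or BTL interface.

```c
#include <stddef.h>

enum { SKETCH_SUCCESS = 0, SKETCH_ERROR = -1 };

/* Stand-in for the bulky PML send request. */
typedef struct { int initialized; } sketch_sendreq_t;

/* Stand-in for a BTL: an immediate-send attempt and a regular send. */
typedef struct {
    int (*sendi)(void);
    int (*send)(const sketch_sendreq_t *req);
} sketch_btl_t;

/* Trivial stubs so the sketch can be exercised. */
static int sketch_sendi_ok(void)   { return SKETCH_SUCCESS; }
static int sketch_sendi_fail(void) { return SKETCH_ERROR; }
static int sketch_send_ok(const sketch_sendreq_t *req)
{
    return req->initialized ? SKETCH_SUCCESS : SKETCH_ERROR;
}
static int sketch_send_fail(const sketch_sendreq_t *req)
{
    (void)req;
    return SKETCH_ERROR;
}

/* Try sendi on each BTL; build the request at most once, and only
 * after some sendi has failed. *setup_calls counts request set-ups. */
static int sketch_send(const sketch_btl_t *btls, size_t nbtls,
                       int *setup_calls)
{
    sketch_sendreq_t req = { 0 };
    int have_req = 0;

    for (size_t i = 0; i < nbtls; i++) {
        if (SKETCH_SUCCESS == btls[i].sendi()) {
            return SKETCH_SUCCESS;          /* fast path: no request built */
        }
        if (!have_req) {
            req.initialized = 1;            /* the "expensive" set-up */
            have_req = 1;
            (*setup_calls)++;
        }
        if (SKETCH_SUCCESS == btls[i].send(&req)) {
            return SKETCH_SUCCESS;
        }
    }
    return SKETCH_ERROR;
}
```

With a single BTL whose sendi succeeds, the set-up counter stays at zero; with two BTLs whose sendi calls both fail, the request is still built only once.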

Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function

2009-02-23 Thread Eugene Loh
Eugene Loh wrote: Actually, there may be a more important issue here. Currently, the PML chooses the BTL first. Once the BTL choice is established, only then does the PML choose between sendi and send. Currently, it's also the case that we're spending a lot of time in the P

Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function

2009-02-23 Thread Eugene Loh
uld succeed? Or, does my proposed change provide the justification for my pulling descriptor allocations out of the sendi functions? Further comments (of less importance) below: George Bosilca wrote: On Feb 23, 2009, at 12:14 , Eugene Loh wrote: George Bosilca wrote: It doesn't soun

Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function

2009-02-23 Thread Eugene Loh
I'm a newbie and George is a veteran. So, this feels rather like David and Goliath. (Hmm, David won and became king. Gee, I kinda like that.) Anyhow... George Bosilca wrote: It doesn't sound reasonable to me. There is a reason for this, and I think it's a good reason. The sendi function

Re: [OMPI devel] RFC: eliminating "descriptor" argument from sendi function

2009-02-23 Thread Eugene Loh
now who it is? Why, then, should the PML have to tell it? So, how about removing the BTL argument as well? Jeff Squyres wrote: Sounds reasonable to me. George / Brian? On Feb 21, 2009, at 2:11 AM, Eugene Loh wrote: What: Eliminate the "descriptor" argument from sendi functions.

[OMPI devel] sendi side effects in the case of failure

2009-02-21 Thread Eugene Loh
I'm still trying to understand what side effects there are if a sendi function fails. So far as I can tell, there are no written contracts/specs about what should happen (please tell me if that's wrong), so it's a matter of looking at the code. The only BTLs with sendi code are portals, mx, a

[OMPI devel] RFC: eliminating "descriptor" argument from sendi function

2009-02-21 Thread Eugene Loh
What: Eliminate the "descriptor" argument from sendi functions. Why: The only thing this argument is used for is so that the sendi function can allocate a descriptor in the event that the "send" cannot complete. But, in that case, the sendi reverts to the PML, where there is already code to

Re: [OMPI devel] workspace management question

2009-02-19 Thread Eugene Loh
gnore directives have to be modified. (I'm assuming here.) If someone wants to get fancy, let them blaze their own trail. On Feb 19, 2009, at 12:37 PM, Eugene Loh wrote: Eugene Loh wrote: Okay, thanks for all the feedback. New version is at: https://svn.open-mpi.org/trac/ompi/w

Re: [OMPI devel] workspace management question

2009-02-19 Thread Eugene Loh
Terry Dontje wrote: Eugene Loh wrote: Okay, thanks for all the feedback. New version is at: https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial#Developmentcycle If everyone is happy with that, I'll remove the old version, along with the diagram. So I like the new text much b

Re: [OMPI devel] workspace management question

2009-02-19 Thread Eugene Loh
Okay, thanks for all the feedback. New version is at: https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial#Developmentcycle If everyone is happy with that, I'll remove the old version, along with the diagram. Jeff Squyres wrote: Here's what I typically run to bring down changes from SVN

Re: [OMPI devel] workspace management question

2009-02-19 Thread Eugene Loh
Terry Dontje wrote: Ralph Castain wrote: On Feb 19, 2009, at 5:39 AM, Terry Dontje wrote: Eugene Loh wrote: Jeff Squyres wrote: Here's what I typically run to bring down changes from SVN to HG: # Ensure all the latest hg repo changes are in the working dir hg up # Bring in all th

[OMPI devel] sm BTL question: frag alloc

2009-02-17 Thread Eugene Loh
(Rich:  same question as I asked you in private e-mail.) Should the first fragment of a message be an eager fragment even when the message is long and a rendezvous protocol is employed? So far as I can tell, a long MPI_Send starts like this:    MPI_Send()    mca_pml_ob1_send()    mca_pml_o

Re: [OMPI devel] workspace management question

2009-02-17 Thread Eugene Loh
Jeff Squyres wrote: Here's what I typically run to bring down changes from SVN to HG: # Ensure all the latest hg repo changes are in the working dir hg up # Bring in all the SVN changes svn up # Refresh the .hgignore file (may change due to the svn up) ./contrib/hg/build-hgignore.pl # Add / rem

Re: [OMPI devel] workspace management question

2009-02-17 Thread Eugene Loh
svn.open-mpi.org/trac/ompi/wiki/UsingMercurial On Feb 17, 2009, at 12:36 PM, Eugene Loh wrote: Let's say I have a combo SVN/HG workspace. Let's say someone makes changes to the trunk. I guess I bring those over to my combo workspace with "svn up". Yes? How then do I make the HG side of the combo repository see those updates?

[OMPI devel] workspace management question

2009-02-17 Thread Eugene Loh
Let's say I have a combo SVN/HG workspace. Let's say someone makes changes to the trunk. I guess I bring those over to my combo workspace with "svn up". Yes? How then do I make the HG side of the combo repository see those updates?

[OMPI devel] sm latency putback

2009-02-17 Thread Eugene Loh
I think I just did my first putback to the trunk. God help us all! It's r20578 and feedback (e.g., "you broke everything") is appreciated, gentle feedback even more so. I had claimed at the in-person meeting last week that the "single queue" approach showed no appreciable performance regress

Re: [OMPI devel] Announcing searchable OMPI source code tree

2009-02-14 Thread Eugene Loh
Jeff Squyres wrote: Indiana U. has added another service to the Open MPI web site: a fully indexed and searchable database of Open MPI source code trees. There's a link under "Source Code Access" entitled "Searchable source tree" on the OMPI web site that takes you to https://svn.open-mpi

Re: [OMPI devel] svn commit

2009-02-13 Thread Eugene Loh
Ralph Castain wrote: Once you have them in the hg repo, you can do an "svn st" to see if you need to do anything further before committing back to the svn repo - e.g., add or remove files. When you are ready, just do an "svn ci" to commit your changes to the svn repo. Thanks, but I get:

[OMPI devel] svn commit

2009-02-13 Thread Eugene Loh
I'm having trouble figuring out how to put my changes back to the trunk. I've been looking at the wiki pages, but don't really see the one last piece that I need of this puzzle. I've used https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial to get me through these steps: svn check-out of

Re: [OMPI devel] RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h

2009-02-13 Thread Eugene Loh
George Bosilca wrote: I can't confirm or deny. The only thing I can tell is that the same test works fine over other BTL, so this tends either to pinpoint a problem in the sm BTL or in a particular path in the PML (the one used by the sm BTL). I'll have to dig a little bit more into it, but

Re: [OMPI devel] RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h

2009-02-13 Thread Eugene Loh
hings. On Feb 12, 2009, at 3:58 PM, Eugene Loh wrote: Sorry, what's the connection? Are we talking about https://svn.open-mpi.org/trac/ompi/ticket/1791 ? Are you simply saying that if I'm doing some sm BTL work, I should also look at 1791? I'm trying to figure out if there

Re: [OMPI devel] RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h

2009-02-12 Thread Eugene Loh
Ralph Castain wrote: You might want to look at ticket #1791 while you are doing this - Brad added some valuable data earlier today. On Feb 12, 2009, at 12:13 PM, Eugene Loh wrote: Jeff Squyres wrote: This should probably include the disclaimer that we talked about this extensively yester

Re: [OMPI devel] RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h

2009-02-12 Thread Eugene Loh
- longer-necessary kruft. On Feb 12, 2009, at 8:53 AM, Eugene Loh wrote: RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h WHAT: Eliminate those two include files. WHY: These include files are only used by the sm BTL. They are not generally usable. Further, the sm BTL will soon no

[OMPI devel] RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h

2009-02-12 Thread Eugene Loh
RFC: Eliminate ompi/class/ompi_[circular_buffer_]fifo.h WHAT: Eliminate those two include files. WHY: These include files are only used by the sm BTL. They are not generally usable. Further, the sm BTL will soon no longer use them. The current FIFOs support only a single sender each and we

[OMPI devel] add_procs

2009-02-05 Thread Eugene Loh
BTLs have "add_procs" functions. E.g., my own parochial interests are with the sm BTL and there is a mca_btl_sm_add_procs() function. I'm trying to get a feel for how likely it is that this function would be called more than once. There is code in there to support the case where it's called

[OMPI devel] "unknown" in-coming fragment in sm BTL

2009-02-05 Thread Eugene Loh
In btl_sm_component.c, mca_btl_sm_component_progress() polls on FIFOs. If it gets something, it has a "switch" statement with cases for send fragments, returned fragments (ACKs) to be returned to the freelist, and default/unknown. What's that default/unknown case about? What behavior should

Re: [OMPI devel] RFC: [slightly] Optimize Fortran MPI_SEND / MPI_RECV

2009-02-04 Thread Eugene Loh
Jeff Squyres wrote: WHAT: Have Fortran MPI_SEND/MPI_RECV directly call the corresponding PML functions instead of the C MPI_Send/MPI_Recv WHY: Slightly optimize the blocking send/receive in Fortran (i.e., remove a function call) WHERE: ompi/mpi/f77/*.c -- possibly add an --enable switch t

Re: [OMPI devel] BTL/sm meeting on Wed after Forum

2009-01-27 Thread Eugene Loh
Jeff Squyres wrote: On the call today, we decided that the main (only) topic for the Wednesday OMPI meeting after Forum will be BTL and sm issues. This is a complex set of topics that could take a few hours to discuss. - Eugene has proposed a number of sm BTL changes, some of which are no

Re: [OMPI devel] RFC: sm Latency

2009-01-22 Thread Eugene Loh
Richard Graham wrote: In the recvi function, do you first try to match off the unexpected list before you try and match data in the fifo’s? Within the proposed approach, a variety of things are possible. Within the specific code I've put back so far, I

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Eugene Loh
Ron Brightwell wrote: If you poll only the queue that correspond to a posted receive, you only optimize micro-benchmarks, until they start using ANY_SOURCE. Note that the HPCC RandomAccess benchmark only uses MPI_ANY_SOURCE (and MPI_ANY_TAG). But HPCC RandomAccess also jus

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Eugene Loh
Patrick Geoffray wrote: Eugene Loh wrote: Possibly, you meant to ask how one does directed polling with a wildcard source MPI_ANY_SOURCE. If that was your question, the answer is we punt. We report failure to the ULP, which reverts to the standard code path. Sorry, I meant ANY_SOURCE

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Eugene Loh
Patrick Geoffray wrote: Eugene Loh wrote: To recap: 1) The work is already done. How do you do "directed polling" with ANY_TAG ? Not sure I understand the question. So, maybe we start by being explicit about what we mean by "directed polling". C

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Eugene Loh
Richard Graham wrote: On 1/20/09 8:53 PM, "Jeff Squyres" wrote: Eugene: you mentioned that there are other possibilities to having the BTL understand match headers, such as a callback into the PML. Have you tried this approach to see what the performance cost would be, perchance?

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Eugene Loh
Richard Graham wrote: On 1/20/09 2:08 PM, "Eugene Loh" <eugene@sun.com> wrote: Richard Graham wrote: First, the performance improvements look really nice. A few questions: - How much o

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Eugene Loh
Brian Barrett wrote: I unfortunately don't have time to look in depth at the patch. But my concern is that currently (today, not at some made up time in the future, maybe), we use the BTLs for more than just MPI point-to-point. The rdma one-sided component (which was added for 1.3 and h

Re: [OMPI devel] RFC: sm Latency

2009-01-20 Thread Eugene Loh
Patrick Geoffray wrote: > Eugene Loh wrote: >>> replace the fifo’s with a single link list per process in shared memory, with senders to this process adding match envelopes atomically, with each process reading its own link list (multiple
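
The suggestion quoted in this message (one linked list per receiving process, with senders adding entries atomically and the owner reading its own list) could be sketched roughly as below. This is an illustration with C11 atomics and ordinary pointers inside a single address space; a real shared-memory version would presumably need offsets instead of pointers, and all names here are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node; `envelope` stands in for a match envelope. */
typedef struct sketch_node {
    struct sketch_node *next;
    int envelope;
} sketch_node_t;

/* One inbox per receiving process. */
typedef struct {
    _Atomic(sketch_node_t *) head;
} sketch_inbox_t;

/* Sender side: lock-free push. Any number of senders may call this
 * concurrently; the compare-exchange retries if another sender won. */
static void sketch_push(sketch_inbox_t *box, sketch_node_t *n)
{
    sketch_node_t *old = atomic_load_explicit(&box->head,
                                              memory_order_relaxed);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 &box->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Receiver side: detach the whole list in one atomic step, then
 * reverse it so entries come back in the order they were pushed. */
static sketch_node_t *sketch_drain(sketch_inbox_t *box)
{
    sketch_node_t *n = atomic_exchange_explicit(
        &box->head, (sketch_node_t *)NULL, memory_order_acquire);
    sketch_node_t *fifo = NULL;

    while (n != NULL) {
        sketch_node_t *next = n->next;
        n->next = fifo;
        fifo = n;
        n = next;
    }
    return fifo;
}
```

Because the receiver takes the entire list at once rather than popping single nodes, the sketch sidesteps the usual ABA hazard of lock-free stacks; that is a design choice of this illustration, not something stated in the thread.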

Re: [OMPI devel] RFC: sm Latency

2009-01-20 Thread Eugene Loh
Richard Graham wrote: First, the performance improvements look really nice. A few questions: - How much of an abstraction violation does this introduce? Doesn't need to be much of an abstraction violation at all if, by that, we mean teaching the BTL about

[OMPI devel] RFC: sm Latency

2009-01-17 Thread Eugene Loh
RFC: sm Latency WHAT: Introducing optimizations to reduce ping-pong latencies over the sm BTL. WHY: This is a visible benchmark of MPI performance. We can improve shared-memory latencies from 30% (if hardware latency is the limiting factor) to 2× or more (if MPI s

Re: [OMPI devel] RFC: Eliminate opal_round_up_to_nearest_pow2()

2009-01-15 Thread Eugene Loh
2009, at 5:01 PM, George Bosilca wrote: Absolutely! Why wait until the 1.4 while we can have that in the 1.3.1... On Jan 15, 2009, at 16:39 , Eugene Loh wrote: I don't know what scope of changes require RFCs, but here's a trivial change. ==

[OMPI devel] RFC: Eliminate opal_round_up_to_nearest_pow2()

2009-01-15 Thread Eugene Loh
I don't know what scope of changes require RFCs, but here's a trivial change. == RFC: Eliminate opal_round_up_to_nearest_pow2(). WHAT: Eliminate the function opal_round_up_to_nearest_pow2(). WHY: It's poorly written. A clean rewrite would take
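
For illustration only, a compact round-up-to-a-power-of-two helper might look like the sketch below. It assumes the intended meaning is "smallest power of two greater than or equal to n"; the name round_up_pow2 is hypothetical and this is not the OMPI function or the rewrite the RFC alludes to.

```c
#include <stddef.h>

/* Hypothetical helper: smallest power of two >= n.
 * Returns 1 for n == 0, and 0 if the result would not fit in size_t. */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n) {
        if (p > (size_t)-1 / 2) {
            return 0;             /* doubling again would overflow */
        }
        p <<= 1;
    }
    return p;
}
```

For example, round_up_pow2(3) yields 4 and round_up_pow2(4) stays at 4; the loop runs at most once per bit of size_t.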

[OMPI devel] RFC: Fragmented sm Allocations

2009-01-15 Thread Eugene Loh
I put back more code changes and refreshed the RFC a little. So, if you want a latest/greatest copy, here is the (slightly) amended RFC. Thanks for the positive feedback so far, but more scrutiny is appreciated! RFC: Fragmented sm Allocations WHAT: D

Re: [OMPI devel] autosizing the shared memory backing file

2009-01-14 Thread Eugene Loh
eed - if all you need is how many procs are local, that can be obtained fairly easily. Be happy to contribute to the chat, if it would be helpful. On Jan 14, 2009, at 7:43 AM, Jeff Squyres wrote: Would it be useful to get on the phone and discuss this stuff? On Jan 14, 2009, at 1:11 AM, Eugene L

[OMPI devel] RFC: Fragmented sm Allocations

2009-01-14 Thread Eugene Loh
RFC: Fragmented sm Allocations WHAT: Dealing with the fragmented allocations of sm BTL FIFO circular buffers (CB) during MPI_Init(). Also: Improve handling of error codes. Automate the sizing of the mmap file. WHY: To reduce consumption of sha

Re: [OMPI devel] autosizing the shared memory backing file

2009-01-14 Thread Eugene Loh
lobal variable or something in the mca/common. In other words there is no way for you to call from the mpool a function from the sm BTL. On Jan 13, 2009, at 19:22 , Eugene Loh wrote: With the sm BTL, there is a file that each process mmaps in for shared memory. I'm trying to get

[OMPI devel] autosizing the shared memory backing file

2009-01-13 Thread Eugene Loh
With the sm BTL, there is a file that each process mmaps in for shared memory. I'm trying to get mpool_sm to size the file appropriately. So, I would like mpool_sm to call some mca_btl_sm function that provides a good guess of the size. (mpool_sm creates and mmaps the file, but the size dep

Re: [OMPI devel] size of shared-memory backing file + maffinity

2009-01-13 Thread Eugene Loh
them for a grand total of 512 MB of shared space. Does that explain my concern any better? On Mon, Jan 12, 2009 at 10:02 PM, Eugene Loh wrote: I'm trying to understand how much shared memory is allocated when maffinity is on. The sm BTL sets up a file that is mmapped into each

[OMPI devel] size of shared-memory backing file + maffinity

2009-01-12 Thread Eugene Loh
I'm trying to understand how much shared memory is allocated when maffinity is on. The sm BTL sets up a file that is mmapped into each local process's address space so that the processes on a node can communicate via shared memory. Actually, when maffinity indicates that there are multiple "

Re: [OMPI devel] sm BTL "extra procs"

2008-12-23 Thread Eugene Loh
ll the required synchronization, we allocated extra memory up front - for dynamic process control. Since this has never been enabled, we really don't need this extra memory. On 12/22/08 11:47 AM, "Eugene Loh" wrote: Why does the sm BTL allocate "extra procs"? E.g., ht

[OMPI devel] sm BTL "extra procs"

2008-12-22 Thread Eugene Loh
Why does the sm BTL allocate "extra procs"? E.g., https://svn.open-mpi.org/trac/ompi/browser/branches/v1.3/ompi/mca/btl/sm/btl_sm.c?version=19785#L403 In particular: *) sm_max_procs is -1 (so there is no max) *) sm_sm_extra_procs (sic, this is the ompi_info name) is 2 So, if there are n procs

Re: [OMPI devel] shared-memory allocations

2008-12-21 Thread Eugene Loh
Richard Graham wrote: It does not make a difference who allocates it, what makes a difference is who touches it first. Fair enough, but the process that allocates it right away starts to initialize it.  So, each circular buffer is set up (alloc

[OMPI devel] fast path MPI_Sendrecv

2008-12-21 Thread Eugene Loh
I've been looking at a "fast path" for sends and receives. This is like the sendi function, which attempts to send "immediately", without creating a bulky PML send request (which would be needed if, say, the send had to be progressed over multiple user MPI calls). One can do something similar

Re: [OMPI devel] shared-memory allocations

2008-12-12 Thread Eugene Loh
Richard Graham wrote: The memory allocation is intended to take into account that two separate procs may be touching the same memory, so the intent is to reduce cache conflicts (false sharing) Got it.  I'm totally fine with that.  Separate cacheli

[OMPI devel] BML problem?

2008-12-11 Thread Eugene Loh
I'm not exactly sure where the fix to this should be, but I think I've found a problem. Consider, for illustration, launching a multi-process job on a single node. The function mca_bml_r2_add_procs() calls mca_btl_sm_add_procs() Each process could conceivably return a different valu

[OMPI devel] shared-memory allocations

2008-12-10 Thread Eugene Loh
For shared memory communications, each on-node connection (non-self, sender-receiver pair) gets a circular buffer during MPI_Init(). Each CB requires the following allocations: *) ompi_cb_fifo_wrapper_t (roughly 64 bytes) *) ompi_cb_fifo_ctl_t head (roughly 12 bytes) *) ompi_cb_fifo_ctl_t tail

Re: [OMPI devel] Preparations for moving the btl's

2008-12-04 Thread Eugene Loh
Richard Graham wrote: I expect this will involve some sort of well defined interface between the btl’s and orte, and I don’t know if this will also require something like this between the btl’s and the pml – I think that interface is rigid

[OMPI devel] make dependency problem?

2008-11-29 Thread Eugene Loh
I was playing with OMPI and I noticed that if I modified btl.h, bml_r2.c did not automatically get rebuilt, even though it includes btl.h. This caused me all sorts of unnecessary debugging troubles. In the end, just touching bml_r2.c was enough... it caused bml_r2.c to be recompiled and to se

Re: [OMPI devel] SM backing file size

2008-11-15 Thread Eugene Loh
Ralph Castain wrote: I probably wasn't clear - see below On Nov 14, 2008, at 6:31 PM, Eugene Loh wrote: Ralph Castain wrote: I have two examples so far: 1. using a ramdisk, /tmp was set to 10MB. OMPI was run on a single node, 2ppn, with btl=openib,sm,self. The program started

Re: [OMPI devel] SM backing file size

2008-11-14 Thread Eugene Loh
Ralph Castain wrote: I have two examples so far: 1. using a ramdisk, /tmp was set to 10MB. OMPI was run on a single node, 2ppn, with btl=openib,sm,self. The program started, but segfaulted on the first MPI_Send. No warnings were printed. Interesting. So far as I can tell, the actual memo

Re: [OMPI devel] SM backing file size

2008-11-14 Thread Eugene Loh
Ralph Castain wrote: I too am interested - I think we need to do something about the sm backing file situation as larger core machines are slated to become more prevalent shortly. I think there is at least one piece of low-flying fruit: get rid of a lot of the page alignments. Especially
