[OMPI devel] Orte update

2007-07-12 Thread Ralph H Castain
Yo all, I have a fairly significant change coming to the orte part of the code base that will require an autogen (sorry). I'll check it in late this afternoon (can't do it at night as it is on my office desktop). The commit will fix the singleton operations, including singleton comm_spawn. It also

Re: [OMPI devel] Notes on building and running Open MPI on Red Storm

2007-07-12 Thread Brian Barrett
Do you have a Subversion account? If so, feel free to update the wiki ;). If not, we should probably get you an account. Then feel free to update the wiki ;). But thanks for the notes! Brian On Jul 11, 2007, at 4:47 PM, Glendenning, Lisa wrote: Some supplementary information to the wik

Re: [OMPI devel] Notes on building and running Open MPI on Red Storm

2007-07-12 Thread Brian Barrett
On Jul 11, 2007, at 4:47 PM, Glendenning, Lisa wrote: * When linking with libopen-pal, the following warning is normal: 'In function `checkpoint_response': warning: mkfifo is not implemented and will always fail' Josh - I thought the checkpoint code wasn't built unless requested. Anyway,

Re: [OMPI devel] Notes on building and running Open MPI on Red Storm

2007-07-12 Thread Joshua Hursey
Thanks for the heads up. I've noticed this warning on the Cray systems here at ORNL, and haven't had a chance to put the fix in yet. This function is exposed in non-CR builds as a user interface item. If the user requests a checkpoint of an MPI job that was not compiled with C/R (or doesn't

[OMPI devel] OpenIB BTL and SRQs

2007-07-12 Thread Don Kerr
Through MCA parameters one can select the use of shared receive queues in the openib btl. Other than having fewer queues, I am wondering what the benefits of using this option are. Can anyone elaborate on using them vs. the default?

Re: [OMPI devel] OpenIB BTL and SRQs

2007-07-12 Thread Galen Shipman
On Jul 12, 2007, at 10:29 AM, Don Kerr wrote: Through MCA parameters one can select the use of shared receive queues in the openib btl. Other than having fewer queues, I am wondering what the benefits of using this option are. Can anyone elaborate on using them vs. the default? In the trunk the

Re: [OMPI devel] OpenIB BTL and SRQs

2007-07-12 Thread Jeff Squyres
There's a few benefits: - Remember that you post a big pool of buffers instead of num_peers individual sets of receive buffers. Hence, if you post M buffers for each of N peers, each peer -- due to flow control -- can only have M outstanding sends at a time. So if you have apps sending lo
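
A rough sketch of the buffer arithmetic described in this message, not code from Open MPI. The function names and dictionary keys are hypothetical, and the shared-queue case assumes that a single peer may use the whole pool while the other peers are idle, which is a simplification.

```python
def per_peer_queues(m_buffers: int, n_peers: int) -> dict:
    """One set of M receive buffers per peer (the non-SRQ case)."""
    return {
        # M buffers are posted for each of the N peers.
        "buffers_posted": m_buffers * n_peers,
        # Flow control caps each peer at M outstanding sends.
        "max_outstanding_sends_per_peer": m_buffers,
    }


def shared_queue(pool_size: int) -> dict:
    """One pool of receive buffers shared by all peers (the SRQ case)."""
    return {
        "buffers_posted": pool_size,
        # Assumption: a busy peer can draw on the whole pool if others are idle.
        "max_outstanding_sends_per_peer": pool_size,
    }


if __name__ == "__main__":
    # Illustrative numbers only; they are not Open MPI defaults.
    print(per_peer_queues(m_buffers=8, n_peers=64))
    print(shared_queue(pool_size=128))
```

With these illustrative numbers, the per-peer layout posts 512 buffers yet limits any one peer to 8 outstanding sends, while the shared pool posts 128 buffers in total.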

Re: [OMPI devel] OpenIB BTL and SRQs

2007-07-12 Thread Don Kerr
Interesting. So with SRQs there is no flow control. I am guessing the btl sets some reasonable default but is essentially relying on the user to adjust other parameters so the buffers are not overrun. And yes, Galen, I would like to read your paper. Jeff Squyres wrote: There's a few benefits:

Re: [OMPI devel] OpenIB BTL and SRQs

2007-07-12 Thread Don Kerr
Jeff Squyres wrote: There's a few benefits: - Remember that you post a big pool of buffers instead of num_peers individual sets of receive buffers. Hence, if you post M buffers for each of N peers, each peer -- due to flow control -- can only have M outstanding sends at a time. So if

Re: [OMPI devel] OpenIB BTL and SRQs

2007-07-12 Thread Jeff Squyres
On Jul 12, 2007, at 1:18 PM, Don Kerr wrote: - So if you want to simply eliminate the flow control, choose M high enough (or just a total number of receive buffers to post to the SRQ) that you won't ever run out of resources and you should see some speedup from lack of flow control. This obviou

Re: [OMPI devel] Orte update

2007-07-12 Thread Ralph H Castain
Yo folks, several of us are stuck waiting for this commit to hit. Rather than wasting the next several hours, I'm going to make the commit now. So please be advised: if you do an update after this commit hits, you will need to autogen. You may want to wait until a convenient time before doing the

Re: [OMPI devel] Orte update

2007-07-12 Thread Ralph H Castain
The commit has been made - it is r15390. This commit restored the ability to execute singletons and singleton comm_spawn, both in single node and multi-node environments. It also includes a first step in our plan to reduce the ORTE system to the minimum functionality required to support Open MPI (

Re: [OMPI devel] [devel-core] Orte update

2007-07-12 Thread George Bosilca
We have the ODLS framework, which is supposed to launch local processes. Can we use it to spawn the local daemons? This will solve the Windows problem, and will give us a more consistent environment. george. On Jul 12, 2007, at 4:02 PM, Ralph H Castain wrote: The commit has be

Re: [OMPI devel] [devel-core] Orte update

2007-07-12 Thread Ralph H Castain
I don't think so - the decision to fork must come earlier, before that framework can be selected. At the time of the fork, we don't have access to very much in terms of services. You are welcome to look and see if you can find a way to do it. The fork/exec occurs in orte/mca/sds/base/sds_base_univ

Re: [OMPI devel] [devel-core] Orte update

2007-07-12 Thread Ralph H Castain
I should have expounded further... We actually looked at using the ODLS, and at creating a new opal_fork capability that could perhaps be shared between the ODLS and this point in the code. Unfortunately, neither option worked very well. In the ODLS case, we had to pepper it with if statements to

[OMPI devel] OMPI_FREE_LIST improvements

2007-07-12 Thread Galen Shipman
In working on my changes in the ib_multifrag branch I modified the ompi_free_list. The change enables a free list to have a bit more personality than what is dictated by the type of the item on the free list. The overall problem was that we often use different free list item types to simply

[OMPI devel] Major reduction in ORTE

2007-07-12 Thread Ralph H Castain
Yo all, as we are discussing functional requirements for the upcoming 1.3 release, I was asked to provide a little info about what is going to be happening to the ORTE part of the code base over the remainder of this year. Short answer: there will be a major code revision to reduce ORTE to the min

Re: [OMPI devel] [devel-core] Major reduction in ORTE

2007-07-12 Thread Jeff Squyres
Thanks for the summary Ralph. On Jul 12, 2007, at 5:04 PM, Ralph H Castain wrote: Yo all As we are discussing functional requirements for the upcoming 1.3 release, I was asked to provide a little info about what is going to be happening to the ORTE part of the code base over the remainder