Re: [OMPI devel] Commit r19868

Ralph Castain Fri, 31 Oct 2008 19:53:48 -0400

Crumby - referenced the wrong commit. My commit was r19866. My apologies to George, the author of r19868, which cleaned up a problem created by my commit.


Ralph


On Oct 31, 2008, at 5:50 PM, Ralph Castain wrote:

Hi all
I made a commit a little earlier containing modifications that reduce duplicate data storage and represent a first step towards supporting fully routed RML communications, along with a new "radix tree" routed component requested by ORNL. There will undoubtedly be improvements to these changes over the next few months, but they provide an initial platform for us to more thoroughly investigate the issues involved in fully routing all out-of-band communications.
A brief outline of the changes:
1. removes the direct routed component and adds a new "radix" component
2. shifts storage of nidmap and pidmap info from the odls to the ess on daemons - this is where the data is stored for everyone else, so it makes no sense to store it someplace different on the daemon. This required adding an API to the ess framework so that a pidmap can be added to the data in the ess when daemons get a comm_spawn request (the ess data store was already set up for this - it just didn't have the API yet).
3. adds an API to the ess framework to obtain the daemon that hosts a specified proc from the ess pidmap. Because this data is now obtained here, we don't need to keep calling orte_routed.update_route for every proc in our own job - so those calls have been removed from the startup procedure. This eliminates the hash tables in every routed module that essentially duplicated the pidmap already present in the ess - not because anyone was stupid, but rather because the first routed modules were originally written prior to the ess pidmap being created, and everyone copy/pasted from there.
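To make item 3 concrete, here is a minimal sketch of a proc-to-daemon lookup against a per-job pidmap. Everything named here (vpid_t, pidmap_t, pidmap_get_daemon, VPID_INVALID) is hypothetical and chosen for illustration; it is not the actual ess API or the ORTE data layout, only one plausible shape for "given a proc, return the daemon that hosts it".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the real ORTE definitions. */
typedef uint32_t vpid_t;
#define VPID_INVALID UINT32_MAX

/* One pidmap per job: entry i holds the vpid of the daemon hosting proc i. */
typedef struct {
    const vpid_t *daemon_of;  /* indexed by proc vpid */
    size_t        num_procs;  /* number of entries in daemon_of */
} pidmap_t;

/* Return the daemon hosting `proc`, or VPID_INVALID if the proc is unknown. */
static vpid_t pidmap_get_daemon(const pidmap_t *map, vpid_t proc)
{
    assert(map != NULL);
    if (proc >= map->num_procs) {
        return VPID_INVALID;
    }
    return map->daemon_of[proc];
}
```

With a single lookup like this available to everyone, a routed module would have no reason to keep its own per-proc hash table, which is the duplication the commit removes.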
At the moment, the revised trunk fully routes all communications with two exceptions:
1. the binomial module still directly routes between all daemons - i.e., communications don't flow along the tree, but instead short-circuit the tree to go directly to the daemon that hosts the target proc. I propose to change this in a later revision, but want to leave something constant for the moment.
2. all routed modules have daemons sending direct to the HNP itself. This was required for two reasons:
(a) during startup, the daemons need to "phone home", but have no knowledge at that moment of the contact info for the other daemons in the routing tree. Thus, they have no choice but to send direct to the HNP. We hope to change this in a later revision by switching to well-known static ports - but for now, we have to go direct.
(b) in our current shutdown procedure, the outbound message telling the orteds to terminate goes out across the module's routing tree. This xcast procedure causes each daemon to relay the cmd to the next daemons in the tree, and then to execute it. Thus, after relaying the cmd, the daemon dutifully terminates. However, we require each daemon to send a confirming message back to the HNP so it knows it can exit. That returning message cannot get through because the intermediate daemons have already terminated. I am working on alternative methods for detecting daemon termination so we can eliminate the return "ack" - but for now, we have to send the "ack" direct to the HNP to ensure it gets through.
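As a rough illustration of what "fully routed" means for a tree-based module, the sketch below picks the next hop for a message in a radix tree. It assumes a numbering in which the HNP is daemon 0 and the children of daemon n are n*radix+1 through n*radix+radix; that layout, and the names radix_parent and radix_next_hop, are assumptions made for this example and not a description of the committed radix component. The direct-to-HNP exception from item 2 is modeled as a special case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and numbering, for illustration only. */
typedef uint32_t vpid_t;
#define HNP_VPID 0u

/* Parent of a non-root daemon, assuming children of n are n*radix+1..n*radix+radix. */
static vpid_t radix_parent(vpid_t node, uint32_t radix)
{
    assert(node != HNP_VPID && radix > 0);
    return (node - 1) / radix;
}

/* Return the daemon that `self` should hand a message to so it reaches `target`. */
static vpid_t radix_next_hop(vpid_t self, vpid_t target, uint32_t radix)
{
    assert(radix > 0);
    if (target == self || target == HNP_VPID) {
        return target;  /* local delivery, or the direct-to-HNP exception */
    }
    /* Climb from the target toward the root, looking for ourselves. */
    for (vpid_t cur = target; cur != HNP_VPID; ) {
        vpid_t up = radix_parent(cur, radix);
        if (up == self) {
            return cur;  /* target is below us: forward to that child */
        }
        cur = up;
    }
    /* Target is not in our subtree: pass the message up to our parent. */
    return radix_parent(self, radix);
}
```

With radix 2, daemon 3 sending to daemon 6 would forward to its parent, daemon 1, which forwards to the HNP, which forwards to daemon 2, which delivers to daemon 6. A directly routed module would instead return `target` unconditionally.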
Some preliminary tests I've conducted indicate that fully routing communications has no detrimental impact on launch speed or IB wireup time. I plan to test this further at larger scales, and to continue developing the new capabilities.
Please let me know if you encounter any problems, or have any comments/suggestions.
Ralph
