Yo all

There has been a bit of discussion about this on the core developers list and on telecons, but I felt that perhaps I should provide a more detailed warning to the broader developer community.

In the next few weeks, there will be some major revisions submitted to the Open MPI trunk on the OpenRTE (ORTE) side of the code base. These will primarily address three known issues:

1. Scalability - the test code that was run on Sandia's Thunderbird cluster a few weeks ago utilized a stage gate and trigger to help speed up launch of the OpenRTE daemons on backend nodes. In addition, some code cleanup occurred in the TM launcher. These improvements yielded a positive result, and they will be brought over to the trunk with these changes.

2. MAD (MPI_Abort Disease) - we have encountered a problem whereby daemons are left "spinning" wildly when MPI processes call MPI_Abort. This is symptomatic of a circular logic loop that has crept into the abort handling section of the OpenRTE code base. These changes will resolve that problem.

3. Daemon timeout on start - currently, we will wait forever for all daemons to start because we have no way to detect that they failed in some environments. We are adding a timeout mechanism (adjustable via MCA param, of course) that will allow orte/mpirun to give up after some period of time.

As part of these revisions, I am working to bring the code base another step closer to OpenRTE 2.0 compatibility. As a result, some of the changes may appear unnecessary in terms of fixing the three issues noted above. I apologize in advance for that, but beg your indulgence as these changes will make eventual integration with 2.0 a little easier.

The upcoming revisions will involve changes to the RDS, RAS, PLS, ERRMGR, RMGR, and SMR frameworks in the form of API changes. Most of these changes are not massive, but impact a number of places in the code. However, significant change will occur in several places:

1. the ERRMGR will see significant change in actual behavior as we clarify its role. New components to differentiate behavior between head node processes (HNPs), daemons (our orteds), and application processes are being created.

2. communications to the OpenRTE daemons (orted's) will no longer take place via individual frameworks but will be concentrated through the existing orted non-blocking receive function. This will help us break the circular logic loop and (hopefully) avoid re-creating it in the future.

3. the PLS "fork" component really was the orted's private launcher for local processes. It has been moved to the orted's directory and renamed to indicate that fact. Although there were good reasons to do this before, it could not previously be done due to the built-in calls to the PLS - however, with the new clarification of roles, this can now be cleanly done.

4. ALL resource management functionality has been constrained to the HNP. Non-HNP processes (including orteds and application processes) solely communicate their requests back to the HNP for execution. In addition, in accordance with the OpenRTE 2.0 design, all resource management frameworks (i.e., RDS, RAS, RMAPS, and PLS) are now publicly available (i.e., not just through the RMGR).

5. the RMAPS framework has been changed to support multiple concurrent mapping components, and a parameter added to the "map" API so the caller can specify which one should be used for this specific map command.

For those of you with components in the affected frameworks, I am going through them and making changes to keep them compatible with the revisions. Again, these aren't major, but will require some checking to ensure they are correct, especially for those components that will not compile unless in a specific environment.

I hope to complete this work in the next 2-3 weeks. The work is taking place on the tmp/mad branch - those of you with access to it are welcome to keep track of what I am doing.

Prior to committing this massive a change to the trunk, I will be performing testing on various platforms. I will also be contacting key people with access to platforms beyond my domain to ask their help in testing the branch in those environments.
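
For anyone who wants a concrete picture of the "daemon timeout on start" item listed among the known issues, here is a minimal, language-neutral sketch in Python. It is not ORTE code: the function name, the queue of check-in reports, and the timeout argument are all hypothetical stand-ins, and the real change will be in C with the timeout read from an MCA parameter whose name has not been announced. It only shows the general idea of waiting for every daemon to report in while keeping an overall deadline, so the launcher can give up instead of blocking forever.

```python
import queue
import time


def wait_for_daemons(expected, reports, timeout_s):
    """Wait until every daemon in `expected` has reported, or give up.

    `expected`  - iterable of daemon names we launched (hypothetical)
    `reports`   - queue.Queue on which daemons' check-ins arrive (hypothetical)
    `timeout_s` - overall time budget in seconds (stand-in for an MCA param)

    Returns the set of daemons that never reported (empty on success).
    """
    pending = set(expected)
    deadline = time.monotonic() + timeout_s
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # budget exhausted: stop waiting instead of hanging
        try:
            name = reports.get(timeout=remaining)
        except queue.Empty:
            break  # nothing arrived before the deadline
        pending.discard(name)
    return pending
```

A caller would treat a non-empty return value as a failed launch and could report which nodes never checked in before aborting the job.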

And yes - I WILL send out a note alerting people to the upcoming commit prior to throwing it into the trunk!

Feel free to contact me with any comments, suggestions, or concerns.

Ralph