Hi folks

There is a PR that has cleared Jenkins, but it represents a fairly significant 
change in OMPI capabilities. Thus, I think it merits a little more attention.

The PR (https://github.com/open-mpi/ompi/pull/1767) brings the PMIx event 
notification system into OMPI. Quoting from the PMIx RFC:

===============================
The PMIx Event Notification system provides a mechanism by which the resource 
manager can communicate system events to applications, thus providing 
applications with an opportunity to generate an appropriate response. In 
addition, applications can use the system to request that the resource manager 
notify their peers of internal events (e.g., computational errors and aborted 
operations), and notify the resource manager of events detected by the 
application.

The resource manager will be aware of a wide range of events that occur across 
the system. For the purposes of this discussion, only events that impact the 
allocated session being served by the PMIx server are considered. These events 
can be divided into two distinct classes:

* Job-specific events that directly relate to a job executing within the
  session. This might include events such as debugger attachment or process 
failure within a related job. These events are characterized by directly 
targeting processes within session jobs - i.e., the "procs" parameter of the 
notification contains members of a job executing within the session. Events in 
this category are to be immediately delivered to the PMIx server library for 
delivery to the specified processes.

  Clients can indicate a desire to register solely for job-specific events by 
including the _PMIX\_EVENT\_JOB\_LEVEL_ key in their call to 
_PMIx\_Register\_event_ - i.e., providing this key will explicitly indicate 
that environment events are _not_ to be reported to this callback function.

* Environment events that impact the session, but are not directly sent to
  executing jobs. This is a much broader category of events that includes ECC 
errors, temperature excursions, and other environmental events directly 
affecting the session's resources. Note that although these do impact the 
session's jobs, they are not directly referencing those jobs - i.e., the event 
is generated without specifying a particular target. Thus, events in this 
category are to be delivered to the PMIx server library only upon request - 
i.e., when the PMIx server has registered for those events.

Note that race conditions can cause the registration to come _after_ events of 
possible interest (e.g., a memory ECC event that occurs after start of 
execution but prior to registration). RMs are free to cache events in this 
category for some time to mitigate this situation, but are not required to do 
so. Thus, applications must be aware that environment events prior to 
registration may not be included in notifications.

As above, clients can indicate a desire to register solely for environment 
events of a given type by including the _PMIX\_EVENT\_ENVIRO\_LEVEL_ key in their 
registration call.

The PMIx server will cache any environment events passed to it for a period of 
time to provide notification to clients that have not yet registered for them. 
Currently, the PMIx server uses a ring buffer to cache events. The size of the 
ring buffer defaults to 512 events (as of PMIx 2.0), but can be configured 
using the _PMIx\_server\_cache\_size_ info key during the call to the 
_PMIx\_Server\_init_ API.

Client application processes can also use the PMIx Event Notification system to 
request that the resource manager notify their peers of internal events, and 
notify the resource manager of events detected by the application process. 
Examples of the latter include network communication errors that may not have 
been detected by the fabric manager itself (e.g., data corruption). The client 
must direct the notification to the appropriate target (RM or peers) using the 
corresponding range parameter.
===============================
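
To make the caching behavior in the quoted text a bit more concrete, here is a 
minimal sketch in plain Python. It does not use the PMIx API: the EventCache 
class and its notify/register methods are hypothetical names chosen for 
illustration, and replaying cached events at registration time is my reading of 
the RFC, not a statement about the actual server code. It models a fixed-size 
ring buffer that delivers cached environment events to a handler that registers 
late, and shows that the oldest events are gone once the buffer wraps.

```python
from collections import deque


class EventCache:
    """Toy model of a server-side event cache (hypothetical, not PMIx code)."""

    def __init__(self, size=512):
        # A deque with maxlen drops its oldest entry once full,
        # which is the behavior of a ring buffer.
        self._ring = deque(maxlen=size)
        self._handlers = []

    def notify(self, event):
        """Cache an environment event and deliver it to current registrants."""
        self._ring.append(event)
        for handler in self._handlers:
            handler(event)

    def register(self, handler):
        """Register a handler, first replaying whatever is still cached."""
        for event in self._ring:
            handler(event)
        self._handlers.append(handler)


# Use a tiny buffer so the wrap-around is easy to see.
cache = EventCache(size=3)
for code in ("ecc-1", "temp-2", "ecc-3", "temp-4"):
    cache.notify(code)

seen = []
cache.register(seen.append)  # late registration: replays the cached events
cache.notify("ecc-5")        # delivered live to the registered handler
print(seen)  # ['temp-2', 'ecc-3', 'temp-4', 'ecc-5']
```

Note that "ecc-1" never reaches the late registrant: it was pushed out of the 
buffer before registration, which is the race the RFC warns about.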

The biggest change for OMPI is that it enables you to register event handlers 
for specific error constants - e.g., for knowing when debugger release has been 
issued. What you do in response to that notification is totally up to you, and 
we do “chain” the handlers (and pass the output of one down to the following 
handlers).
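
As a rough illustration of what "chaining" means here, the following plain 
Python sketch calls a list of handlers in order and hands each one the output 
of the handler before it. The names run_chain, log_event, and release_debugger 
are hypothetical, and calling handlers in list order is an assumption made for 
the example; none of this is the OMPI or PMIx interface.

```python
def run_chain(handlers, event):
    """Call each handler in order, passing along the previous handler's output."""
    result = None
    for handler in handlers:
        result = handler(event, result)
    return result


def log_event(event, previous):
    # First in the chain: nothing ran before it, so previous is None.
    return [f"logged {event}"]


def release_debugger(event, previous):
    # Receives the output of log_event and extends it.
    return previous + [f"handled {event}"]


print(run_chain([log_event, release_debugger], "debugger-release"))
# ['logged debugger-release', 'handled debugger-release']
```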

This should not be considered a cast-in-concrete capability - it will evolve as 
folks start to use it. However, we believe the interfaces should now be stable 
and ready for use.

The changes include:

* upgrading the base PMIx installation to 2.0.0a1, tracking (but lagging) the 
PMIx master
* creating a PMIx 1.1.4-specific external component for backward compatibility
* adding a PMIx 2.x-specific external component for those wanting to build 
directly against the PMIx master
* converting debugger support to use PMIx instead of RML for release. Note that 
the OOB/usock component remains for show_help support until the upcoming 
PMIx_Log interface is available.

Please provide any comments or concerns. I’m planning to “hold” this PR a bit 
while we resolve the OMPI 2.0 issues.

Ralph

