[OMPI devel] Changes: opal_output and opal_show_help

Jeff Squyres Fri, 9 May 2008 17:52:20 -0400

Per the teleconf this week, Ralph and I worked up two new features that we're nearly ready to put back in the trunk:

1. IBM+LANL needed a way to XML-ize all output that comes out of OMPI so that 3rd party tools can parse and use it intelligently (e.g., the PTP debugger can now distinguish between OMPI error messages and stderr from the MPI app).

2. In order to do #1, we created separate logical channels (vs. just throwing everything in stderr and letting IOF relay it back to the HNP) for the following:

   - stdout/stderr from the MPI app
   - opal_show_help() messages (***)
   - opal_output*() messages (***)
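To make the channel idea concrete, here is a minimal sketch of how output might be tagged per channel before it is emitted as XML. The tag names, the channel_t enum, and xml_wrap() are all hypothetical and made up for illustration; this is not the actual OMPI code or its XML format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical channel identifiers, one per logical output stream. */
typedef enum { CH_APP_STDOUT, CH_APP_STDERR, CH_SHOW_HELP, CH_OUTPUT } channel_t;

/* Hypothetical tag names; the real XML format may differ. */
static const char *channel_tag(channel_t ch)
{
    switch (ch) {
    case CH_APP_STDOUT: return "stdout";
    case CH_APP_STDERR: return "stderr";
    case CH_SHOW_HELP:  return "show_help";
    case CH_OUTPUT:     return "output";
    }
    return "unknown";
}

/* Append src to the NUL-terminated string in dst (capacity cap),
 * escaping the XML special characters &, < and >.
 * Returns 0 on success, -1 if the result would not fit. */
static int append_escaped(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };
        const char *rep;
        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  rep = one;     break;
        }
        size_t n = strlen(rep);
        if (used + n + 1 > cap) {
            return -1;
        }
        memcpy(dst + used, rep, n + 1);  /* copies the trailing NUL too */
        used += n;
    }
    return 0;
}

/* Wrap text in an element named after its channel.
 * Returns 0 on success, -1 if out is too small. */
int xml_wrap(channel_t ch, const char *text, char *out, size_t cap)
{
    assert(text != NULL && out != NULL);
    const char *tag = channel_tag(ch);

    int n = snprintf(out, cap, "<%s>", tag);
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    if (append_escaped(out, cap, text) != 0) {
        return -1;
    }
    size_t used = strlen(out);
    n = snprintf(out + used, cap - used, "</%s>", tag);
    if (n < 0 || (size_t)n >= cap - used) {
        return -1;
    }
    return 0;
}
```

With something like this, a tool reading the stream can tell app stderr apart from an OMPI message by the enclosing element instead of guessing from the text.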

As a side effect, we now filter show_help() messages and only print them *once* at the HNP (this has been a very long-standing goal of mine). So if your MPI app barfs, you will no longer see the same show_help() error message N times -- you'll see it only once, possibly accompanied with a "...and we got the same error message from N other processes" notice.
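A rough sketch of the duplicate filtering described above, assuming each help message is identified by a (file, topic) pair. The help_filter_* names, the fixed-size table, and the key format are all hypothetical and chosen for illustration; the real HNP-side code may work quite differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOPICS 64
#define MAX_KEY    128

/* One entry per distinct help message seen so far. */
typedef struct {
    char key[MAX_KEY];  /* "file:topic"; truncated if too long */
    int  count;         /* how many processes reported it */
} help_entry_t;

typedef struct {
    help_entry_t entries[MAX_TOPICS];
    int          num_entries;
} help_filter_t;

void help_filter_init(help_filter_t *f)
{
    memset(f, 0, sizeof(*f));
}

/* Record one report of (file, topic).
 * Returns 1 the first time the message is seen (caller should print it),
 * 0 for a duplicate (caller should stay quiet), -1 if the table is full. */
int help_filter_report(help_filter_t *f, const char *file, const char *topic)
{
    char key[MAX_KEY];
    assert(f != NULL && file != NULL && topic != NULL);
    snprintf(key, sizeof(key), "%s:%s", file, topic);

    for (int i = 0; i < f->num_entries; i++) {
        if (strcmp(f->entries[i].key, key) == 0) {
            f->entries[i].count++;
            return 0;
        }
    }
    if (f->num_entries == MAX_TOPICS) {
        return -1;
    }
    strcpy(f->entries[f->num_entries].key, key);  /* key fits: same size */
    f->entries[f->num_entries].count = 1;
    f->num_entries++;
    return 1;
}

/* Number of *other* processes that sent the same message,
 * i.e. the N in "...from N other processes". Returns 0 if unknown. */
int help_filter_others(const help_filter_t *f, const char *file, const char *topic)
{
    char key[MAX_KEY];
    snprintf(key, sizeof(key), "%s:%s", file, topic);
    for (int i = 0; i < f->num_entries; i++) {
        if (strcmp(f->entries[i].key, key) == 0) {
            return f->entries[i].count - 1;
        }
    }
    return 0;
}
```

The idea is that the HNP prints a message only when help_filter_report() returns 1, and later uses help_filter_others() to print the "N other processes" notice.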

(***) To make both #1 and #2 work, we had to raise the abstraction level. That is, there had to be job-level intelligence about the different kinds of output. So we have created orte_output() (and friends) and orte_show_help(). The OPAL variants still exist, but they *SHOULD NOT BE USED* by the MPI layer. Specifically, the OPAL variants are for what OPAL does best: single process stuff. The ORTE variants provide the job-level intelligence, such as duplicate show_help filtering, relaying to the HNP in a different channel than IOF, etc.

So when this stuff hits the trunk, you'll see a ton of s/opal_output/orte_output/g and s/opal_show_help/orte_show_help/g changes throughout the code base. Do not be alarmed. :-)


--
Jeff Squyres
Cisco Systems
