Just to further clarify the clarification... ;-)

This condition has existed for the last several months, and the root problem dates back at least into the 1.1 series. In something like Jan or Feb this year, we chased the problem down to the iof_flush call in the odls when a process terminates, at which point we #if 0'd the iof_flush out of the code pending a resolution (tickets were filed, emails flew, phone calls ensued - it just took a while for people to have time to deal with it). The iof_flush is still "on" in 1.2 - it has just been turned "off" in the trunk for months.

[Actually, I did turn it back on briefly following r15390. Turned out the timing changed just enough to make it work most of the time with things that called orte_finalize, but always fail for programs that didn't, so we turned it back off again.]

So the problem of having clipped output has been around for quite some time. Since only Galen ever commented to me about being impacted by it, I gather nobody has really noticed. ;-)

Hopefully, we'll be able to turn it back on again soon.

On 7/18/07 6:02 AM, "Jeff Squyres" <jsquy...@cisco.com> wrote:

> BTW, the fix didn't occur over the weekend because of some merging
> issues.
>
> I also didn't explain the problem well; you may see some clipped
> output from your program or the orted may hang while everything is
> shutting down. This is especially likely to occur for very short
> applications.
>
> The problem is actually in the oob; the orted gets into a state where
> it's waiting for some IOF OOB callbacks to occur for messages that
> were already successfully sent, but the callbacks never occur due
> to... well, it's a long story. The IOF is basically spinning during
> the orted shutdown waiting for pending OOB callbacks that will never
> occur.
>
> I can explain in more detail if anyone cares, but hopefully Brian
> will be able to work the fix in within the next few days.
>
>
> On Jul 13, 2007, at 5:04 PM, Jeff Squyres wrote:
>
>> FYI: there is an issue on the OMPI trunk right now that the tail
>> end of output from applications may get clipped. The fix is coming
>> this weekend. If you care, I'll explain, but I just wanted to give
>> everyone heads up that if you see the tail end of your stdout/
>> stderr not show up, it's probably not your fault. :-)
>>
>> --
>> Jeff Squyres
>> Cisco Systems