FWIW: this has been "fixed" in PMIx/PRRTE and should make it into OMPI v5 if the OMPI community accepts it. The default behavior has been changed to output a full line at a time, so that the output from different ranks doesn't get mixed together. The negative to this, of course, is that we now internally buffer output until we see a newline character or the process terminates.

Since some applications really do need immediate output (and/or may not have newline characters in their output), I added a "raw" option to the "output" directive that matches the old behavior - i.e., any output from a proc is immediately staged for writing out regardless of whether or not it has a newline.

Ralph
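As a rough illustration of that line-at-a-time behavior, a minimal C sketch of the idea (this is not the actual PMIx/PRRTE code; the buffer size and names are made up):

```c
/* Sketch only: chunks of captured child output are appended to a
 * per-process buffer, and only complete lines are forwarded, so lines
 * from different ranks cannot interleave mid-line. */
#include <stdio.h>
#include <string.h>

#define BUFSZ 8192

struct linebuf {
    char data[BUFSZ];
    size_t len;
};

/* Append a chunk of captured output; emit only complete lines. */
static void emit_chunk(struct linebuf *lb, const char *chunk, size_t n, FILE *out)
{
    for (size_t i = 0; i < n; i++) {
        lb->data[lb->len++] = chunk[i];
        if (chunk[i] == '\n' || lb->len == BUFSZ) {
            fwrite(lb->data, 1, lb->len, out);  /* one write per line */
            fflush(out);
            lb->len = 0;
        }
    }
}

/* On process termination, flush whatever is left, newline or not. */
static void emit_rest(struct linebuf *lb, FILE *out)
{
    if (lb->len > 0) {
        fwrite(lb->data, 1, lb->len, out);
        fflush(out);
        lb->len = 0;
    }
}
```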
> On Dec 7, 2021, at 6:05 AM, Jeff Squyres (jsquyres) via users <users@lists.open-mpi.org> wrote:
>
> Open MPI launches a single "helper" process on each node (in Open MPI <= v4.x, that helper process is called "orted"). This process is responsible for launching all the individual MPI processes, and it's also responsible for capturing all the stdout/stderr from those processes and sending it back to mpirun via an out-of-band network message protocol (using TCP sockets). mpirun accepts those network messages and emits them to mpirun's stdout/stderr.
>
> There are multiple places in that pipeline where messages can get fragmented, and therefore emitted as incomplete lines (OS stdout/stderr buffering, network MTU size, TCP buffering, etc.).
>
> This is mainly because we have always assumed that stdout/stderr is not the primary work output of an MPI application. We've seen many MPI applications either write their results to stable files or send the results back to a single MPI process, which then gathers and emits them (i.e., if there's only stdout/stderr coming from a single MPI process, the output won't get interleaved with anything else).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
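A minimal C sketch of the funneling pattern Jeff describes above, where only one rank ever writes to stdout (the line length, message text, and values are illustrative, not from the thread):

```c
/* Each rank formats its own line; rank 0 gathers and prints them all,
 * so lines cannot be interleaved by the launcher's output pipeline. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define LINE_LEN 256   /* illustrative fixed line length */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char line[LINE_LEN];
    snprintf(line, sizeof(line), "rank %d: converged, residual %.3e\n",
             rank, 1.0e-6 / (rank + 1));

    char *all = NULL;
    if (rank == 0)
        all = malloc((size_t)size * LINE_LEN);

    /* Collect every rank's line on rank 0. */
    MPI_Gather(line, LINE_LEN, MPI_CHAR,
               all, LINE_LEN, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int r = 0; r < size; r++)
            fputs(all + (size_t)r * LINE_LEN, stdout);  /* lines stay intact */
        fflush(stdout);
        free(all);
    }

    MPI_Finalize();
    return 0;
}
```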
> ________________________________________
> From: users <users-boun...@lists.open-mpi.org> on behalf of Fisher (US), Mark S via users <users@lists.open-mpi.org>
> Sent: Monday, December 6, 2021 3:45 PM
> To: Joachim Protze; Open MPI Users
> Cc: Fisher (US), Mark S
> Subject: Re: [OMPI users] stdout scrambled in file
>
> This usually happens if we get a number of warning messages from multiple processes. Seems like unbuffered is what we want, but I am not sure how this interacts with MPI, since stdout/stderr is pulled back from different hosts. Not sure how you are doing that.
>
> -----Original Message-----
> From: Joachim Protze <pro...@itc.rwth-aachen.de>
> Sent: Monday, December 06, 2021 11:12 AM
> To: Fisher (US), Mark S <mark.s.fis...@boeing.com>; Open MPI Users <users@lists.open-mpi.org>
> Subject: Re: [OMPI users] stdout scrambled in file
>
> I would assume that the buffering mode is compiler/runtime specific. At least for the Intel compiler, the default seems to be/have been unbuffered for stdout, but there is a flag for buffered output:
>
> https://community.intel.com/t5/Intel-Fortran-Compiler/Enabling-buffered-I-O-to-stdout-with-Intel-ifort-compiler/td-p/993203
>
> In the worst case, each character might be written individually. If the scrambling only happens from time to time, I guess you really just see the buffer flush when the buffer fills up.
>
> - Joachim
>
> On 06.12.21 at 16:42, Fisher (US), Mark S wrote:
>> All strings are written as one output, so that is not the issue. Adding in some flushing is a good idea and we can try that. We do not open stdout, we just write to unit 6, but we could open it if there is some unbuffered option that could help. I will look into that also. Thanks!
>>
>> -----Original Message-----
>> From: Joachim Protze <pro...@itc.rwth-aachen.de>
>> Sent: Monday, December 6, 2021 9:24 AM
>> To: Open MPI Users <users@lists.open-mpi.org>
>> Cc: Fisher (US), Mark S <mark.s.fis...@boeing.com>
>> Subject: Re: [OMPI users] stdout scrambled in file
>>
>> Hi Mark,
>>
>> "[...] MPI makes neither requirements nor recommendations for the output [...]" (MPI 4.0, §2.9.1)
>>
>> From my experience, an application can avoid such scrambling (still no guarantee) if whole lines of output are written atomically. C++ streams are worst for concurrent output, as every stream operator writes a chunk. It can help to collect output into a stringstream and print it out at once. Using printf in C is typically least problematic. Flushing the buffer (fflush) helps to avoid the output buffer filling up and being flushed in the middle of printing.
>>
>> I'm not the Fortran expert, but I think there are some options to change to a buffered output mode (at least I found such options for file I/O). Again, the goal should be that a write statement is printed at once and the buffer doesn't fill up while printing.
>>
>> In any case, it could help to write warnings to stderr and separate the stdout and stderr streams.
>>
>> Best
>> Joachim
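A minimal C sketch of that "assemble the whole line, then write it once" advice, reusing the warning text from the example below (the function name and buffer size are illustrative):

```c
/* Build a complete warning line in a local buffer, then hand it to the
 * C runtime in a single call and flush, so the line cannot be split by
 * an intermediate buffer flush. */
#include <stdio.h>

static void warn_bad_pmin(int izon, int iter, int nbad)
{
    char line[256];

    /* Format the entire line first ... */
    snprintf(line, sizeof(line),
             "Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin: %d %d %d\n",
             izon, iter, nbad);

    /* ... then write it with one call and flush.  Warnings go to stderr,
     * keeping stdout for the convergence data. */
    fputs(line, stderr);
    fflush(stderr);
}

int main(void)
{
    warn_bad_pmin(699, 1625, 12);
    return 0;
}
```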
>> On 02.12.21 at 16:48, Fisher (US), Mark S via users wrote:
>>> We are using Mellanox HPC-X MPI based on OpenMPI 4.1.1RC1 and having issues with lines scrambling together occasionally. This causes issues for our convergence checking code, since we put convergence data there. We are not using any mpirun options for stdout; we just redirect stdout/stderr to a file before we run the mpirun command, so all output goes there. We had a similar issue with Intel MPI in the past and used -ordered-output to fix it, but I do not see any similar option for OpenMPI. See the example below. Is there any way to ensure a line from a process gets one line in the output file?
>>>
>>> *The data in red below is scrambled up and should look like the cleaned-up version. You can see it put a line from a different process inside a line from another process, and the rest of the line ended up a couple of lines down.*
>>>
>>> ZONE 0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10 1.004E-01 sec
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06 aerosurfs
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05 aerosurfs
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06 Aircraft-Total
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05 Aircr Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin: 699 1625 12
>>> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin: 111 1626 6
>>> aft-Total
>>> *IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04 -4.961E-06 Aircraft-OML
>>> *IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04 -2.791E-05 Aircraft-OML
>>>
>>> *Cleaned up version:*
>>>
>>> ZONE 0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10 1.004E-01 sec
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06 aerosurfs
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05 aerosurfs
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06 Aircraft-Total
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05 Aircraft-Total
>>> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin: 699 1625 12
>>> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin: 111 1626 6
>>> *IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04 -4.961E-06 Aircraft-OML
>>> *IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04 -2.791E-05 Aircraft-OML
>>>
>>> Thanks!
>>
>
> --
> Dr. rer. nat. Joachim Protze
>
> IT Center
> Group: High Performance Computing
> Division: Computational Science and Engineering
> RWTH Aachen University
> Seffenter Weg 23
> D 52074 Aachen (Germany)
> Tel: +49 241 80-24765
> Fax: +49 241 80-624765
> pro...@itc.rwth-aachen.de
> www.itc.rwth-aachen.de