FWIW: this has been "fixed" in PMIx/PRRTE and should make it into OMPI v5 if 
the OMPI community accepts it. The default behavior has been changed to output 
a full line at a time so that the output from different ranks doesn't get mixed 
together. The negative to this, of course, is that we now internally buffer 
output until we see a newline character or the process terminates.

Since some applications really do need immediate output (and/or may not have 
newline characters in the output), I added a "raw" option to the "output" 
directive that matches the old behavior - i.e., any output from a proc is 
immediately staged for writing out regardless of whether or not it has a 
newline.
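
For illustration only (the exact directive spelling may vary between releases, 
so check "mpirun --help" on your installation; "./my_app" is just a placeholder), 
the invocation would look something like:

    mpirun --output raw -np 4 ./my_app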

Ralph


> On Dec 7, 2021, at 6:05 AM, Jeff Squyres (jsquyres) via users 
> <users@lists.open-mpi.org> wrote:
> 
> Open MPI launches a single "helper" process on each node (in Open MPI <= 
> v4.x, that helper process is called "orted").  This process is responsible 
> for launching all the individual MPI processes, and it's also responsible for 
> capturing all the stdout/stderr from those processes and sending it back to 
> mpirun via an out-of-band network message protocol (using TCP sockets).  
> mpirun accepts those network messages and emits them to mpirun's 
> stdout/stderr.
> 
> There are multiple places in that pipeline where messages can get fragmented, 
> and therefore emitted as incomplete lines (OS stdout/stderr buffering, 
> network MTU size, TCP buffering, etc.).
> 
> This is mainly because we have always assumed that stdout/stderr is not the 
> primary work output of an MPI application.  We've seen many MPI applications 
> either write their results to output files or send the results back to a 
> single MPI process, which then gathers and emits them (i.e., if there's only 
> stdout/stderr coming from a single MPI process, the output won't get 
> interleaved with anything else).
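> 
> [Editorial sketch of that gather-and-print pattern, added for illustration; 
> the names and the fixed 128-byte line length are invented, not from any 
> application in this thread. Each rank formats its result into a private 
> buffer, and only rank 0 ever writes to stdout, so lines cannot interleave:]
> 
>   #include <mpi.h>
>   #include <stdio.h>
>   #include <stdlib.h>
> 
>   #define LINE_LEN 128   /* fixed per-rank line length, illustrative only */
> 
>   int main(int argc, char **argv)
>   {
>       int rank, size;
>       MPI_Init(&argc, &argv);
>       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>       MPI_Comm_size(MPI_COMM_WORLD, &size);
> 
>       /* Each rank formats its "result" into a local, NUL-terminated buffer... */
>       char line[LINE_LEN] = {0};
>       snprintf(line, sizeof(line), "rank %d: residual = %e\n", rank, 1.0 / (rank + 1));
> 
>       /* ...and the lines are collected on rank 0, which does all the printing. */
>       char *all = NULL;
>       if (rank == 0)
>           all = malloc((size_t)size * LINE_LEN);
> 
>       MPI_Gather(line, LINE_LEN, MPI_CHAR, all, LINE_LEN, MPI_CHAR, 0, MPI_COMM_WORLD);
> 
>       if (rank == 0) {
>           for (int i = 0; i < size; i++)
>               fputs(all + (size_t)i * LINE_LEN, stdout);
>           free(all);
>       }
> 
>       MPI_Finalize();
>       return 0;
>   }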
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> 
> ________________________________________
> From: users <users-boun...@lists.open-mpi.org> on behalf of Fisher (US), Mark 
> S via users <users@lists.open-mpi.org>
> Sent: Monday, December 6, 2021 3:45 PM
> To: Joachim Protze; Open MPI Users
> Cc: Fisher (US), Mark S
> Subject: Re: [OMPI users] stdout scrambled in file
> 
> This usually happens when we get a number of warning messages from multiple 
> processes. It seems like unbuffered output is what we want, but I am not sure 
> how this interacts with MPI, since stdout/stderr is pulled back from different 
> hosts. Not sure how you are doing that.
> 
> -----Original Message-----
> From: Joachim Protze <pro...@itc.rwth-aachen.de>
> Sent: Monday, December 06, 2021 11:12 AM
> To: Fisher (US), Mark S <mark.s.fis...@boeing.com>; Open MPI Users 
> <users@lists.open-mpi.org>
> Subject: Re: [OMPI users] stdout scrambled in file
> 
> I would assume that the buffering mode is compiler/runtime specific. At
> least for the Intel compiler, the default seems to be (or have been) unbuffered
> for stdout, but there is a flag for buffered output:
> 
> https://community.intel.com/t5/Intel-Fortran-Compiler/Enabling-buffered-I-O-to-stdout-with-Intel-ifort-compiler/td-p/993203
> 
> In the worst case, each character might be written individually. If the
> scrambling only happens from time to time, I guess you are really just seeing
> the buffer flush when the buffer fills up.
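> 
> [Editorial sketch, not from the original message: for C code, or the C side of
> a mixed-language code, flush-at-newline behavior can be requested explicitly
> with standard setvbuf; whether this helps a Fortran program depends on what the
> Fortran runtime does underneath:]
> 
>   #include <stdio.h>
> 
>   int main(void)
>   {
>       /* Ask the C library to flush stdout at every newline instead of
>          whenever its internal buffer happens to fill up. */
>       static char buf[BUFSIZ];
>       setvbuf(stdout, buf, _IOLBF, sizeof(buf));
> 
>       printf("one complete line\n");       /* flushed at the newline */
>       printf("another complete line\n");   /* and again here */
>       return 0;
>   }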
> 
> - Joachim
> 
> Am 06.12.21 um 16:42 schrieb Fisher (US), Mark S:
>> All strings are written as one output, so that is not the issue. Adding in 
>> some flushing is a good idea and we can try that. We do not open stdout, we 
>> just write to unit 6, but we could open it if there is some un-buffered option 
>> that could help. I will look into that also.  Thanks!
>> 
>> -----Original Message-----
>> From: Joachim Protze <pro...@itc.rwth-aachen.de>
>> Sent: Monday, December 6, 2021 9:24 AM
>> To: Open MPI Users <users@lists.open-mpi.org>
>> Cc: Fisher (US), Mark S <mark.s.fis...@boeing.com>
>> Subject: Re: [OMPI users] stdout scrambled in file
>> 
>> Hi Mark,
>> 
>> "[...] MPI makes neither requirements nor recommendations for the output
>> [...]" (MPI4.0, ยง2.9.1)
>> 
>> From my experience, an application can avoid such scrambling (still no
>> guarantee) if each line of output is written atomically. C++ streams
>> are worst for concurrent output, as every stream operator writes a
>> chunk. It can help to collect the output into a stringstream and print it
>> out at once. Using printf in C is typically the least problematic. Flushing
>> the buffer (fflush) helps to avoid the output buffer filling up and being
>> flushed in the middle of printing.
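>> 
>> [Editorial sketch of that advice, added for illustration; the function name
>> and format string are invented. Build the complete line in a private buffer,
>> hand it to the C library in a single call, then flush:]
>> 
>>   #include <stdio.h>
>> 
>>   /* Emit one complete line "atomically" from this process's point of view. */
>>   static void print_line(int iter, double residual, const char *name)
>>   {
>>       char line[256];
>>       int n = snprintf(line, sizeof(line), "ITER %d  res = %e  %s\n",
>>                        iter, residual, name);
>>       if (n > 0) {
>>           if (n >= (int)sizeof(line))
>>               n = (int)sizeof(line) - 1;        /* truncated; write what fits */
>>           fwrite(line, 1, (size_t)n, stdout);   /* one write call per line */
>>           fflush(stdout);                       /* don't let it sit in the buffer */
>>       }
>>   }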
>> 
>> I'm not a Fortran expert, but I think there are some options to
>> change to a buffered output mode (at least I found such options for file
>> I/O). Again, the goal should be that a write statement is printed at
>> once and the buffer doesn't fill up while printing.
>> 
>> In any case, it could help to write warnings to stderr and separate the
>> stdout and stderr streams.
>> 
>> Best
>> Joachim
>> 
>> Am 02.12.21 um 16:48 schrieb Fisher (US), Mark S via users:
>>> We are using Mellanox HPC-X MPI based on Open MPI 4.1.1RC1 and are
>>> occasionally having issues with lines scrambling together. This causes
>>> issues for our convergence-checking code, since we put convergence data
>>> there. We are not using any mpirun options for stdout; we just redirect
>>> stdout/stderr to a file before we run the mpirun command, so all output
>>> goes there. We had a similar issue with Intel MPI in the past and used the
>>> -ordered-output option to fix it, but I do not see any similar option for
>>> Open MPI. See the example below. Is there any way to ensure a line from a
>>> process comes out as one line in the output file?
>>> 
>>> *The data in red below is scrambled and should look like the
>>> cleaned-up version. You can see that a line from one process was inserted
>>> inside a line from another process, and the rest of that line ended up a
>>> couple of lines down.*
>>> 
>>> ZONE   0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10
>>> 1.004E-01 sec
>>> 
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
>>> -4.945E-06  aerosurfs
>>> 
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
>>> -2.785E-05  aerosurfs
>>> 
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
>>> -4.945E-06  Aircraft-Total
>>> 
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
>>> -2.785E-05 Aircr Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  699
>>> 1625     12
>>> 
>>> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  111  1626      6
>>> 
>>> aft-Total
>>> 
>>> *IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04
>>> -4.961E-06  Aircraft-OML
>>> 
>>> *IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04
>>> -2.791E-05  Aircraft-OML
>>> 
>>> *Cleaned up version:*
>>> 
>>> ZONE   0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10
>>> 1.004E-01 sec
>>> 
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
>>> -4.945E-06  aerosurfs
>>> 
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
>>> -2.785E-05  aerosurfs
>>> 
>>> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
>>> -4.945E-06  Aircraft-Total
>>> 
>>> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
>>> -2.785E-05 Aircraft-Total
>>> 
>>>   Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  699  1625     12
>>> 
>>> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  111  1626      6
>>> 
>>> *IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04
>>> -4.961E-06  Aircraft-OML
>>> 
>>> *IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04
>>> -2.791E-05  Aircraft-OML
>>> 
>>> Thanks!
>>> 
>> 
>> 
> 
> 
> --
> Dr. rer. nat. Joachim Protze
> 
> IT Center
> Group: High Performance Computing
> Division: Computational Science and Engineering
> RWTH Aachen University
> Seffenter Weg 23
> D 52074  Aachen (Germany)
> Tel: +49 241 80- 24765
> Fax: +49 241 80-624765
> pro...@itc.rwth-aachen.de
> www.itc.rwth-aachen.de

