Hi Mark

Back in the day I liked the mpirun/mpiexec --tag-output option.
Jeff: Does it still exist?
It may not completely prevent output lines from being split,
but tagging each line with the process rank helps.
You can grep the stdout log for the rank that you want,
which helps a lot when several processes are talking.
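
If grep gets unwieldy, the same filtering can be sketched in a few lines of Python.
This is only a sketch: it assumes the tag looks like "[jobid,rank]<stdout>:" at the
start of each line, which is what I remember --tag-output printing, so check your
own log first. The function name and the sample lines are made up for illustration.

```python
import re

# Assumed prefix written by mpirun --tag-output: "[jobid,rank]<stream>:"
# (verify against your own log; the exact format may differ by version).
TAG = re.compile(r"^\[(\d+),(\d+)\]<(stdout|stderr)>:(.*)$")


def lines_for_rank(lines, rank):
    """Return the untagged text of every tagged line written by `rank`."""
    selected = []
    for line in lines:
        match = TAG.match(line.rstrip("\n"))
        # Lines without a recognizable tag (e.g. split fragments) are skipped.
        if match and int(match.group(2)) == rank:
            selected.append(match.group(4))
    return selected


# Hypothetical tagged log lines, shortened from the kind of output in this thread.
sample = [
    "[1,0]<stdout>:*IGSTAB* 1626 7.392E-02 aerosurfs\n",
    "[1,3]<stdout>:Warning: BCFD: US_UPDATEQ\n",
    "[1,0]<stdout>:*IGMNTAERO* 1626 -6.120E-04 aerosurfs\n",
]

for text in lines_for_rank(sample, 0):
    print(text)
```

For a real run you would pass an open log file instead of the sample list,
since iterating over a file yields its lines one at a time.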

I hope this helps,
Gus Correa


On Sun, Dec 5, 2021 at 1:12 PM Jeff Squyres (jsquyres) via users <
users@lists.open-mpi.org> wrote:

> FWIW: Open MPI 4.1.2 has been released -- you can probably stop using an
> RC release.
>
> I think you're probably running into an issue that is just a fact of
> life.  Especially when there's a lot of output simultaneously from multiple
> MPI processes (potentially on different nodes), the stdout/stderr lines can
> just get munged together.
>
> Can you check for convergence a different way?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> ________________________________________
> From: users <users-boun...@lists.open-mpi.org> on behalf of Fisher (US),
> Mark S via users <users@lists.open-mpi.org>
> Sent: Thursday, December 2, 2021 10:48 AM
> To: users@lists.open-mpi.org
> Cc: Fisher (US), Mark S
> Subject: [OMPI users] stdout scrambled in file
>
> We are using Mellanox HPC-X MPI based on OpenMPI 4.1.1RC1 and having
> issues with lines occasionally scrambling together. This causes issues for
> our convergence checking code since we put convergence data there. We are
> not using any mpirun options for stdout; we just redirect stdout/stderr to a
> file before we run the mpirun command so all output goes there. We had a
> similar issue with Intel MPI in the past and used the -ordered-output option
> to fix it, but I do not see any similar option for OpenMPI. See example below.
> Is there any way to ensure a line from a process gets one line in the output
> file?
>
>
> The data in red below is scrambled up and should look like the cleaned-up
> version. You can see it put a line from a different process inside a line
> from another process, and the rest of the line ended up a couple of lines
> down.
>
> ZONE   0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10
> 1.004E-01 sec
>
> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
> -4.945E-06  aerosurfs
> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
> -2.785E-05  aerosurfs
> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
> -4.945E-06  Aircraft-Total
> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
> -2.785E-05  Aircr Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  699
> 1625     12
> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  111  1626      6
> aft-Total
> *IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04
> -4.961E-06  Aircraft-OML
> *IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04
> -2.791E-05  Aircraft-OML
>
>
> Cleaned up version:
>
> ZONE   0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10
> 1.004E-01 sec
>
> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
> -4.945E-06  aerosurfs
> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
> -2.785E-05  aerosurfs
> *IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04
> -4.945E-06  Aircraft-Total
> *IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04
> -2.785E-05  Aircraft-Total
>  Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  699  1625     12
> Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  111  1626      6
> *IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04
> -4.961E-06  Aircraft-OML
> *IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04
> -2.791E-05  Aircraft-OML
>
> Thanks!
>
