Harald --

Along those lines, would you mind downloading a nightly trunk tarball and giving it a whirl to see if the IOF patch we have soaking does fix the problem in your environment? Part of the difficulty we've had in testing the patch is that the problem was only intermittently repeatable on some machines in some environments.


On Dec 12, 2008, at 11:45 AM, Ralph Castain wrote:

Hi Harald

There is a patch for the IOF in 1.3 "soaking" in the trunk right now. I'll check to ensure it fixes this issue too. Hopefully, it will come over to the 1.3 branch early next week.


On Dec 12, 2008, at 8:21 AM, Harald Anlauf wrote:


I am having problems with OMPI-1.3beta with an interactive job where rank 0
reads stdin from a terminal.  The problem does not show up when stdin
is redirected from a file, and it does not exist with OMPI 1.2.[5-9].
Has there been any change in OMPI between 1.2 and 1.3 that I should be
aware of?

Please find attached a famous sample program that was modified to aid debugging.

The program reads the number of intervals used to calculate pi. 0 means exit.
I first enter 1000, then 0.
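
For readers without the attachment, the read-and-compute loop can be sketched roughly as below. This is a serial Python sketch and not the attached program: it assumes the usual midpoint-rule version of the pi example (integrating 4/(1+x^2) over [0, 1]) and leaves out MPI entirely, so it does not exercise the stdin forwarding under mpirun. The names midpoint_pi and run are made up for illustration.

```python
import math


def midpoint_pi(n):
    """Approximate pi as the midpoint-rule integral of 4/(1+x^2) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = h * (i + 0.5)  # midpoint of the i-th interval
        total += 4.0 / (1.0 + x * x)
    return h * total


def run(lines):
    """Read interval counts from an iterable of lines until 0 (or end of input).

    Returns a list of (n, approximation, absolute error) tuples.
    """
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        n = int(line)
        if n <= 0:
            break  # 0 means exit, as in the description above
        approx = midpoint_pi(n)
        results.append((n, approx, abs(approx - math.pi)))
    return results
```

Calling run(sys.stdin) after importing sys reads from the terminal or from a redirected file. In the real program, rank 0 does the read and then broadcasts n to the other ranks.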

Interactive run, without mpirun:
% ./a.out
Process            0  of            1  is alive
Process            0  before read
Enter the number of intervals: (0 quits)
Process            0  read:  n =        1000
Process            0  before MPI_BCAST
Process            0  after  MPI_BCAST
pi is approximately: 3.1415927369231227  Error is: 0.0000000833333296
Process            0  before read
Enter the number of intervals: (0 quits)
Process            0  read:  n =           0
Process            0  before MPI_BCAST
Process            0  after  MPI_BCAST
Process            0  Normal exit

With mpirun:
% mpirun -np 1 ./a.out
Process            0  of            1  is alive
Process            0  before read
Enter the number of intervals: (0 quits)
mpirun has exited due to process rank 0 with PID 10909 on
node oflws105 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

Now with stdin redirected from a here-document:
% mpirun -np 1 ./a.out <<EOF
Process            0  of            1  is alive
Process            0  before read
Enter the number of intervals: (0 quits)
Process            0  read:  n =        1000
Process            0  before MPI_BCAST
Process            0  after  MPI_BCAST
pi is approximately: 3.1415927369231227  Error is: 0.0000000833333296
Process            0  before read
Enter the number of intervals: (0 quits)
Process            0  read:  n =           0
Process            0  before MPI_BCAST
Process            0  after  MPI_BCAST

The behavior is similar for np > 1: minor variations, but the same error message.

Can anybody reproduce this behavior?

% ompi_info |grep SVN
 Open MPI SVN revision: r20119
 Open RTE SVN revision: r20119
     OPAL SVN revision: r20119


users mailing list


Jeff Squyres
Cisco Systems
