On Oct 6, 2006, at 12:04 AM, Jeff Squyres wrote:

On 10/5/06 2:42 PM, "Michael Kluskens" <mk...@ieee.org> wrote:

System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5,
Intel ifort 9.0.32 all tests with 4 processors (comments below)

Good.  Can you expand on what you mean by "slowed down"?

There is a bad interaction between the BLACS tester and OpenMPI 1.1.2rc3 (less so with OpenMPI 1.3a1r12069).

The last thing the BLACS tester does is:

The final auxiliary test is for BLACS_ABORT.
Immediately after this message, all processes should be killed.
If processes survive the call, your BLACS_ABORT is incorrect.
{0,2}, pnum=2, Contxt=0, killed other procs, exiting with error #-1.

forrtl: error (78): process killed (SIGTERM)
forrtl: error (78): process killed (SIGTERM)

This test leaves "orted" running only on the second node, where it uses 99% of the CPU. In contrast, with OpenMPI 1.3a1r12069, orted is left running on both nodes but uses no CPU time. This may be perfectly normal for BLACS_ABORT.

Trying to run either the C or Fortran BLACS tester after the first run causes the BLACS tester to slow down and possibly freeze up.
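
For anyone trying to reproduce this, a quick check for leftover orted daemons between runs might look like the sketch below. It is only an illustration: the function names parse_ps and stray_daemons are made up for this example, and it assumes a procps-style ps that accepts "-eo pid,pcpu,comm" (as on typical Linux systems). It would need to be run on each node separately.

```python
import subprocess

def parse_ps(output, name="orted"):
    """Return (pid, cpu_percent) pairs for processes whose command is `name`."""
    found = []
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)      # pid, %cpu, command
        if len(fields) == 3 and fields[2].strip() == name:
            found.append((int(fields[0]), float(fields[1])))
    return found

def stray_daemons(name="orted"):
    # Assumption: ps supports -eo with the pid, pcpu and comm columns.
    out = subprocess.run(["ps", "-eo", "pid,pcpu,comm"],
                         capture_output=True, text=True, check=True).stdout
    return parse_ps(out, name)

if __name__ == "__main__":
    for pid, cpu in stray_daemons():
        print(f"orted pid={pid} cpu={cpu}%")
```

An orted entry that shows up here before a new run starts, especially one with high CPU usage, would match the behavior described above.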

The final message with OpenMPI 1.3a1r12069 is:

The final auxiliary test is for BLACS_ABORT.
Immediately after this message, all processes should be killed.
If processes survive the call, your BLACS_ABORT is incorrect.

Michael
