Hi,

I have an MPI code which works fine on my previous clusters.
However, when I run it on the new cluster, which uses Open MPI, I get an MPI error, and I wonder if there is a connection. The error is:

[n12-70:14429] *** An error occurred in MPI_comm_size
[n12-70:14429] *** on communicator MPI_COMM_WORLD
[n12-70:14429] *** MPI_ERR_COMM: invalid communicator
[n12-70:14429] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
2.48user 0.45system 0:03.99elapsed 73%CPU (0avgtext+0avgdata 2871936maxresident)k
0inputs+0outputs (0major+185587minor)pagefaults 0swaps
--------------------------------------------------------------------------
mpiexec has exited due to process rank 3 with PID 14425 on node n12-70 exiting
without calling "finalize". This may have caused other processes in the
application to be terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
[n12-70:14421] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[n12-70:14421] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

I am not sure what other information you will need; I will provide more if required. I tried running on 1, 4 and 8 processors, but none of them work.
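In case it helps, below is a minimal C sketch of the call pattern in question (for illustration only, not taken from my actual source): MPI_Comm_size is only ever called on MPI_COMM_WORLD, after MPI_Init.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int size, rank;

    MPI_Init(&argc, &argv);                 /* initialize MPI before any other MPI call */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* the call reported in the error above */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}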
--
Yours sincerely,
TAY Wee Beng