Bishesh Khanal <[email protected]> writes: >> I tried running on the cluster with one core per node with 4 nodes and I > got the following errors (note: using valgrind, and openmpi of the cluster) > at the very end after the many usual "unconditional jump ... errors" which > might be interesting > > mpiexec: killing job... > > mpiexec: abort is already in progress...hit ctrl-c again to forcibly > terminate > > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the > batch system) has told this process to end
Memory corruption generally results in SIGSEGV, so I suspect this is still either a memory issue or some other resource issue. How much memory is available on these compute nodes? Do turn off Valgrind for this run; it takes a lot of memory.

> Does it mean it is crashing near MatSetValues_MPIAIJ ?

Possibly, but it could be killing the program for other reasons.
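
If the extra memory is indeed coming from matrix assembly, missing preallocation is the usual culprit: when the nonzero pattern was not preallocated, MatSetValues on an MPIAIJ matrix falls back to repeated mallocs during assembly, and that can exhaust a node. A minimal sketch of preallocated MPIAIJ assembly (the global size, the 5/2 per-row nonzero estimates, and the diagonal-only insertion loop below are illustrative assumptions, not taken from the original program; error checking omitted for brevity):

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  PetscInt rstart, rend, i;

  PetscInitialize(&argc, &argv, NULL, NULL);

  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 1000000, 1000000); /* assumed global size */
  MatSetType(A, MATMPIAIJ);
  /* Preallocate: at most 5 nonzeros per row in the diagonal block and 2 in
     the off-diagonal block (assumed stencil).  Without this call, each
     MatSetValues that outgrows the current storage triggers a malloc+copy. */
  MatMPIAIJSetPreallocation(A, 5, NULL, 2, NULL);

  MatGetOwnershipRange(A, &rstart, &rend);
  for (i = rstart; i < rend; i++) {
    PetscScalar v = 1.0;
    MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES); /* stand-in for the real stencil entries */
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

Running with -info makes MatAssemblyEnd() report the number of mallocs incurred during MatSetValues(); a nonzero count there is a sign the preallocation is wrong.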
