On 11/09/2016 08:36 AM, Ashkan Dorostkar wrote:
[n49422:9059] *** An error occurred in MPI_Allreduce
[n49422:9059] *** reported by process [3040346113,140733193388063]
[n49422:9059] *** on communicator MPI_COMM_WORLD
[n49422:9059] *** MPI_ERR_IN_STATUS: error code in status
[n49422:9059] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[n49422:9059] ***    and potentially your MPI job)
This looks like a communication issue, or perhaps communication oversaturation.
Does anyone have any experience with this?
Ashkan,
we have had people report such issues before, on an intermittent basis.
It is of course not impossible that this points to an actual problem in
deal.II (or PETSc, or Trilinos), but it is difficult to know for sure
without a backtrace showing where this came from.
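One way to get at least a useful error code and a place to break in a
debugger is to switch the communicator's error handler from
MPI_ERRORS_ARE_FATAL to MPI_ERRORS_RETURN, so the failing call returns
instead of aborting the job. Here is a minimal standalone sketch in plain
MPI -- it is not deal.II's own error handling, just an illustration:

  #include <mpi.h>
  #include <cstdio>

  int main(int argc, char **argv)
  {
    MPI_Init(&argc, &argv);

    // Make MPI calls return error codes instead of aborting immediately:
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    double local = 1.0, global = 0.0;
    const int ierr = MPI_Allreduce(&local, &global, 1, MPI_DOUBLE,
                                   MPI_SUM, MPI_COMM_WORLD);
    if (ierr != MPI_SUCCESS)
      {
        char msg[MPI_MAX_ERROR_STRING];
        int  len = 0;
        MPI_Error_string(ierr, msg, &len);
        std::fprintf(stderr, "MPI_Allreduce failed: %s\n", msg);
        // This is the place to print a backtrace or break in a debugger
        // before tearing the job down:
        MPI_Abort(MPI_COMM_WORLD, ierr);
      }

    MPI_Finalize();
    return 0;
  }

If you can reproduce the error, setting a breakpoint on the MPI_Abort
call (or on the MPI library's error path) should give you the backtrace
we would need.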
Does it happen every time with the same program? If it doesn't, it is
also possible that this is a symptom of a loose cable or a wrong
configuration of the MPI system -- both things we have seen on clusters
in the past.
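To separate a software problem from flaky hardware, you could also run a
standalone stress test on the same set of nodes that does nothing but
repeat MPI_Allreduce and check the result. A rough sketch (the iteration
count is arbitrary) might look like this:

  #include <mpi.h>
  #include <cstdio>
  #include <cmath>

  int main(int argc, char **argv)
  {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Summing all ranks must yield size*(size-1)/2 on every process:
    const double expected = size * (size - 1) / 2.0;
    for (int iter = 0; iter < 100000; ++iter)
      {
        double local = rank, global = 0.0;
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE,
                      MPI_SUM, MPI_COMM_WORLD);
        if (std::abs(global - expected) > 1e-12)
          std::fprintf(stderr, "rank %d: wrong result at iteration %d\n",
                       rank, iter);
      }

    if (rank == 0)
      std::printf("stress test finished\n");

    MPI_Finalize();
    return 0;
  }

If this kind of test fails intermittently on particular nodes, the
problem is far more likely in the network or the MPI installation than
in deal.II, PETSc, or Trilinos.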
Best
W.
--
------------------------------------------------------------------------
Wolfgang Bangerth email: [email protected]
www: http://www.math.colostate.edu/~bangerth/