Hi Julius,
> Is the implementation of the (colored) WorkStream::run() compatible
> with a hybrid parallel code-architecture?
Besides checking for the validity of the ranges (as Bruno points out),
there is at least one potential problem as seen from the stack trace:
You cannot write into TrilinosWrappers::MPI::Vector in a multithreaded
way unless you take special care at construction. The way the vector
works is that it accumulates non-local entries in a data structure a bit
similar to an std::map. Of course, that is not thread-safe. This also
explains why the problem only appears sporadically: It happens when one
thread changes the shared data structure while the other is looking at
it (or trying to modify the same data structure at the same time).
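To make the failure mode concrete, here is a minimal toy sketch. It is not the Trilinos implementation: the type ToyAccumulatingVector and its members are hypothetical, and I use a std::map only because the real data structure behaves somewhat like one. The point is that a write to a locally owned index touches one existing slot, whereas the first write to a non-local index changes the structure of a shared container.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Toy stand-in (hypothetical) for a distributed vector that was built
// WITHOUT preallocated ghost entries.
struct ToyAccumulatingVector
{
  int                   begin, end; // locally owned range [begin, end)
  std::vector<double>   owned;      // fixed size, one slot per owned index
  std::map<int, double> nonlocal;   // grows on demand, shared by all writers

  ToyAccumulatingVector(int b, int e)
    : begin(b), end(e), owned(static_cast<std::size_t>(e - b), 0.0)
  {}

  void add(int global_index, double value)
  {
    if (global_index >= begin && global_index < end)
      // Local write: modifies one existing slot, nothing is reallocated.
      owned[static_cast<std::size_t>(global_index - begin)] += value;
    else
      // Non-local write: operator[] inserts a new node on first use and
      // thereby restructures the container. Two threads doing this at
      // the same time is a data race.
      nonlocal[global_index] += value;
  }
};
```

Called from a single thread this is fine. Called from two threads that both hit the non-local branch, the behavior is undefined, and whether it actually crashes depends on timing.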
There are two ways to solve this problem. The first is to initialize the
vector with the appropriate ghost entries (locally relevant dofs), i.e.,
to use the reinit method/constructor that takes two index sets, an
MPI_Comm and a vector_writable flag, with that flag set to true.
Alternatively, you can use parallel::distributed::Vector, which you
initialize with the same two index sets: locally owned dofs as the first
argument and locally relevant dofs as the second argument (i.e., the
ghosts are the locally relevant minus the locally owned dofs). Trilinos
matrices and preconditioners should know how to work with these vectors.
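As a rough illustration of why fixing the ghost entries at construction helps, here is a self-contained toy sketch. Again, this is not deal.II or Trilinos code: ToyGhostedVector and colored_write_demo are hypothetical names, and the indices 770 and 771 merely play the role of ghost entries. Every writable index gets its slot up front, so add() never allocates or restructures anything, and threads that write to disjoint indices (as the cells of one color do) do not interfere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in (hypothetical) for a vector whose owned and ghost entries
// are all known at construction time.
class ToyGhostedVector
{
public:
  // 'writable' is the sorted list of every global index that may be
  // written: locally owned entries plus ghost entries.
  explicit ToyGhostedVector(std::vector<int> writable)
    : indices(std::move(writable)), values(indices.size(), 0.0)
  {
    assert(std::is_sorted(indices.begin(), indices.end()));
  }

  // Adds into an existing slot; the containers never change size.
  void add(int global_index, double value) { values[slot(global_index)] += value; }

  double operator()(int global_index) const { return values[slot(global_index)]; }

private:
  // Maps a global index to its position in the preallocated storage.
  std::size_t slot(int global_index) const
  {
    const auto it = std::lower_bound(indices.begin(), indices.end(), global_index);
    assert(it != indices.end() && *it == global_index);
    return static_cast<std::size_t>(it - indices.begin());
  }

  std::vector<int>    indices;
  std::vector<double> values;
};

// Two threads write to disjoint index sets, including "ghost" indices.
double colored_write_demo()
{
  ToyGhostedVector v(std::vector<int>{0, 1, 2, 3, 770, 771});
  std::thread t1([&v] { for (int i = 0; i < 1000; ++i) { v.add(0, 1.0); v.add(770, 1.0); } });
  std::thread t2([&v] { for (int i = 0; i < 1000; ++i) { v.add(3, 1.0); v.add(771, 1.0); } });
  t1.join();
  t2.join();
  return v(0) + v(3) + v(770) + v(771);
}
```

Note that this only covers writes to distinct entries. If two threads could add into the same entry at the same time, you would still need the coloring (or some other synchronization) to keep them apart.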
There is potentially the same problem for Trilinos sparse matrices in
case you start using them, but there it is simpler to arrive at the
thread-safe path (if you initialize them with a DynamicSparsityPattern
and set 'exchange_data' to true).
#10 0x00007ffff0735a05 in std::vector<int, std::allocator<int> >::insert (
this=this@entry=0x8ecb40, __position=..., __x=@0x7fffda9c85f8: 1)
at /usr/include/c++/4.8/bits/vector.tcc:127
#11 0x00007ffff077ff62 in Epetra_FEVector::inputNonlocalValues<int> (
this=this@entry=0x8eca40, GID=771, numValues=numValues@entry=1,
values=values@entry=0x7fffda9c8680, suminto=suminto@entry=true,
vectorIndex=vectorIndex@entry=0)
at /export/home/jwitte/trilinos-12.6.1-Source/packages/epetra/src/Epetra_FEVector.cpp:427
It is frame #11 that tells me that you use
Epetra_FEVector::inputNonlocalValues, which is not a thread-safe method.
Best,
Martin