On 2021-10-12 17:46, Jeff Squyres (jsquyres) wrote:
> I'm sorry, I just noticed that you replied 6 days ago, but I
> apparently wasn't notified by the Debian bug tracker. :-(

Sorry about that. I'm never quite sure when the bug tracker does or does
not add cc:s. This reply has you cc:d in any case.

> Ok, so this is an MPI_Alltoall issue. Does it use MPI_IN_PLACE?
Interesting question. An older version of dolfinx did not use it, but
the current release does, at l.96 of cpp/dolfinx/common/MPI.cpp:
https://salsa.debian.org/science-team/fenics/fenics-dolfinx/-/blob/experimental/cpp/dolfinx/common/MPI.cpp#L96
std::vector<int> dolfinx::MPI::compute_graph_edges(MPI_Comm comm,
                                                   const std::set<int>& edges)
{
  // Send '1' to ranks that I have an edge to
  std::vector<std::uint8_t> edge_count(dolfinx::MPI::size(comm), 0);
  std::for_each(edges.cbegin(), edges.cend(),
                [&edge_count](auto e) { edge_count[e] = 1; });
  MPI_Alltoall(MPI_IN_PLACE, 1, MPI_UINT8_T, edge_count.data(), 1,
               MPI_UINT8_T, comm);

  // Build list of ranks that had an edge to me
  std::vector<int> edges1;
  for (std::size_t i = 0; i < edge_count.size(); ++i)
  {
    if (edge_count[i] > 0)
      edges1.push_back(i);
  }
  return edges1;
}
Looks like it already got removed upstream,
https://github.com/FEniCS/dolfinx/blob/02f35afa956ee2fc26284d529591c2589bf4d35e/cpp/dolfinx/common/MPI.cpp#L97
That was removed 7 days ago in upstream PR1738,
https://github.com/FEniCS/dolfinx/pull/1738
with the comment "Performance of in-place calls can be poor with some
implementations. Not real benefit to in place calls for the cases where
it has been used."
If I got the sense of your question right, it's not just "no real
benefit": the in-place call actually breaks behaviour on 32-bit arches.
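For reference, the in-place and out-of-place forms of that call are
meant to produce the same result, so dropping MPI_IN_PLACE (as PR1738
does) shouldn't change behaviour where the implementation is correct.
A single-process simulation of the exchange illustrates the intended
semantics; no real MPI here, and the names simulate_alltoall /
mock_compute_graph_edges are hypothetical helpers standing in for
MPI_Alltoall with sendcount = 1 and for dolfinx's compute_graph_edges:

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for MPI_Alltoall with sendcount = 1: element i
// of rank r's send buffer lands in element r of rank i's receive
// buffer. 'bufs' holds one buffer per simulated rank.
std::vector<std::vector<std::uint8_t>>
simulate_alltoall(const std::vector<std::vector<std::uint8_t>>& bufs)
{
  const std::size_t nranks = bufs.size();
  std::vector<std::vector<std::uint8_t>> recv(
      nranks, std::vector<std::uint8_t>(nranks));
  for (std::size_t r = 0; r < nranks; ++r)
    for (std::size_t i = 0; i < nranks; ++i)
      recv[i][r] = bufs[r][i];
  return recv;
}

// The logic of compute_graph_edges, minus real MPI: each rank marks
// the ranks it has an edge to, exchanges the marks, then collects the
// ranks that had an edge to it.
std::vector<std::vector<int>>
mock_compute_graph_edges(const std::vector<std::set<int>>& edges_per_rank)
{
  const std::size_t nranks = edges_per_rank.size();
  std::vector<std::vector<std::uint8_t>> edge_count(
      nranks, std::vector<std::uint8_t>(nranks, 0));
  for (std::size_t r = 0; r < nranks; ++r)
    for (int e : edges_per_rank[r])
      edge_count[r][e] = 1;

  auto recv = simulate_alltoall(edge_count);

  std::vector<std::vector<int>> result(nranks);
  for (std::size_t r = 0; r < nranks; ++r)
    for (std::size_t i = 0; i < recv[r].size(); ++i)
      if (recv[r][i] > 0)
        result[r].push_back(static_cast<int>(i));
  return result;
}
```

With edges 0->{1}, 1->{2}, 2->{0,1}, rank 1 should end up with incoming
edges {0, 2}, regardless of whether the real call uses MPI_IN_PLACE or
separate buffers.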
I'll apply PR1738 to the Debian dolfinx build and see how it turns out.
Drew
On Wed, 06 Oct 2021 20:15:38 +0200 Drew Parsons <dpars...@debian.org>
wrote:
Source: openmpi
Followup-For: Bug #995599
Not so simple to make a minimal test case, I think.
all_to_all is defined in cpp/dolfinx/common/MPI.h in dolfinx source,
and calls MPI_Alltoall from openmpi.
It's designed to be used with graph::AdjacencyList<T> from
graph/AdjacencyList.h, and is called from
compute_nonlocal_dual_graph() in mesh/graphbuild.cpp, where T is set
to std::int64_t.
I tried grabbing dolfinx's all_to_all and using it with a pared-down
version of AdjacencyList. But it's not triggering the segfault in an
i386 chroot. Possibly because I haven't populated it with an actual
graph, so there's nothing for MPI_Alltoall to send.
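For what it's worth, a reproducer probably needs the adjacency data
actually populated, so the exchange carries a real payload. Below is a
sketch of just the packing step, no MPI calls; FlatAdjacencyList,
SendLayout and compute_send_layout are hypothetical names, a pared-down
stand-in for graph::AdjacencyList<std::int64_t> and the per-rank
counts/displacements an Alltoall(v)-style exchange would use:

```cpp
#include <cstdint>
#include <vector>

// Pared-down stand-in for graph::AdjacencyList<std::int64_t>: values
// in one flat array, with offsets[i]..offsets[i+1] delimiting the data
// destined for rank i (so offsets has nranks + 1 entries).
struct FlatAdjacencyList
{
  std::vector<std::int64_t> values;
  std::vector<std::int32_t> offsets;
};

// Per-destination send counts and displacements, as an
// MPI_Alltoallv-style exchange would need them.
struct SendLayout
{
  std::vector<int> counts;
  std::vector<int> displs;
};

// If every count here is zero, the exchange moves no data at all,
// which may be why an unpopulated graph fails to trigger the segfault.
SendLayout compute_send_layout(const FlatAdjacencyList& graph)
{
  const std::size_t nranks = graph.offsets.size() - 1;
  SendLayout layout{std::vector<int>(nranks), std::vector<int>(nranks)};
  for (std::size_t i = 0; i < nranks; ++i)
  {
    layout.counts[i] = graph.offsets[i + 1] - graph.offsets[i];
    layout.displs[i] = graph.offsets[i];
  }
  return layout;
}
```

So a reproducer would want a graph like {10, 11, 20} with offsets
{0, 2, 2, 3} (two values for rank 0, none for rank 1, one for rank 2)
rather than all-empty offsets, before handing the buffers to
MPI_Alltoall.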