On 2021-10-12 17:46, Jeff Squyres (jsquyres) wrote:
I'm sorry, I just noticed that you replied 6 days ago, but I
apparently wasn't notified by the Debian bug tracker.  :-(

Sorry about that. I'm never quite sure when the bug tracker does or does not add cc:s. This reply has you cc:d in any case.

Ok, so this is an MPI_Alltoall issue.  Does it use MPI_IN_PLACE?


Interesting question. An older version of dolfinx did not use it, but the
current release does, at l.96 in cpp/dolfinx/common/MPI.cpp:
https://salsa.debian.org/science-team/fenics/fenics-dolfinx/-/blob/experimental/cpp/dolfinx/common/MPI.cpp#L96

std::vector<int> dolfinx::MPI::compute_graph_edges(MPI_Comm comm,
const std::set<int>& edges)
{
  // Send '1' to ranks that I have an edge to
  std::vector<std::uint8_t> edge_count(dolfinx::MPI::size(comm), 0);
  std::for_each(edges.cbegin(), edges.cend(),
                [&edge_count](auto e) { edge_count[e] = 1; });
  MPI_Alltoall(MPI_IN_PLACE, 1, MPI_UINT8_T, edge_count.data(), 1, MPI_UINT8_T,
               comm);

  // Build list of ranks that had an edge to me
  std::vector<int> edges1;
  for (std::size_t i = 0; i < edge_count.size(); ++i)
  {
    if (edge_count[i] > 0)
      edges1.push_back(i);
  }
  return edges1;
}


It looks like it has already been removed upstream:
https://github.com/FEniCS/dolfinx/blob/02f35afa956ee2fc26284d529591c2589bf4d35e/cpp/dolfinx/common/MPI.cpp#L97

That was removed 7 days ago in upstream PR1738,
https://github.com/FEniCS/dolfinx/pull/1738
with the comment "Performance of in-place calls can be poor with some implementations. Not real benefit to in place calls for the cases where it has been used."

If I got the sense of your question right, it's not just "no real benefit"; the in-place call actually breaks behaviour on 32-bit arches.
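
For reference, with MPI_IN_PLACE as the send buffer, MPI_Alltoall takes the
data to send from (and writes the result back into) the receive buffer, and
the send count/type arguments are ignored. The non-in-place form of the
edge-count exchange above would look roughly like the following. This is my
own sketch of the direction PR1738 takes, not the actual diff:

  // Sketch only: explicit send buffer instead of MPI_IN_PLACE
  std::vector<std::uint8_t> send_count(dolfinx::MPI::size(comm), 0);
  std::for_each(edges.cbegin(), edges.cend(),
                [&send_count](auto e) { send_count[e] = 1; });
  std::vector<std::uint8_t> edge_count(send_count.size(), 0);
  MPI_Alltoall(send_count.data(), 1, MPI_UINT8_T, edge_count.data(), 1,
               MPI_UINT8_T, comm);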

I'll apply PR1738 to the Debian dolfinx build and see how it turns out.

Drew

On Wed, 06 Oct 2021 20:15:38 +0200 Drew Parsons <dpars...@debian.org> wrote:
Source: openmpi
Followup-For: Bug #995599

It's not so simple to make a minimal test case, I think.

all_to_all is defined in cpp/dolfinx/common/MPI.h in the dolfinx source,
and calls MPI_Alltoall from openmpi.

It's designed to be used with graph::AdjacencyList<T> from
graph/AdjacencyList.h, and is called from
compute_nonlocal_dual_graph() in mesh/graphbuild.cpp, where T is set
to std::int64_t.

I tried grabbing dolfinx's all_to_all and using it with a pared-down
version of AdjacencyList, but it's not triggering the segfault in an
i386 chroot, possibly because I haven't populated it with an actual
graph, so there's nothing to send with MPI_Alltoall.
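
For what it's worth, the kind of standalone test I have in mind is something
like the sketch below: plain MPI_Alltoall with an std::int64_t payload
actually populated, so there is something to send. The file name and values
are hypothetical; this is just my own starting point, not dolfinx code:

// alltoall_min.cpp (hypothetical): minimal MPI_Alltoall with int64 data.
// Build with mpicxx, run with e.g. "mpirun -n 4 ./alltoall_min".
#include <mpi.h>
#include <cstdint>
#include <cstdio>
#include <vector>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);
  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // One int64 slot per destination rank, filled with recognisable values
  std::vector<std::int64_t> sendbuf(size), recvbuf(size);
  for (int i = 0; i < size; ++i)
    sendbuf[i] = static_cast<std::int64_t>(rank) * 1000 + i;

  MPI_Alltoall(sendbuf.data(), 1, MPI_INT64_T, recvbuf.data(), 1,
               MPI_INT64_T, MPI_COMM_WORLD);

  // recvbuf[i] should now hold i*1000 + rank
  for (int i = 0; i < size; ++i)
    std::printf("rank %d got %lld from rank %d\n", rank,
                static_cast<long long>(recvbuf[i]), i);

  MPI_Finalize();
  return 0;
}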


