Which modules do you have loaded? I don't know whether GPU-aware MPI currently
works with cuda-11.7. I assume you're following these instructions carefully:

https://docs.nersc.gov/development/programming-models/mpi/cray-mpich/#cuda-aware-mpi

In our experience, GPU-aware MPI continues to be brittle on these machines. 
Maybe you can inquire with NERSC exactly which CUDA versions are tested with 
GPU-aware MPI.
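
If it helps when reporting back, here is a minimal sketch for collecting the
relevant settings from a job's environment. It assumes the module system
exports the colon-separated LOADEDMODULES variable, and that
MPICH_GPU_SUPPORT_ENABLED and CRAY_ACCEL_TARGET are the variables that matter
for Cray MPICH GPU support, as I understand the instructions above. The
function name gpu_mpi_report is hypothetical, and none of this is a PETSc or
NERSC tool.

```python
import os


def gpu_mpi_report(env=None):
    """Summarize settings relevant to GPU-aware Cray MPICH from an environment mapping."""
    env = os.environ if env is None else env
    # Assumption: LOADEDMODULES is a colon-separated list kept by the module system.
    modules = [m for m in env.get("LOADEDMODULES", "").split(":") if m]
    return {
        "modules": modules,
        # Modules whose names start with "cuda" (this also matches "cudatoolkit").
        "cuda_modules": [m for m in modules if m.startswith("cuda")],
        # Assumption: GPU support in Cray MPICH is switched on by setting this to 1.
        "gpu_support_enabled": env.get("MPICH_GPU_SUPPORT_ENABLED") == "1",
        # Assumption: set by the accelerator target module, if one is loaded.
        "accel_target": env.get("CRAY_ACCEL_TARGET"),
    }


if __name__ == "__main__":
    for key, value in gpu_mpi_report().items():
        print(f"{key}: {value}")
```

Running this inside the same batch job that crashes, and including the output
along with the trace, would show which CUDA module the run actually picked up.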

Sajid Ali <sajidsyed2...@u.northwestern.edu> writes:

> Hi PETSc-developers,
>
> I had posted about crashes within PETScSF when using GPU-aware MPI on
> Perlmutter a while ago (
> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2022-February/045585.html).
> Now that the software stacks have stabilized, I was wondering if there was
> a fix for the same as I am still observing similar crashes.
>
> I am attaching the trace of the latest crash (with PETSc-3.20.0) for
> reference.
>
> Thank You,
> Sajid Ali (he/him) | Research Associate
> Data Science, Simulation, and Learning Division
> Fermi National Accelerator Laboratory
> s-sajid-ali.github.io
