I don't see why it is not running the Kokkos check. Here is the check_build
rule; the Kokkos test sits right below the CUDA test, which is apparently running.
check_build:
-@echo "Running check examples to verify correct installation"
-@echo "Using PETSC_DIR=${PETSC_DIR} and PETSC_ARCH=${PETSC_ARCH}"
+@cd src/snes/tutorials >/dev/null; ${OMAKE_SELF} PETSC_ARCH=${PETSC_ARCH} PETSC_DIR=${PETSC_DIR} clean-legacy
+@cd src/snes/tutorials >/dev/null; ${OMAKE_SELF} PETSC_ARCH=${PETSC_ARCH} PETSC_DIR=${PETSC_DIR} testex19
+@if [ "${HYPRE_LIB}" != "" ] && [ "${PETSC_WITH_BATCH}" = "" ] && [ "${PETSC_SCALAR}" = "real" ]; then \
cd src/snes/tutorials >/dev/null; ${OMAKE_SELF} PETSC_ARCH=${PETSC_ARCH} PETSC_DIR=${PETSC_DIR} DIFF=${PETSC_DIR}/lib/petsc/bin/petscdiff runex19_hypre; \
fi;
+@if [ "${CUDA_LIB}" != "" ] && [ "${PETSC_WITH_BATCH}" = "" ] && [ "${PETSC_SCALAR}" = "real" ]; then \
cd src/snes/tutorials >/dev/null; ${OMAKE_SELF} PETSC_ARCH=${PETSC_ARCH} PETSC_DIR=${PETSC_DIR} DIFF=${PETSC_DIR}/lib/petsc/bin/petscdiff runex19_cuda; \
fi;
+@if [ "${KOKKOS_KERNELS_LIB}" != "" ] && [ "${PETSC_WITH_BATCH}" = "" ] && [ "${PETSC_SCALAR}" = "real" ] && [ "${PETSC_PRECISION}" = "double" ] && [ "${MPI_IS_MPIUNI}" = "0" ]; then \
cd src/snes/tutorials >/dev/null; ${OMAKE_SELF} PETSC_ARCH=${PETSC_ARCH} PETSC_DIR=${PETSC_DIR} DIFF=${PETSC_DIR}/lib/petsc/bin/petscdiff runex3k_kokkos; \
fi;
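The Kokkos test only runs when all five conditions in that last "if" hold, so one way to narrow this down is to look at what configure recorded for each variable. Below is a rough sketch, not something that ships with PETSc: the function names petsc_var and kokkos_check_blockers are made up for illustration, and it assumes the variables are stored as "NAME = value" lines in a petscvariables file (normally $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/petscvariables).

```shell
# Hypothetical helpers (not part of PETSc): report which of the conditions
# guarding the Kokkos test in check_build does not hold.
# Assumption: the file passed in holds "NAME = value" lines, as in
# $PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/petscvariables.

petsc_var() {
    # $1 = variables file, $2 = variable name.
    # Prints the first recorded value with trailing whitespace removed;
    # prints nothing if the variable is absent or empty.
    sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | sed 's/[[:space:]]*$//' | head -n 1
}

kokkos_check_blockers() {
    # $1 = variables file. Prints one line per failing condition;
    # no output means all five conditions from the rule are satisfied.
    f=$1
    [ -n "$(petsc_var "$f" KOKKOS_KERNELS_LIB)" ] || echo "KOKKOS_KERNELS_LIB is empty"
    [ -z "$(petsc_var "$f" PETSC_WITH_BATCH)" ] || echo "PETSC_WITH_BATCH is set"
    [ "$(petsc_var "$f" PETSC_SCALAR)" = "real" ] || echo "PETSC_SCALAR is not real"
    [ "$(petsc_var "$f" PETSC_PRECISION)" = "double" ] || echo "PETSC_PRECISION is not double"
    [ "$(petsc_var "$f" MPI_IS_MPIUNI)" = "0" ] || echo "MPI_IS_MPIUNI is not 0"
}
```

After sourcing this, something like kokkos_check_blockers "$PETSC_DIR/$PETSC_ARCH/lib/petsc/conf/petscvariables" should list whichever condition is keeping runex3k_kokkos from being invoked. It only reads the file, so values overridden on the make command line would not show up.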
Regarding the debugging: if you run just one MPI rank (or even more) under GDB,
it will trap the error and show the exact line of source code where the error
occurred, and you can poke around at variables to see if they look corrupt or
wrong (for example, a crazy address in a pointer). I don't know why your
debugger is not giving more useful information.
Barry
> On May 29, 2021, at 2:16 PM, Mark Adams <[email protected]> wrote:
>
> I am running on Summit with Kokkos-CUDA and I am getting a segv that looks
> like some sort of compile/link mismatch. I also have a user with a C++ code
> that is getting strange segvs when calling MatSetValues with CUDA (I know
> MatSetValues is not a cuSPARSE method, but that is the report that I have). I
> have no idea if these are related, but they both involve C -- C++ calls ...
>
> I started with a clean build (attached) and I ran in DDT. DDT stopped at the
> call in plexland.c to the Kokkos Landau operator. I stepped into this function
> and then took this screenshot of the stack, with the Kokkos call and PETSc
> signal handler.
>
> Make check does not seem to be running Kokkos tests:
>
> 15:02 adams/landau-mass-opt *= /gpfs/alpine/csc314/scratch/adams/petsc$ make PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc PETSC_ARCH=arch-summit-opt-gnu-kokkos-notpl-cuda10 check
> Running check examples to verify correct installation
> Using PETSC_DIR=/gpfs/alpine/csc314/scratch/adams/petsc and PETSC_ARCH=arch-summit-opt-gnu-kokkos-notpl-cuda10
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> C/C++ example src/snes/tutorials/ex19 run successfully with cuda
> Completed test examples
>
> Also, I ran this morning with another branch that had not been rebased on
> main as recently as this branch (adams/landau-mass-opt).
>
> Any ideas?
> <make.log><configure.log><Screen Shot 2021-05-29 at 2.51.00 PM.png>