I'm seeing the same thing on the latest main, on a different machine with an sm_52 card and CUDA 11.8. make check fails with the output below; line 249 in the trace corresponds to PetscCallCUPM(cupmDeviceGetMemPool(&mempool, static_cast<int>(device->deviceId))); in the initialize() function.
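Ahead of the log, a rough standalone sketch of the check that seems relevant here. It assumes cupmDeviceGetMemPool ends up calling the CUDA runtime's cudaDeviceGetMemPool (my assumption, I have not traced the wrapper) and that device id 0 is the card in question. It is not PETSc code; it only asks the runtime whether the device reports memory-pool support before requesting the default pool.

```cuda
// pool_check.cu: standalone sketch, not PETSc code.
// Assumptions: device id 0 is the affected card; CUDA 11.2 or newer.
// Build (hypothetical file name): nvcc pool_check.cu -o pool_check
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  const int device = 0;  // assumed device id
  int supported = 0;

  // Ask the runtime whether this device supports memory pools.
  cudaError_t err =
      cudaDeviceGetAttribute(&supported, cudaDevAttrMemoryPoolsSupported, device);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n",
                 cudaGetErrorString(err));
    return 1;
  }

  if (!supported) {
    // Skip the pool query entirely when the attribute is 0.
    std::printf("device %d: memory pools not supported, skipping pool query\n",
                device);
    return 0;
  }

  // Only request the default pool when the device reports support.
  cudaMemPool_t pool;
  err = cudaDeviceGetMemPool(&pool, device);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaDeviceGetMemPool failed: %s\n",
                 cudaGetErrorString(err));
    return 1;
  }

  std::printf("device %d: default memory pool obtained\n", device);
  return 0;
}
```

If the attribute comes back as 0 on these older cards, that would be consistent with the cudaErrorNotSupported in the trace, though I have not confirmed that this is the cause.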
Running check examples to verify correct installation
Using PETSC_DIR=/home/mlohry/dev/petsc and PETSC_ARCH=arch-linux-c-debug
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
2,17c2,46
< 0 SNES Function norm 2.391552133017e-01
< 0 KSP Residual norm 2.928487269734e-01
< 1 KSP Residual norm 1.876489580142e-02
< 2 KSP Residual norm 3.291394847944e-03
< 3 KSP Residual norm 2.456493072124e-04
< 4 KSP Residual norm 1.161647147715e-05
< 5 KSP Residual norm 1.285648407621e-06
< 1 SNES Function norm 6.846805706142e-05
< 0 KSP Residual norm 2.292783790384e-05
< 1 KSP Residual norm 2.100673631699e-06
< 2 KSP Residual norm 2.121341386147e-07
< 3 KSP Residual norm 2.455932678957e-08
< 4 KSP Residual norm 1.753095730744e-09
< 5 KSP Residual norm 7.489214418904e-11
< 2 SNES Function norm 2.103908447865e-10
< Number of SNES iterations = 2
---
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: GPU error
> [0]PETSC ERROR: cuda error 801 (cudaErrorNotSupported) : operation not supported
> [0]PETSC ERROR: WARNING! There are option(s) set that were not used! Could be the program crashed before they were used or a spelling mistake, etc!
> [0]PETSC ERROR: Option left: name:-mg_levels_ksp_max_it value: 3 source: command line
> [0]PETSC ERROR: Option left: name:-nox (no value) source: environment
> [0]PETSC ERROR: Option left: name:-nox_warning (no value) source: environment
> [0]PETSC ERROR: Option left: name:-pc_gamg_esteig_ksp_max_it value: 10 source: command line
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.18.3-352-g91c56366cb GIT Date: 2023-01-05 17:22:48 +0000
> [0]PETSC ERROR: ./ex19 on a arch-linux-c-debug named osprey by mlohry Thu Jan 5 17:25:17 2023
> [0]PETSC ERROR: Configure options --with-cuda --with-mpi=1
> [0]PETSC ERROR: #1 initialize() at /home/mlohry/dev/petsc/src/sys/objects/device/impls/cupm/cuda/../cupmcontext.hpp:249
> [0]PETSC ERROR: #2 PetscDeviceContextCreate_CUDA() at /home/mlohry/dev/petsc/src/sys/objects/device/impls/cupm/cuda/cupmcontext.cu:10
> [0]PETSC ERROR: #3 PetscDeviceContextSetDevice_Private() at /home/mlohry/dev/petsc/src/sys/objects/device/interface/dcontext.cxx:247
> [0]PETSC ERROR: #4 PetscDeviceContextSetDefaultDeviceForType_Internal() at /home/mlohry/dev/petsc/src/sys/objects/device/interface/dcontext.cxx:260
> [0]PETSC ERROR: #5 PetscDeviceContextSetupGlobalContext_Private() at /home/mlohry/dev/petsc/src/sys/objects/device/interface/global_dcontext.cxx:52
> [0]PETSC ERROR: #6 PetscDeviceContextGetCurrentContext() at /home/mlohry/dev/petsc/src/sys/objects/device/interface/global_dcontext.cxx:84
> [0]PETSC ERROR: #7 GetHandleDispatch_() at /home/mlohry/dev/petsc/include/petsc/private/veccupmimpl.h:499
> [0]PETSC ERROR: #8 create() at /home/mlohry/dev/petsc/include/petsc/private/veccupmimpl.h:1069
> [0]PETSC ERROR: #9 VecCreate_SeqCUDA() at /home/mlohry/dev/petsc/src/vec/vec/impls/seq/cupm/cuda/vecseqcupm.cu:10
> [0]PETSC ERROR: #10 VecSetType() at /home/mlohry/dev/petsc/src/vec/vec/interface/vecreg.c:89
> [0]PETSC ERROR: #11 DMCreateGlobalVector_DA() at /home/mlohry/dev/petsc/src/dm/impls/da/dadist.c:31
> [0]PETSC ERROR: #12 DMCreateGlobalVector() at /home/mlohry/dev/petsc/src/dm/interface/dm.c:1023
> [0]PETSC ERROR: #13 main() at ex19.c:149

On Thu, Jan 5, 2023 at 3:42 PM Mark Lohry <[email protected]> wrote:

> I'm trying to compile the cuda example
>
> ./config/examples/arch-ci-linux-cuda-double-64idx.py
> --with-cudac=/usr/local/cuda-11.5/bin/nvcc
>
> and running make test passes the test ok
> diff-sys_objects_device_tests-ex1_host_with_device+nsize-1device_enable-lazy
> but the eager variant fails, pasted below.
>
> I get a similar error running my client code, pasted after. There when
> running with -info, it seems that some lazy initialization happens first,
> and i also call VecCreateSeqCuda which seems to have no issue.
>
> Any idea? This happens to be with an -sm 3.5 device if it matters,
> otherwise it's a recent cuda compiler+driver.
>
> petsc test code output:
>
> not ok
> sys_objects_device_tests-ex1_host_with_device+nsize-1device_enable-eager #
> Error code: 97
> # [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> # [0]PETSC ERROR: GPU error
> # [0]PETSC ERROR: cuda error 801 (cudaErrorNotSupported) : operation not
> supported
> # [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> # [0]PETSC ERROR: Petsc Release Version 3.18.3, Dec 28, 2022
> # [0]PETSC ERROR: ../ex1 on a named lancer by mlohry Thu Jan 5 15:22:33
> 2023
> # [0]PETSC ERROR: Configure options
> --package-prefix-hash=/home/mlohry/petsc-hash-pkgs --with-make-test-np=2
> --download-openmpi=1 --download-hypre=1 --download-hwloc=1 COPTFLAGS="-g
> -O" FOPTFLAGS="-g -O" CXXOPTFLAGS="-g -O" --with-64-bit-indices=1
> --with-cuda=1 --with-precision=double --with-clanguage=c
> --with-cudac=/usr/local/cuda-11.5/bin/nvcc
> PETSC_ARCH=arch-ci-linux-cuda-double-64idx
> # [0]PETSC ERROR: #1 CUPMAwareMPI_() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:194
> # [0]PETSC ERROR: #2 initialize() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:71
> # [0]PETSC ERROR: #3 init_device_id_() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cupmdevice.cxx:290
> # [0]PETSC ERROR: #4 getDevice() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/../impls/host/../impldevicebase.hpp:99
> # [0]PETSC ERROR: #5 PetscDeviceCreate() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:104
> # [0]PETSC ERROR: #6 PetscDeviceInitializeDefaultDevice_Internal() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:375
> # [0]PETSC ERROR: #7 PetscDeviceInitializeTypeFromOptions_Private() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:499
> # [0]PETSC ERROR: #8 PetscDeviceInitializeFromOptions_Internal() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/device.cxx:634
> # [0]PETSC ERROR: #9 PetscInitialize_Common() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/pinit.c:1001
> # [0]PETSC ERROR: #10 PetscInitialize() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/pinit.c:1267
> # [0]PETSC ERROR: #11 main() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/tests/ex1.c:12
> # [0]PETSC ERROR: PETSc Option Table entries:
> # [0]PETSC ERROR: -default_device_type host
> # [0]PETSC ERROR: -device_enable eager
> # [0]PETSC ERROR: ----------------End of Error Message -------send entire
> error message to [email protected]
>
> solver code output:
>
> [0] <sys> PetscDetermineInitialFPTrap(): Floating point trapping is off by
> default 0
> [0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType
> host available, initializing
> [0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDevice host
> initialized, default device id 0, view FALSE, init type lazy
> [0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType
> cuda available, initializing
> [0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDevice cuda
> initialized, default device id 0, view FALSE, init type lazy
> [0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType
> hip not available
> [0] <sys> PetscDeviceInitializeTypeFromOptions_Private(): PetscDeviceType
> sycl not available
> [0] <sys> PetscInitialize_Common(): PETSc successfully started: number of
> processors = 1
> [0] <sys> PetscGetHostName(): Rejecting domainname, likely is NIS
> lancer.(none)
> [0] <sys> PetscInitialize_Common(): Running on machine: lancer
> # [Info] Petsc initialization complete.
> # [Trace] Timing: Starting solver...
> # [Info] RNG initial conditions have mean 0.000004, renormalizing.
> # [Trace] Timing: PetscTimeIntegrator initialization...
> # [Trace] Timing: Allocating Petsc CUDA arrays...
> [0] <sys> PetscCommDuplicate(): Duplicating a communicator 2 3 max tags =
> 100000000
> [0] <sys> configure(): Configured device 0
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 2 3
> # [Trace] Timing: Allocating Petsc CUDA arrays finished in 0.015439
> seconds.
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 2 3
> [0] <sys> PetscCommDuplicate(): Duplicating a communicator 1 4 max tags =
> 100000000
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
> [0] <dm> DMGetDMTS(): Creating new DMTS
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
> [0] <dm> DMGetDMSNES(): Creating new DMSNES
> [0] <dm> DMGetDMSNESWrite(): Copying DMSNES due to write
> # [Info] Initializing petsc with ode23 integrator
> # [Trace] Timing: PetscTimeIntegrator initialization finished in 0.016754
> seconds.
>
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
> [0] <sys> PetscCommDuplicate(): Using internal PETSc communicator 1 4
> [0] <device> PetscDeviceContextSetupGlobalContext_Private(): Initializing
> global PetscDeviceContext with device type cuda
> [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [0]PETSC ERROR: GPU error
> [0]PETSC ERROR: cuda error 801 (cudaErrorNotSupported) : operation not
> supported
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.18.3, Dec 28, 2022
> [0]PETSC ERROR: maDG on a arch-linux2-c-opt named lancer by mlohry Thu Jan
> 5 15:39:14 2023
> [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc
> PETSC_ARCH=arch-linux2-c-opt --with-cc=/usr/bin/cc --with-cxx=/usr/bin/c++
> --with-fc=0 --with-pic=1 --with-cxx-dialect=C++11 MAKEFLAGS=$MAKEFLAGS
> COPTFLAGS="-O3 -march=native" CXXOPTFLAGS="-O3 -march=native" --with-mpi=0
> --with-debugging=no --with-cudac=/usr/local/cuda-11.5/bin/nvcc
> --with-cuda-arch=35 --with-cuda --with-cuda-dir=/usr/local/cuda-11.5/
> --download-hwloc=1
> [0]PETSC ERROR: #1 initialize() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cuda/../cupmcontext.hpp:255
> [0]PETSC ERROR: #2 PetscDeviceContextCreate_CUDA() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cuda/cupmcontext.cu:10
> [0]PETSC ERROR: #3 PetscDeviceContextSetDevice_Private() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/dcontext.cxx:244
> [0]PETSC ERROR: #4 PetscDeviceContextSetDefaultDeviceForType_Internal() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/dcontext.cxx:259
> [0]PETSC ERROR: #5 PetscDeviceContextSetupGlobalContext_Private() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/global_dcontext.cxx:52
> [0]PETSC ERROR: #6 PetscDeviceContextGetCurrentContext() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/interface/global_dcontext.cxx:84
> [0]PETSC ERROR: #7
> PetscDeviceContextGetCurrentContextAssertType_Internal() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/include/petsc/private/deviceimpl.h:371
> [0]PETSC ERROR: #8 PetscCUBLASGetHandle() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/sys/objects/device/impls/cupm/cuda/cupmcontext.cu:23
> [0]PETSC ERROR: #9 VecMAXPY_SeqCUDA() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu:261
> [0]PETSC ERROR: #10 VecMAXPY() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/vec/vec/interface/rvector.c:1221
> [0]PETSC ERROR: #11 TSStep_RK() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/ts/impls/explicit/rk/rk.c:814
> [0]PETSC ERROR: #12 TSStep() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/ts/interface/ts.c:3424
> [0]PETSC ERROR: #13 TSSolve() at
> /home/mlohry/dev/maDGiCart-cmake-build-cuda-release/external/petsc/src/ts/interface/ts.c:3814
