Should a default build of PETSc configure both with and without debugging and compile both sets of libraries? It would increase the initial build time for people but simplify life.
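Roughly, a sketch of what that could look like today with two side-by-side PETSC_ARCH builds (the arch names below are just placeholders, not an existing default):

   ./configure PETSC_ARCH=arch-debug --with-debugging=1
   make PETSC_ARCH=arch-debug all
   ./configure PETSC_ARCH=arch-opt --with-debugging=0
   make PETSC_ARCH=arch-opt all

An application would then select one flavor at compile/link time by setting PETSC_ARCH accordingly.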
> On Aug 11, 2023, at 10:52 AM, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>
> Hi, Marcos,
>   Could you build PETSc in debug mode and then copy and paste the whole error stack message?
>
>   Thanks
>   --Junchao Zhang
>
>
> On Thu, Aug 10, 2023 at 5:51 PM Vanella, Marcos (Fed) via petsc-users <petsc-us...@mcs.anl.gov <mailto:petsc-us...@mcs.anl.gov>> wrote:
>> Hi, I'm trying to run a parallel matrix-vector build and linear solution with PETSc on 2 MPI processes + one V100 GPU. I verified that the matrix build and solution are successful on CPUs only. I'm using CUDA 11.5, a CUDA-enabled OpenMPI, and gcc 9.3. When I run the job with the GPU enabled, I get the following error:
>>
>> terminate called after throwing an instance of 'thrust::system::system_error'
>>   what():  merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
>>
>> Program received signal SIGABRT: Process abort signal.
>>
>> Backtrace for this error:
>> terminate called after throwing an instance of 'thrust::system::system_error'
>>   what():  merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
>>
>> Program received signal SIGABRT: Process abort signal.
>>
>> I'm new to submitting jobs in Slurm that also use GPU resources, so I might be doing something wrong in my submission script. This is it:
>>
>> #!/bin/bash
>> #SBATCH -J test
>> #SBATCH -e /home/Issues/PETSc/test.err
>> #SBATCH -o /home/Issues/PETSc/test.log
>> #SBATCH --partition=batch
>> #SBATCH --ntasks=2
>> #SBATCH --nodes=1
>> #SBATCH --cpus-per-task=1
>> #SBATCH --ntasks-per-node=2
>> #SBATCH --time=01:00:00
>> #SBATCH --gres=gpu:1
>>
>> export OMP_NUM_THREADS=1
>> module load cuda/11.5
>> module load openmpi/4.1.1
>>
>> cd /home/Issues/PETSc
>> mpirun -n 2 /home/fds/Build/ompi_gnu_linux/fds_ompi_gnu_linux test.fds -vec_type mpicuda -mat_type mpiaijcusparse -pc_type gamg
>>
>> If anyone has any suggestions on how to troubleshoot this, please let me know.
>> Thanks!
>> Marcos
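For reference, a minimal sketch of the debug rebuild Junchao is asking for, assuming a CUDA-enabled configure with MPI compiler wrappers (the arch name is a placeholder; adjust the configure options to match the original build):

   cd $PETSC_DIR
   ./configure PETSC_ARCH=arch-cuda-debug --with-debugging=1 --with-cuda=1 \
               --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort
   make PETSC_ARCH=arch-cuda-debug all

   # relink the application against the debug libraries, then resubmit the same job:
   mpirun -n 2 /home/fds/Build/ompi_gnu_linux/fds_ompi_gnu_linux test.fds \
          -vec_type mpicuda -mat_type mpiaijcusparse -pc_type gamg

With debugging enabled, PETSc's own error handler prints a stack of the PETSc routines (function, file, line) that were active when the CUDA error was raised, which is the error stack message requested above.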