What is the fastest way to rebuild hypre? Reconfiguring did not work, and it is slow.
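To be concrete, the manual path I have in mind is roughly the following, assuming a --download-hypre build (the directory name under externalpackages is from memory and may differ on your install):

  cd $PETSC_DIR/$PETSC_ARCH/externalpackages/git.hypre/src
  make install

That is, edit the hypre source in place and rerun its make install into the arch prefix, skipping configure entirely.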
I am printf debugging to find this HSA_STATUS_ERROR_MEMORY_FAULT (no debuggers other than valgrind on Crusher??!?!) and I get to a hypre call:

  PetscStackCallStandard(HYPRE_IJMatrixAddToValues,(hA->ij,1,&hnc,(HYPRE_BigInt*)(rows+i),(HYPRE_BigInt*)cscr[0],sscr));

This is from DMPlexComputeJacobian_Internal and MatSetClosure. HYPRE_IJMatrixAddToValues is called successfully in earlier parts of the run. The args look OK, so I am going into HYPRE_IJMatrixAddToValues.

Thanks,
Mark

On Sun, Jan 23, 2022 at 9:55 AM Mark Adams <[email protected]> wrote:

> Stefano and Matt, this segv looks like a Plexism.
>
> + srun -n1 -N1 --ntasks-per-gpu=1 --gpu-bind=closest ../ex13
>   -dm_plex_box_faces 2,2,2 -petscpartitioner_simple_process_grid 1,1,1
>   -dm_plex_box_upper 1,1,1 -petscpartitioner_simple_node_grid 1,1,1
>   -dm_refine 2 -dm_view -malloc_debug -log_trace -pc_type hypre
>   -dm_vec_type hip -dm_mat_type hypre
> + tee out_001_kokkos_Crusher_2_8_hypre.txt
> [0] 1.293e-06 Event begin: DMPlexSymmetrize
> [0] 8.9463e-05 Event end: DMPlexSymmetrize
> .....
> [0] 0.554529 Event end: VecHIPCopyFrom
> [0] 0.559891 Event begin: DMCreateInterp
> [0] 0.560603 Event begin: DMPlexInterpFE
> [0] 0.566707 Event begin: MatAssemblyBegin
> [0] 0.566962 Event begin: BuildTwoSidedF
> [0] 0.567068 Event begin: BuildTwoSided
> [0] 0.567119 Event end: BuildTwoSided
> [0] 0.567154 Event end: BuildTwoSidedF
> [0] 0.567162 Event end: MatAssemblyBegin
> [0] 0.567164 Event begin: MatAssemblyEnd
> [0] 0.567356 Event end: MatAssemblyEnd
> [0] 0.572884 Event begin: MatAssemblyBegin
> [0] 0.57289 Event end: MatAssemblyBegin
> [0] 0.572892 Event begin: MatAssemblyEnd
> [0] 0.573978 Event end: MatAssemblyEnd
> [0] 0.574428 Event begin: MatZeroEntries
> [0] 0.579998 Event end: MatZeroEntries
> :0:rocdevice.cpp:2589: 257935591316 us: Device::callbackQueue aborting
> with error: HSA_STATUS_ERROR_MEMORY_FAULT: Agent attempted to access an
> inaccessible address. code: 0x2b
> srun: error: crusher001: task 0: Aborted
> srun: launch/slurm: _step_signal: Terminating StepId=65929.4
> + date
> Sun 23 Jan 2022 09:46:55 AM EST
>
> On Sun, Jan 23, 2022 at 8:15 AM Mark Adams <[email protected]> wrote:
>
>> Thanks.
>> '-mat_type hypre' was failing for me. I could not find a test that
>> used it, and I was not sure it was considered functional.
>> I will look at it again and collect a bug report if needed.
>>
>> On Sat, Jan 22, 2022 at 11:31 AM Stefano Zampini
>> <[email protected]> wrote:
>>
>>> Mark,
>>>
>>> the two options are only there to test the code in CI and are not
>>> needed in general.
>>>
>>> '--download-hypre-configure-arguments=--enable-unified-memory'
>>> This is only here to test the unified-memory code path.
>>>
>>> '--with-hypre-gpuarch=gfx90a'
>>> This is not needed if rocminfo is in PATH.
>>>
>>> Our interface code with HYPRE GPU works fine for HIP; it is tested in
>>> CI. Assembling with -mat_type hypre does not work for ex19 because
>>> ex19 uses FDColoring. Just assemble in mpiaij format (look at
>>> runex19_hypre_hip in src/snes/tutorials/makefile); the interface code
>>> will copy the matrix to the GPU.
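For reference, my reading of that target is roughly the run below; I am reconstructing the options from the advice above, so check runex19_hypre_hip in src/snes/tutorials/makefile for the real ones:

  cd $PETSC_DIR/src/snes/tutorials
  make ex19
  srun -n1 ./ex19 -dm_vec_type hip -dm_mat_type aij -pc_type hypre -snes_monitor

That is, the matrix is assembled as plain (mpi)aij on the host, and the hypre interface copies it to the GPU for the solve.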
>>> On Fri, Jan 21, 2022 at 7:24 PM Mark Adams <[email protected]> wrote:
>>>
>>>> On Fri, Jan 21, 2022 at 11:14 AM Jed Brown <[email protected]> wrote:
>>>>
>>>>> "Paul T. Bauman" <[email protected]> writes:
>>>>>
>>>>> > On Fri, Jan 21, 2022 at 8:52 AM Paul T. Bauman
>>>>> > <[email protected]> wrote:
>>>>> >> Yes. The way HYPRE's memory model is set up is that ALL GPU
>>>>> >> allocations are "native" (i.e. [cuda,hip]Malloc) or, if unified
>>>>> >> memory is enabled, then ALL GPU allocations are unified memory
>>>>> >> (i.e. [cuda,hip]MallocManaged). Regarding HIP, there is an HMM
>>>>> >> implementation of hipMallocManaged planned, but it is not yet
>>>>> >> delivered AFAIK (and it will *not* support gfx906, e.g. RVII,
>>>>> >> FYI), so, today, under the covers, hipMallocManaged is calling
>>>>> >> hipHostMalloc. So, today, all your unified-memory allocations in
>>>>> >> HYPRE on HIP are doing CPU-pinned memory accesses. And
>>>>> >> performance is just truly terrible (as you might expect).
>>>>>
>>>>> Thanks for this important bit of information.
>>>>>
>>>>> And it sounds like when we add support to hand off Kokkos matrices
>>>>> and vectors (our current support for matrices on ROCm devices uses
>>>>> Kokkos) or add direct support for hipSPARSE, we'll avoid touching
>>>>> host memory in assembly-to-solve with hypre.
>>>>
>>>> It does not look like anyone has made Hypre work with HIP. Stefano
>>>> added a runex19_hypre_hip target 4 months ago, and hypre.py has some
>>>> HIP things.
>>>>
>>>> I have a user that would like to try this. No hurry, but can I get
>>>> an idea of a plan for this?
>>>>
>>>> Thanks,
>>>> Mark
>>>
>>> --
>>> Stefano
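To make Paul's point above concrete, here is a minimal standalone HIP sketch of the two allocation modes hypre chooses between at configure time. This is an illustration only, not hypre code; the variable names are mine and error checking is omitted. Compile with hipcc.

  /* The two allocation modes Paul describes: hypre uses ONE of these
     for ALL of its GPU data, chosen at configure time. */
  #include <hip/hip_runtime.h>
  #include <stdio.h>

  int main(void)
  {
    size_t  n = 1 << 20;
    double *native, *unified;

    /* "native" mode: device-resident memory; fast for kernels, but the
       host cannot dereference the pointer. */
    hipMalloc((void **)&native, n * sizeof(double));

    /* "unified" mode (--enable-unified-memory): every allocation goes
       through hipMallocManaged instead. Per Paul, today on HIP this
       falls back to pinned host memory (hipHostMalloc), so every
       device-side access crosses the CPU-GPU link. */
    hipMallocManaged((void **)&unified, n * sizeof(double));

    unified[0] = 1.0;  /* legal from the host */
    /* native[0] = 1.0; would typically crash on the host, just as a
       GPU agent touching an inaccessible address reports
       HSA_STATUS_ERROR_MEMORY_FAULT. */
    printf("unified[0] = %f\n", unified[0]);

    hipFree(native);
    hipFree(unified);
    return 0;
  }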
