MatGetDiagonal_SeqAIJCUSPARSE() is missing and an implementation should be 
added. See, for example, this: 
https://stackoverflow.com/questions/60311408/how-to-get-the-diagonal-of-a-sparse-matrix-in-cusparse
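
As a rough illustration of the idea (not the code from that link, and not PETSc's implementation), a CSR diagonal extraction can be sketched in plain C as below. The name csr_get_diagonal and its argument layout are hypothetical, and the choice to return 0 for rows with no stored diagonal entry is an assumption. Each row is handled independently, so on the GPU the outer loop would presumably become one thread per row, which keeps the data on the device.

```c
#include <assert.h>

/* Hypothetical helper, not a PETSc or cuSPARSE function.
 * Copies the diagonal of a square CSR matrix into diag[0..n_rows-1].
 * row_ptr has n_rows+1 entries; col_idx and val hold the stored nonzeros.
 * Assumption: rows without a stored diagonal entry get 0.0. */
void csr_get_diagonal(int n_rows, const int *row_ptr, const int *col_idx,
                      const double *val, double *diag)
{
  assert(n_rows >= 0);
  /* Rows are independent: in a GPU kernel, i would be the thread index. */
  for (int i = 0; i < n_rows; i++) {
    double d = 0.0;
    /* Scan the stored entries of row i for column index i. */
    for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
      if (col_idx[k] == i) {
        d = val[k];
        break;
      }
    }
    diag[i] = d;
  }
}
```

If the column indices within each row are sorted, the inner scan could be replaced by a binary search, but the linear scan makes no such assumption.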

Jose

> On 8 Jun 2022, at 3:21, Mark Adams <[email protected]> wrote:
> 
> I am looking at a TS/SNES/KSP/GAMG solve with Landau, which is all on the GPU, 
> but it looks like MatGetDiagonal (see attached), and to a lesser extent 
> VecPointwiseMult (biggest red band on the right side under PCApply), are 
> resulting in expensive CPU-GPU movement. MatGetDiagonal on the fine grid is 
> taking about 10x the time of a TFQMR/GAMG iteration.
> 
> Attached is a view of this with CUDA and an nsys data file with Kokkos that 
> is pretty much the same.
> 
> Any thoughts on how to fix this?
> 
> Thanks,
> Mark
> <Screen Shot 2022-06-07 at 8.31.20 PM.png><output_ex2_3d_kokkos.nsys-rep>
