Please send the full output from runs with the monitors I mentioned turned on. 
If one approach is converging and the other is not, we should be able to see it 
in the convergence output: the histories printed for the two runs will drift 
further and further apart.

  Barry


> On Oct 31, 2022, at 1:56 AM, Carl-Johan Thore <carl-johan.th...@liu.se> wrote:
> 
> The GPU supports double precision and I didn’t explicitly tell PETSc to use 
> float when compiling, so
> I guess it uses double? What’s the easiest way to check?
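> 
> (One way to check, assuming a standard PETSc build layout: the configured 
> precision is recorded in $PETSC_DIR/$PETSC_ARCH/include/petscconf.h, so 
> 
>   grep PETSC_USE_REAL $PETSC_DIR/$PETSC_ARCH/include/petscconf.h 
> 
> should show PETSC_USE_REAL_DOUBLE for a double-precision build; single 
> precision would have required --with-precision=single at configure time, and 
> the configure options echoed by -log_view show the same information.)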
>  
> Barry, running -ksp_view shows that the solver options are the same for CPU 
> and GPU. The only
> difference is the coarse grid solver for gamg (“the package used to perform 
> factorization:”) which
> is petsc for CPU and cusparse for GPU. I tried forcing the GPU to use petsc 
> via
> -fieldsplit_0_mg_coarse_sub_pc_factor_mat_solver_type, but then ksp failed to 
> converge
> even on the first topology optimization iteration.  
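> 
> (Presumably with a value, i.e. something along the lines of 
> 
>   -fieldsplit_0_mg_coarse_sub_pc_factor_mat_solver_type petsc 
> 
> so that both builds use the same factorization package on the coarse grid.)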
>  
> -ksp_view also shows differences in the eigenvalues from the Chebyshev 
> smoother. For example,
>  
> GPU: 
>    Down solver (pre-smoother) on level 2 -------------------------------
>           KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
>             type: chebyshev
>               eigenvalue targets used: min 0.109245, max 1.2017
>               eigenvalues provided (min 0.889134, max 1.09245) with
>  
> CPU: 
>               eigenvalue targets used: min 0.112623, max 1.23886
>               eigenvalues provided (min 0.879582, max 1.12623)
>  
> But I guess such differences are expected?
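> 
> (As an experiment, not something tried here, one could also pin the Chebyshev 
> bounds to the same values in both runs, e.g. 
> 
>   -fieldsplit_0_mg_levels_ksp_chebyshev_eigenvalues 0.11,1.24 
> 
> assuming the standard KSPChebyshevSetEigenvalues option applies under this 
> prefix; with identical bounds the two smoothers should be directly comparable.)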
>  
> /Carl-Johan
>  
> From: Matthew Knepley <knep...@gmail.com> 
> Sent: 30 October 2022 22:00
> To: Barry Smith <bsm...@petsc.dev>
> Cc: Carl-Johan Thore <carl-johan.th...@liu.se>; petsc-users@mcs.anl.gov
> Subject: Re: [petsc-users] KSP on GPU
>  
> On Sun, Oct 30, 2022 at 3:52 PM Barry Smith <bsm...@petsc.dev> wrote:
>  
>    In general you should expect similar but not identical convergence 
> behavior. 
>  
>     I suggest running with all the monitoring you can: 
> -ksp_monitor_true_residual -fieldsplit_0_monitor_true_residual 
> -fieldsplit_1_monitor_true_residual, and compare the convergence between the 
> CPU and GPU runs. Also run with -ksp_view and check that the various solver 
> options are the same (they should be).
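> 
> For example, one could capture and compare the two runs like this (a sketch; 
> the executable name "topopt" and the options file are placeholders, and 
> -options_file is just one way of passing the usual solver options): 
> 
>   # CPU run ("stokes.opts" holds the usual solver options)
>   ./topopt -options_file stokes.opts \
>       -ksp_view -ksp_monitor_true_residual \
>       -fieldsplit_0_monitor_true_residual \
>       -fieldsplit_1_monitor_true_residual > cpu.log 2>&1
> 
>   # GPU run: identical options plus the CUDA types
>   ./topopt -options_file stokes.opts \
>       -dm_vec_type cuda -dm_mat_type aijcusparse \
>       -ksp_view -ksp_monitor_true_residual \
>       -fieldsplit_0_monitor_true_residual \
>       -fieldsplit_1_monitor_true_residual > gpu.log 2>&1
> 
>   # compare the convergence histories
>   diff cpu.log gpu.log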
>  
> Is the GPU using float or double?
>  
>    Matt
>  
>   Barry
>  
> 
> 
> On Oct 30, 2022, at 11:02 AM, Carl-Johan Thore via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
>  
> Hi,
>  
> I'm solving a topology optimization problem with Stokes flow discretized by a 
> stabilized Q1-Q0 finite element method
> and using BiCGStab with the fieldsplit preconditioner to solve the linear 
> systems. The implementation
> is based on DMStag, runs on Ubuntu via WSL2, and works fine with PETSc-3.18.1 
> on multiple CPU cores and the following
> options for the preconditioner:
>  
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi 
>  
> However, when I enable GPU computations by adding two options -
>  
> ...
> -dm_vec_type cuda \
> -dm_mat_type aijcusparse \
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi 
>  
> - KSP still works fine for the first couple of topology optimization 
> iterations, but then
> stops with "Linear solve did not converge due to DIVERGED_DTOL ..".
>  
> My question is whether I should expect the GPU versions of the linear solvers 
> and preconditioners
> to function exactly as their CPU counterparts (I got this impression from the 
> documentation),
> in which case I've probably made some mistake in my own code, or whether 
> there are other/additional
> settings or modifications I should use to run on the GPU (an NVIDIA Quadro 
> T2000)?
>  
> Kind regards,
>  
> Carl-Johan
>  
> 
>  
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/
