Are you calling VecRestoreArray when you are done inserting the values?
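[For reference, a minimal sketch of the get/restore pairing being asked about, assuming a standalone vector named dfdx as in the message below; the fill values are placeholders:]

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec          dfdx;
  PetscScalar *df;
  PetscInt     i, n = 8, rstart, rend;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(VecCreate(PETSC_COMM_WORLD, &dfdx));
  PetscCall(VecSetSizes(dfdx, PETSC_DECIDE, n));
  PetscCall(VecSetFromOptions(dfdx)); /* honors -vec_type cuda at run time */

  PetscCall(VecGetOwnershipRange(dfdx, &rstart, &rend));
  PetscCall(VecGetArray(dfdx, &df)); /* host pointer, also for GPU vector types */
  for (i = 0; i < rend - rstart; i++) df[i] = (PetscScalar)(rstart + i); /* placeholder values */
  /* For GPU vector types the restore is what marks the host data as current,
     so the device copy is refreshed before the next device-side operation;
     skipping it leaves the device copy stale. */
  PetscCall(VecRestoreArray(dfdx, &df));

  PetscCall(VecView(dfdx, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(VecDestroy(&dfdx));
  PetscCall(PetscFinalize());
  return 0;
}

[Run with -vec_type cuda, the same code exercises the device path unchanged.]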
On Tue, Nov 1, 2022, 18:42 Carl-Johan Thore via petsc-users <[email protected]> wrote:

Thanks for the tips!

The suggested settings for GAMG did not yield better results, but hypre worked well right away, giving very good convergence!

A follow-up question then (I hope that's ok; and it could be related to GAMG not working, I'll check that). Once everything was running I discovered that my gradient vector dfdx, which I populate via an array df obtained from VecGetArray(dfdx, &df), doesn't get filled properly; it always contains only zeros. This is not the case when I run on the CPU, and df itself gets filled as it should even on the GPU, suggesting that either I'm not using VecGetArray properly, or I shouldn't use it at all for GPU computations?

Kind regards,
Carl-Johan

From: Mark Adams <[email protected]>
Sent: 31 October 2022 13:30
To: Carl-Johan Thore <[email protected]>
Cc: Matthew Knepley <[email protected]>; Barry Smith <[email protected]>; [email protected]
Subject: Re: [petsc-users] KSP on GPU

* You could try hypre or another preconditioner that you can afford, like LU or ASM, that works.

* If this matrix is SPD, you want to use

-fieldsplit_0_pc_gamg_esteig_ksp_type cg
-fieldsplit_0_pc_gamg_esteig_ksp_max_it 10

These will give better eigen estimates, and that is important. The differences between these estimates are not too bad. There is a safety factor (1.05 is the default) that you could increase with:

-fieldsplit_0_mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.1

* Finally, you could try -fieldsplit_0_pc_gamg_reuse_interpolation 1 if GAMG is still not working.

Use -fieldsplit_0_ksp_converged_reason and check the iteration count. And it is a good idea to check with hypre to make sure something is not going badly in terms of performance anyway. AMG is hard and hypre is a good solver.

Mark

On Mon, Oct 31, 2022 at 1:56 AM Carl-Johan Thore via petsc-users <[email protected]> wrote:

The GPU supports double precision, and I didn't explicitly tell PETSc to use float when compiling, so I guess it uses double? What's the easiest way to check?

Barry, running with -ksp_view shows that the solver options are the same for CPU and GPU. The only difference is the coarse-grid solver for gamg ("the package used to perform factorization:"), which is petsc for the CPU and cusparse for the GPU. I tried forcing the GPU to use petsc via -fieldsplit_0_mg_coarse_sub_pc_factor_mat_solver_type, but then KSP failed to converge even on the first topology optimization iteration.

-ksp_view also shows differences in the eigenvalues from the Chebyshev smoother. For example,

GPU:
Down solver (pre-smoother) on level 2 -------------------------------
  KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
    type: chebyshev
      eigenvalue targets used: min 0.109245, max 1.2017
      eigenvalues provided (min 0.889134, max 1.09245)

CPU:
      eigenvalue targets used: min 0.112623, max 1.23886
      eigenvalues provided (min 0.879582, max 1.12623)

But I guess such differences are expected?

/Carl-Johan
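[As an aside on the precision question raised above, a quick runtime check; a sketch, relying only on the fact that PETSc uses double precision unless configured with --with-precision=single:]

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* PetscReal/PetscScalar are double unless PETSc was configured with
     --with-precision=single; PetscDefined queries the configure-time macro. */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "PetscReal is %s (%d bytes)\n",
                        PetscDefined(USE_REAL_SINGLE) ? "single" : "double",
                        (int)sizeof(PetscReal)));
  PetscCall(PetscFinalize());
  return 0;
}

[Alternatively, grepping for PETSC_USE_REAL in $PETSC_DIR/$PETSC_ARCH/include/petscconf.h should show the configured precision.]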
From: Matthew Knepley <[email protected]>
Sent: 30 October 2022 22:00
To: Barry Smith <[email protected]>
Cc: Carl-Johan Thore <[email protected]>; [email protected]
Subject: Re: [petsc-users] KSP on GPU

On Sun, Oct 30, 2022 at 3:52 PM Barry Smith <[email protected]> wrote:

In general you should expect similar but not identical convergence behavior.

I suggest running with all the monitoring you can: -ksp_monitor_true_residual -fieldsplit_0_monitor_true_residual -fieldsplit_1_monitor_true_residual, and compare the convergence between the CPU and GPU. Also run with -ksp_view and check that the various solver options are the same (they should be).

Is the GPU using float or double?

  Matt

  Barry

On Oct 30, 2022, at 11:02 AM, Carl-Johan Thore via petsc-users <[email protected]> wrote:

Hi,

I'm solving a topology optimization problem with Stokes flow discretized by a stabilized Q1-Q0 finite element method, using BiCGStab with the fieldsplit preconditioner to solve the linear systems. The implementation is based on DMStag, runs on Ubuntu via WSL2, and works fine with PETSc-3.18.1 on multiple CPU cores and the following options for the preconditioner:

-fieldsplit_0_ksp_type preonly \
-fieldsplit_0_pc_type gamg \
-fieldsplit_0_pc_gamg_reuse_interpolation 0 \
-fieldsplit_1_ksp_type preonly \
-fieldsplit_1_pc_type jacobi

However, when I enable GPU computations by adding two options:

...
-dm_vec_type cuda \
-dm_mat_type aijcusparse \
-fieldsplit_0_ksp_type preonly \
-fieldsplit_0_pc_type gamg \
-fieldsplit_0_pc_gamg_reuse_interpolation 0 \
-fieldsplit_1_ksp_type preonly \
-fieldsplit_1_pc_type jacobi

KSP still works fine for the first couple of topology optimization iterations but then stops with "Linear solve did not converge due to DIVERGED_DTOL ..".

My question is whether I should expect the GPU versions of the linear solvers and preconditioners to function exactly as their CPU counterparts (I got this impression from the documentation), in which case I've probably made some mistake in my own code, or whether there are other/additional settings or modifications I should use to run on the GPU (an NVIDIA Quadro T2000)?

Kind regards,

Carl-Johan

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
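[To catch a DIVERGED_DTOL failure like the one described above programmatically, one can query the converged reason after each solve. A self-contained sketch on a stand-in tridiagonal system; the real operator and options come from the DMStag application:]

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat                A;
  Vec                b, x;
  KSP                ksp;
  KSPConvergedReason reason;
  PetscInt           i, rstart, rend, n = 100;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Stand-in tridiagonal SPD system; the real operator comes from the
     DMStag discretization in the actual application. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A)); /* honors -mat_type aijcusparse */
  PetscCall(MatSetUp(A));
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0));

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* picks up -ksp_type, -pc_type, ... */
  PetscCall(KSPSolve(ksp, b, x));

  /* Check the outcome explicitly instead of scanning the log output */
  PetscCall(KSPGetConvergedReason(ksp, &reason));
  if (reason < 0) {
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Solve failed: %s\n", KSPConvergedReasons[reason]));
  } else {
    PetscInt its;
    PetscCall(KSPGetIterationNumber(ksp, &its));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Converged in %d iterations (%s)\n", (int)its, KSPConvergedReasons[reason]));
  }

  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&b));
  PetscCall(VecDestroy(&x));
  PetscCall(PetscFinalize());
  return 0;
}

[With -fieldsplit_0_ksp_converged_reason, as Mark suggests above, the sub-solves report the same information per split.]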

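[Finally, the flags discussed in this thread can also be pinned from code, so CPU and GPU runs start from an identical solver configuration. A sketch using the global options database; the option names are the ones quoted above:]

#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* NULL selects the global options database. Values set here, after
     PetscInitialize has parsed argv, override the same flags given on
     the command line, which pins the solver configuration. */
  PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_gamg_esteig_ksp_type", "cg"));
  PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_gamg_esteig_ksp_max_it", "10"));
  PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_mg_levels_ksp_chebyshev_esteig", "0,0.05,0,1.1"));
  PetscCall(PetscOptionsSetValue(NULL, "-fieldsplit_0_pc_gamg_reuse_interpolation", "1"));

  /* ... set up the DMStag, matrices, and KSP here, then call
     KSPSetFromOptions() so the values above take effect ... */

  PetscCall(PetscFinalize());
  return 0;
}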