Hi Junchao,

We have recently been using ASM + LU for 2D problems on both CPU and GPU.
However, I found that this method has very poor weak scaling: the cost of
PCApply increases by about a factor of 4 each time I increase the problem size
in one dimension by a factor of 2 while keeping the load per core/GPU the same.
The total number of GMRES iterations does not increase, just the cost of
PCApply (and PCSetUp). Is this scaling behavior expected? Any ideas on how to
optimize the preconditioner?
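
One way to quantify the trend described above is to fit a power law, cost ~ N^p, to the measured PCApply times, where N is the problem size in the dimension being refined. The sketch below is only illustrative: the function name is hypothetical, and the timings are made-up placeholders chosen to mimic the "factor of 4 per doubling" pattern, not measured data. Ideal weak scaling would give p close to 0.

```python
import math


def scaling_exponent(sizes, times):
    """Least-squares slope of log(time) versus log(size).

    A slope near 0 means the cost stays flat as the problem grows (ideal
    weak scaling); a slope near 2 means the cost quadruples per doubling.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


# Hypothetical placeholder timings (seconds), NOT measurements:
# the size doubles in one dimension and the PCApply time quadruples.
sizes = [256, 512, 1024]
pcapply_times = [0.5, 2.0, 8.0]

p = scaling_exponent(sizes, pcapply_times)
print(f"fitted exponent p = {p:.2f}")
```

In practice the times would come from the PCApply and PCSetUp rows of a -log_view summary for each run.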

Thank you.

-Justin

From: Junchao Zhang <[email protected]>
Date: Monday, April 14, 2025 at 7:35 PM
To: Angus, Justin Ray <[email protected]>
Cc: [email protected] <[email protected]>, Ghosh, Debojyoti 
<[email protected]>
Subject: Re: [petsc-dev] Additive Schwarz Method + ILU on GPU platforms

PETSc supports ILU0/ICC0 numeric factorization (without reordering) followed by
triangular solves on GPUs. This is done by calling vendor libraries (e.g.,
cuSPARSE). We also have the options -pc_factor_mat_factor_on_host <bool> and
-pc_factor_mat_solve_on_host <bool> to force the factorization and MatSolve to
run on the host for device matrix types.

You can try these to see whether they work for your case.
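
As a rough sketch of how the options above might be combined for a GMRES + ASM + ILU run on a GPU, the snippet below assembles a command line. Several things here are assumptions rather than part of the message: the helper name and the executable ./my_app are hypothetical, the "sub_" prefix assumes the default options prefix that PCASM gives its subdomain solvers, and -mat_type aijcusparse / -vec_type cuda only take effect if the application sets its matrix and vector types from the options database.

```python
import shlex


def asm_ilu_gpu_options(factor_on_host=False, solve_on_host=False,
                        sub_prefix="sub_"):
    """Build a PETSc option list for GMRES + ASM with ILU subdomain solves.

    sub_prefix is an assumption: PCASM subdomain solvers normally read
    their options with a "sub_" prefix.
    """
    def flag(value):
        return "true" if value else "false"

    return [
        "-ksp_type", "gmres",
        "-pc_type", "asm",
        f"-{sub_prefix}pc_type", "ilu",
        # CUDA matrix/vector types (assumes an NVIDIA build of PETSc).
        "-mat_type", "aijcusparse",
        "-vec_type", "cuda",
        # The two switches named in the message, applied to the sub-PC.
        f"-{sub_prefix}pc_factor_mat_factor_on_host", flag(factor_on_host),
        f"-{sub_prefix}pc_factor_mat_solve_on_host", flag(solve_on_host),
    ]


# Example: factor on the host, keep the triangular solves on the device.
# "./my_app" is a placeholder executable name.
command = ["./my_app"] + asm_ilu_gpu_options(factor_on_host=True)
print(shlex.join(command))
```

Running with -ksp_view added should show which options the subdomain solvers actually picked up.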

--Junchao Zhang


On Mon, Apr 14, 2025 at 4:39 PM Angus, Justin Ray via petsc-dev 
<[email protected]> wrote:
Hello,

A project I work on uses GMRES via PETSc. In particular, we have had good
success using the Additive Schwarz Method + ILU preconditioner in a CPU-based
code. I found it stated online that “Parts of most preconditioners run directly
on the GPU” (https://petsc.org/release/faq/). Is ASM + ILU also available for
GPU platforms?

-Justin
