MUMPS does not support doing only the forward or only the backward solve. You should ask the MUMPS developers to add an option for this; then we would be able to interface it in PETSc. Regarding the other Cholesky options (see https://petsc.org/release/overview/linear_solve_table/), PaStiX has the same problem as MUMPS; CHOLMOD is sequential; and SuperLU_DIST implements LU, so it would not help you even if the forward/backward solves were available.
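For reference, the serial-only pattern from the quoted question below (PETSc's own Cholesky together with solveForward/solveBackward) looks roughly like the following petsc4py sketch; the tridiagonal test matrix and the sizes are placeholders, not taken from the original script:

    # Minimal sketch (serial only, petsc4py assumed): compute L^{-1} b and
    # L^{-T} y with PETSc's native Cholesky factorization of A = L L^T.
    from petsc4py import PETSc

    n = 10
    A = PETSc.Mat().createAIJ([n, n], nnz=3, comm=PETSc.COMM_SELF)
    for i in range(n):                       # small s.p.d. stand-in for M + S
        A.setValue(i, i, 3.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
            A.setValue(i - 1, i, -1.0)
    A.assemble()

    pc = PETSc.PC().create(comm=PETSc.COMM_SELF)
    pc.setOperators(A)
    pc.setType(PETSc.PC.Type.CHOLESKY)
    pc.setFactorSolverType("petsc")          # same as -pc_factor_mat_solver_type petsc
    pc.setUp()
    F = pc.getFactorMatrix()                 # factored matrix holding A = L L^T

    b = A.createVecRight()
    b.set(1.0)
    y, x = b.duplicate(), b.duplicate()
    F.solveForward(b, y)                     # y = L^{-1} b
    F.solveBackward(y, x)                    # x = L^{-T} y (= A^{-1} b here)

With mumps or superlu_dist selected as the factor solver, the last two calls are exactly the operations that are not available, which is the limitation described above.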
If you want to go the inner-product route, SLEPc provides some support via its BV object, including basis orthogonalization with respect to a given inner-product matrix, see https://slepc.upv.es/documentation/current/docs/manualpages/BV/BVSetMatrix.html (a minimal slepc4py sketch is appended after the quoted messages below).

Jose

> On 5 Dec 2022, at 8:26, Johannes Haubner <[email protected]> wrote:
> 
> We are trying to write an interface to general-purpose optimization solvers
> for PDE-constrained optimal control problems in Hilbert spaces based on
> PETSc. Ipopt or the scipy implementation of L-BFGS work with the Euclidean
> inner product. For discretizations of PDE-constrained optimization problems
> in Hilbert spaces one typically wants to work with inner products of the
> form x^T A x (A s.p.d., A typically not equal to the identity matrix).
> 
> Not working with the correct inner product typically leads to mesh-dependent
> behavior of the optimization algorithm, which can be motivated by
> considering the optimization problem
> 
>     min_{x in X} 1/2 ||x||^2_X =: f(x)
> 
> where X denotes a Hilbert space. We choose x_0 != 0 and apply a gradient
> descent algorithm with step size 1.
> 
> Case 1: X = R^d, d in N, ||x||_X^2 = x^T x:
>     It holds nabla f(x) = x.
>     Applying gradient descent with step size 1: x_1 = x_0 - x_0 = 0.
>     Hence, we have convergence to the optimal solution in 1 iteration.
> 
> Case 2: X = H^1(Omega), Omega subset R^d, d in {2, 3}
>     (as a specific example for X being an infinite-dimensional Hilbert space),
>     ||x||_X^2 = int x(xi) x(xi) dxi + int nabla x(xi) . nabla x(xi) dxi.
>     The derivative of f(x) is an element of the dual space of X and is given by
>     Df(x)(delta_x) = int x(xi) delta_x(xi) dxi + int nabla x(xi) . nabla delta_x(xi) dxi
>     for delta_x in X.
>     The gradient (Riesz representation of the derivative) is nabla f = x.
>     Hence, gradient descent with step size 1: x_1 = x_0 - x_0 = 0.
> 
> Case 3: Discretization of Case 2 using finite elements:
>     f(x) := 1/2 x^T (M + S) x, where M denotes the mass matrix, S the
>     stiffness matrix, and x the degrees of freedom of the discretization.
>     Neglecting (by working with the Euclidean inner product) that x encodes
>     an H^1-function and just solving the discretized optimization problem
>     with gradient descent with step size 1 yields
>     nabla f(x) = (M + S) x
>     and x_1 = x_0 - (M + S) x_0,
>     where the condition number of (M + S) depends on the mesh size, i.e.
>     mesh-size-dependent convergence to the optimal solution (and not in
>     1 iteration).
> 
> One can either respect the correct inner product (in the example,
> x^T (M + S) x) in the implementation of the optimization algorithm (see e.g.
> https://github.com/funsim/moola) or do a "discrete hack". The latter
> consists of introducing an auxiliary variable y, applying the linear
> transformation y --> (L^T)^{-1} y =: x, where (M + S) = L L^T (this is why
> we need the action of the inverses of the Cholesky factors: applying L^{-1}
> is needed for the chain rule when computing the derivative), and optimizing
> for y instead of x.
> 
> Doing this trick gives, for the above example,
>     tilde f(y) := f(L^{-T} y) = 1/2 y^T L^{-1} (M + S) L^{-T} y = 1/2 y^T y,
> and gradient descent in y with step size 1 yields y_1 = y_0 - y_0 = 0.
> Therefore, also x_1 = L^{-T} y_1 = 0 (i.e., convergence in 1 iteration).
> 
> Johannes
> 
>> On 2. Dec 2022, at 13:48, Jose E. Roman <[email protected]> wrote:
>> 
>> As far as I know, the solveForward/Backward operations are not implemented
>> for external packages.
>> 
>> What are you trying to do?
>> One may be tempted to use the Cholesky decomposition to transform a
>> symmetric-definite generalized eigenvalue problem into a standard
>> eigenvalue problem, as is done e.g. in LAPACK,
>> https://netlib.org/lapack/lug/node54.html - but note that in SLEPc you do
>> not need to do this, as EPSSolve() will take care of the symmetry using
>> Cholesky without having to apply solveForward/Backward separately.
>> 
>> Jose
>> 
>>> On 2 Dec 2022, at 13:32, Johannes Haubner <[email protected]> wrote:
>>> 
>>> Given an s.p.d. matrix A and a right-hand side b, we want to compute
>>> (L^T)^{-1} b and L^{-1} b, where A = L L^T is the Cholesky decomposition
>>> of A. In serial, we are able to do this using
>>> pc_factor_mat_solver_type petsc and the "solveForward" and "solveBackward"
>>> operations (see
>>> https://github.com/JohannesHaubner/petsc-cholesky/blob/main/main.py).
>>> In parallel, we are not able to do this yet. According to p. 52 of
>>> https://slepc.upv.es/documentation/slepc.pdf , PETSc's Cholesky
>>> factorization is not parallel. However, using "mumps" or "superlu_dist"
>>> prevents us from using "solveForward" or "solveBackward". Is there a way
>>> of using solveBackward in parallel with a distributed matrix (MPI)?
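Regarding the inner-product route suggested at the top of this message, a minimal slepc4py sketch of A-orthogonalizing a block of vectors via BVSetMatrix might look as follows; the small tridiagonal matrix standing in for M + S and the number of columns are placeholders:

    # Hedged sketch, assuming slepc4py/petsc4py: orthogonalize a block of
    # vectors with respect to the inner product <x, y>_A = y^T A x.
    import sys
    import slepc4py
    slepc4py.init(sys.argv)
    from petsc4py import PETSc
    from slepc4py import SLEPc

    n, m = 10, 4                                 # placeholder sizes
    A = PETSc.Mat().createAIJ([n, n], nnz=3, comm=PETSc.COMM_SELF)
    for i in range(n):                           # small s.p.d. stand-in for M + S
        A.setValue(i, i, 3.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
            A.setValue(i - 1, i, -1.0)
    A.assemble()

    bv = SLEPc.BV().create(comm=PETSc.COMM_SELF)
    bv.setSizesFromVec(A.createVecRight(), m)    # n rows, m columns
    bv.setMatrix(A, False)                       # inner product defined by A
    bv.setFromOptions()
    bv.setRandom()                               # some initial columns
    bv.orthogonalize()                           # columns become A-orthonormal

The same BV calls also work in parallel with a matrix distributed over PETSc.COMM_WORLD.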

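To make the "discrete hack" described in the quoted message concrete, here is a hedged sketch of how the two triangular solves enter the chain rule. It reuses a matrix A and its factored PETSc matrix F with A = L L^T, obtained e.g. as in the first sketch, and scipy's L-BFGS-B stands in for any Euclidean-inner-product optimizer mentioned in the thread; all names are placeholders, not taken from the original code:

    # Hedged sketch: minimize f(x) = 1/2 x^T A x in the transformed variable
    # y = L^T x, where A = L L^T and F is the factored PETSc matrix.
    import numpy as np
    from petsc4py import PETSc
    from scipy.optimize import minimize

    def solve_transformed(A, F, y0):
        def fun(y_np):
            y = PETSc.Vec().createWithArray(np.array(y_np), comm=PETSc.COMM_SELF)
            x = y.duplicate()
            F.solveBackward(y, x)          # x = L^{-T} y
            gx = x.duplicate()
            A.mult(x, gx)                  # grad_x f(x) = A x
            val = 0.5 * x.dot(gx)          # f(x) = 1/2 x^T A x
            gy = gx.duplicate()
            F.solveForward(gx, gy)         # chain rule: grad_y = L^{-1} grad_x
            return val, gy.getArray().copy()
        res = minimize(fun, y0, jac=True, method="L-BFGS-B")
        y_opt = PETSc.Vec().createWithArray(res.x.copy(), comm=PETSc.COMM_SELF)
        x_opt = y_opt.duplicate()
        F.solveBackward(y_opt, x_opt)      # map the minimizer back: x = L^{-T} y
        return x_opt

For the quadratic objective of the example, grad_y tilde f(y) = L^{-1} (M + S) L^{-T} y = y, so the conditioning seen by the optimizer no longer depends on the mesh. A call would look like x = solve_transformed(A, F, np.ones(n)).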