We are trying to write an interface, based on PETSc, to general-purpose
optimization solvers for PDE-constrained optimal control problems in Hilbert
spaces. Ipopt and the scipy implementation of L-BFGS work with the Euclidean
inner product, whereas for discretizations of PDE-constrained optimization
problems in Hilbert spaces one typically wants to work with inner products of
the form x^T A x (A s.p.d., and typically not equal to the identity matrix).
Not working with the correct inner product typically leads to mesh-dependent
behavior of the optimization algorithm. This can be motivated by considering
the optimization problem
    min_{x in X} 1/2 ||x||_X^2 =: f(x),
where X denotes a Hilbert space. We choose x_0 != 0 and apply gradient descent
with step size 1.
Case 1: X = R^d, d in N, ||x||_X^2 = x^T x:
        We have nabla f(x) = x.
        Applying gradient descent with step size 1 gives x_1 = x_0 - x_0 = 0.
        Hence, we have convergence to the optimal solution in 1 iteration.
Case 2: X = H^1(Omega), Omega subset R^d, d in {2, 3}
        (as a specific example of X being an infinite-dimensional Hilbert space),
        ||x||_X^2 = int x(xi) x(xi) dxi + int nabla x(xi) . nabla x(xi) dxi.
        The derivative of f(x) is an element of the dual space of X and is
        given by
        Df(x)(delta_x) = int x(xi) delta_x(xi) dxi
                         + int nabla x(xi) . nabla delta_x(xi) dxi
        for delta_x in X.
        The gradient (Riesz representation of the derivative) is given by
        nabla f(x) = x.
        Hence, gradient descent with step size 1 gives x_1 = x_0 - x_0 = 0.
Case 3: Discretization of Case 2 using finite elements:
        f(x) := 1/2 x^T (M + S) x, where M denotes the mass matrix, S the
        stiffness matrix, and x the degrees of freedom of the discretization.
        Neglecting that x encodes an H^1-function (i.e., working with the
        Euclidean inner product) and just solving the discretized optimization
        problem with gradient descent with step size 1 yields
        nabla f(x) = (M + S) x
        and x_1 = x_0 - (M + S) x_0,
        where the condition number of (M + S) depends on the mesh size, i.e.,
        convergence to the optimal solution is mesh dependent (and not in
        1 iteration); see the small numerical sketch below.
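To make the mesh dependence in Case 3 concrete, here is a minimal numpy sketch
(the 1D piecewise-linear mass/stiffness matrices, the mesh sizes, and the
tolerance are illustrative assumptions, not part of our actual code). With a
fixed step size of 1 the Euclidean iteration would even diverge once the
largest eigenvalue of M + S exceeds 2, so the sketch uses an exact line search;
the point is that the iteration count grows with the condition number of M + S,
i.e. under mesh refinement, whereas the Riesz gradient with respect to the
(M + S)-inner product still gives convergence in one step:

import numpy as np

def mass_stiffness(n):
    # 1D P1 mass matrix M and stiffness matrix S on n interior nodes of (0, 1)
    h = 1.0 / (n + 1)
    M = h / 6.0 * (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
    S = 1.0 / h * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
    return M, S

def steepest_descent_iterations(A, x0, tol=1e-6, maxit=200000):
    # Steepest descent with exact line search for f(x) = 1/2 x^T A x,
    # using the Euclidean gradient g = A x
    x = x0.copy()
    for k in range(maxit):
        g = A @ x
        if np.linalg.norm(g) < tol:
            return k
        alpha = (g @ g) / (g @ (A @ g))   # exact line search
        x = x - alpha * g
    return maxit

for n in (10, 40, 160):
    M, S = mass_stiffness(n)
    A = M + S
    its = steepest_descent_iterations(A, np.ones(n))
    # Riesz gradient w.r.t. x^T (M + S) x: solve (M + S) g = (M + S) x, so g = x
    # and x_1 = x_0 - g = 0, independent of the mesh.
    print(f"n = {n:4d}  cond(M+S) = {np.linalg.cond(A):9.2e}  "
          f"Euclidean iterations = {its}")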
One can either respect the correct inner product (in the example, x^T (M + S) x)
in the implementation of the optimization algorithm (see e.g.
https://github.com/funsim/moola) or do a "discrete hack". The latter consists
of introducing an auxiliary variable y, applying the linear transformation
y --> (L^T)^{-1} y =: x, where (M + S) = L L^T is the Cholesky decomposition,
and optimizing for y instead of x. This is why we need the action of the
inverses of the Cholesky factors: L^{-T} maps y to x, and L^{-1} is needed when
applying the chain rule to compute the gradient with respect to y.
For the above example, this trick gives
    tilde f(y) := f(L^{-T} y) = 1/2 y^T L^{-1} (M + S) L^{-T} y = 1/2 y^T y,
and gradient descent in y with step size 1 yields y_1 = y_0 - y_0 = 0.
Therefore, also x_1 = L^{-T} y_1 = 0 (i.e., convergence in 1 iteration).
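In dense serial linear algebra the hack looks as follows (a minimal numpy/scipy
sketch with the same illustrative 1D matrices as above; the two triangular
solves stand in for the forward/backward substitutions we would like to hand to
a parallel Cholesky factorization on the PETSc side):

import numpy as np
from scipy.linalg import cholesky, solve_triangular

# Illustrative 1D P1 mass + stiffness matrices on a uniform mesh of (0, 1)
n = 40
h = 1.0 / (n + 1)
M = h / 6.0 * (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
S = 1.0 / h * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
A = M + S

L = cholesky(A, lower=True)                    # A = L L^T

x0 = np.ones(n)
y0 = L.T @ x0                                  # change of variables: x = L^{-T} y

# Chain rule: the gradient of tilde f(y) = f(L^{-T} y) is L^{-1} (M + S) L^{-T} y,
# i.e. one backward solve with L^T and one forward solve with L per gradient.
x = solve_triangular(L.T, y0, lower=False)     # backward solve: x = L^{-T} y0
g = solve_triangular(L, A @ x, lower=True)     # forward solve:  g = L^{-1} (A x)

y1 = y0 - g                                    # one gradient step, step size 1
x1 = solve_triangular(L.T, y1, lower=False)
print(np.linalg.norm(y1), np.linalg.norm(x1))  # both ~ 0: convergence in 1 step

In our actual setting M + S is a distributed PETSc matrix, which is why we are
asking about solveForward/solveBackward for parallel factorizations.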
Johannes
> On 2. Dec 2022, at 13:48, Jose E. Roman <[email protected]> wrote:
>
> As far as I know, the solveForward/Backward operations are not implemented
> for external packages.
>
> What are you trying to do? One may be tempted to use the Cholesky
> decomposition to transform a symmetric-definite generalized eigenvalue
> problem into a standard eigenvalue problem, as is done e.g. in LAPACK
> https://netlib.org/lapack/lug/node54.html - But note that in SLEPc you do not
> need to do this, as EPSSolve() will take care of symmetry using Cholesky
> without having to apply solveForward/Backward separately.
>
> Jose
>
>
>
>
>> On 2 Dec 2022, at 13:32, Johannes Haubner <[email protected]> wrote:
>>
>> Given an s.p.d. matrix A and a right-hand side b, we want to compute
>> (L^T)^{-1} b and L^{-1} b, where A = L L^T is the Cholesky decomposition of A.
>> In serial, we are able to do this using pc_factor_mat_solver_type petsc and
>> using "solveForward" and "solveBackward" (see
>> https://github.com/JohannesHaubner/petsc-cholesky/blob/main/main.py). In
>> parallel, we are not able to do this yet. According to p. 52
>> https://slepc.upv.es/documentation/slepc.pdf , PETSc's Cholesky
>> factorization is not parallel. However, using "mumps" or "superlu_dist"
>> prevents us from using "solveForward" or "solveBackward". Is there a way of
>> using solveBackward in parallel with a distributed matrix (MPI)?
>