Hello,
I am interested in calculating the gradients for an optimization problem with
one goal function and one constraint function, both of which require TSAdjoint
for their adjoints.
I’d like to compute each of their adjoints in a separate call, but this does
not seem to be possible without compromises. For instance, one could set up
TSCreateQuadratureTS() and TSSetCostGradients() with a different quadrature
(and its gradients) before each adjoint call, one at a time. However, this
would evaluate the cost functions during the backward run, whereas one
typically computes the cost functions in a routine separate from the adjoint
call (e.g., in line-search evaluations).
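Concretely, the pattern I have in mind looks like this (just a sketch,
assuming petsc4py exposes TSCreateQuadratureTS() as ts.createQuadratureTS();
quad_goal and quad_constraint are hypothetical RHS callbacks for the two
integrands):

    # Sketch: one backward run per cost function, with the quadrature
    # attached so that it is evaluated during the backward run.
    quadts = ts.createQuadratureTS(forward=False)
    quadts.setRHSFunction(quad_goal)
    ts.setCostGradients([adj_u_goal], [adj_p_goal])
    ts.adjointSolve()
    # ... later, for the constraint:
    quadts.setRHSFunction(quad_constraint)
    ts.setCostGradients([adj_u_constraint], [adj_p_constraint])
    ts.adjointSolve()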
Alternatively, one could set up TSCreateQuadratureTS() with both the goal and
the constraint functions evaluated during the forward run (as is typically
done when computing the cost function). The problem is that the adjoint call
then requires two sets of gradients in TSSetCostGradients(), and their
adjoints are computed together, which costs twice as much if your routines
for the goal and the constraint gradients are separate.
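In code, that variant would be roughly (again a sketch; quad_both is a
hypothetical callback evaluating both integrands, and the adjoint seed pairs
for the two cost functions are passed together):

    quadts = ts.createQuadratureTS(forward=True)  # integrate costs forward
    quadts.setRHSFunction(quad_both)
    ts.solve(u)
    # A single backward run computes both gradients at once:
    ts.setCostGradients([adj_u_goal, adj_u_constraint],
                        [adj_p_goal, adj_p_constraint])
    ts.adjointSolve()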
The only solution I can think of is to set up TSCreateQuadratureTS() with both
the goal and the constraint functions in the forward run. Then, before each
adjoint call, reset TSCreateQuadratureTS() with just the cost function I am
interested in (either the goal or the constraint) and pass a single pair of
vectors to TSSetCostGradients(). Will this work? Are there better
alternatives?
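That is, something along these lines (sketch only, with the same hypothetical
quad_both/quad_goal callbacks as above):

    # Forward run integrating both cost functions:
    ts.createQuadratureTS(forward=True).setRHSFunction(quad_both)
    ts.solve(u)
    # Before the adjoint call, reset the quadrature to a single cost:
    ts.createQuadratureTS(forward=True).setRHSFunction(quad_goal)
    ts.setCostGradients([adj_u_goal], [adj_p_goal])
    ts.adjointSolve()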
Even if that works, there is the problem that the trajectory rewinds to the
beginning when we perform a TSAdjointSolve() call. Subsequent calls to
TSAdjointSolve() (for instance, for another cost function) are then invalid
because the trajectory is no longer positioned at the end of the simulation.
One has to re-run the forward problem to bring it back to the end. Is there a
quick way to set the trajectory state to the last time step without having to
run the forward problem? I am attaching an example to illustrate this issue.
One can uncomment the three marked lines near the end of the script (the
ts.setTime(), ts.setTimeStep(), and ts.solve() calls) to obtain the right
value of the derivative.
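For reference, the workaround currently commented out in the attached script
is simply to re-run the forward solve before the second adjoint call:

    ts.setTime(0.0)
    ts.setTimeStep(dt)
    ts.solve(u)        # trajectory now ends at the final time again
    ts.adjointSolve()  # valid for the second cost function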
Thanks
Miguel
Miguel A. Salazar de Troya
Postdoctoral Researcher, Lawrence Livermore National Laboratory
B141
Rm: 1085-5
Ph: 1(925) 422-6411
import sys, petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc
import numpy as np

# Model problem: c * du/dt = b * u, u(0) = a,
# with exact solution u(t) = a * exp(b / c * t).
a = 2.0
b = 4.0
c = 3.0
class SimpleODE(object):
    n = 1
    comm = PETSc.COMM_SELF

    def __init__(self, b, c, deriv="b"):
        self.b_ = b
        self.c_ = c
        self.deriv = deriv  # which parameter to differentiate against

    def initialCondition(self, u):
        u[0] = a
        u.assemble()

    def evalIFunction(self, ts, t, u, udot, f):
        # Implicit form: F(t, u, udot) = c * udot - b * u = 0
        b = self.b_
        c = self.c_
        f[0] = c * udot[0] - b * u[0]
        f.assemble()

    def evalIJacobian(self, ts, t, u, udot, shift, A, B):
        # J = shift * dF/d(udot) + dF/du = c * shift - b
        J = A
        b = self.b_
        c = self.c_
        J[0, 0] = c * shift - b * 1.0
        J.assemble()
        if A != B:
            B.assemble()
        return True  # same nonzero pattern

    def evalIJacobianP(self, ts, t, u, udot, shift, C):
        # Jacobian of F w.r.t. the parameter: dF/dc = udot, dF/db = -u
        Jp = C
        if self.deriv == "c":
            Jp[0, 0] = udot[0]
        else:
            Jp[0, 0] = -u[0]
        Jp.assemble()
        return True
OptDB = PETSc.Options()
deriv = "c"
ode = SimpleODE(b, c, deriv=deriv)
J = PETSc.Mat().createDense([ode.n, ode.n], comm=ode.comm)
J.setUp()
Jp = PETSc.Mat().createDense([ode.n, 1], comm=ode.comm)
Jp.setUp()
u = PETSc.Vec().createSeq(ode.n, comm=ode.comm)
f = u.duplicate()
# Adjoint work vectors: one (lambda, mu) pair per cost function
adj_u = [PETSc.Vec().createSeq(ode.n, comm=ode.comm)]
adj_p = [PETSc.Vec().createSeq(1, comm=ode.comm)]
ts = PETSc.TS().create(comm=ode.comm)
ts.setProblemType(ts.ProblemType.NONLINEAR)
ts.setType(ts.Type.THETA)
ts.setTheta(0.5)
ts.setIFunction(ode.evalIFunction, f)
ts.setIJacobian(ode.evalIJacobian, J)
ts.setIJacobianP(ode.evalIJacobianP, Jp)
ts.setSaveTrajectory()  # checkpointing, required by adjointSolve()
ts.setTime(0.0)
dt = 0.001
ts.setTimeStep(dt)
T = 1.0
ts.setMaxTime(T)
ts.setMaxSteps(1000)
ts.setExactFinalTime(PETSc.TS.ExactFinalTime.MATCHSTEP)
ts.setFromOptions()
ode.initialCondition(u)
ts.solve(u)
print(f"Numerical solution {u.array[0]}")
print(f"Analytical solution: {a * np.exp(b / c * T)}")
# First adjoint solve. Terminal cost function is u(T):
# seed dPhi/du = 1, dPhi/dp = 0.
adj_u[0][0] = 1.0
adj_u[0].assemble()
adj_p[0][0] = 0.0
adj_p[0].assemble()
ts.setCostGradients(adj_u, adj_p)
ts.adjointSolve()
print(f"Numerical derivative {adj_p[0].array[0]}")
if deriv == "c":
    print(f"Analytical derivative w.r.t c: {-a * T * b / c**2 * np.exp(b / c * T)}")
else:
    print(f"Analytical derivative w.r.t b: {a * T / c * np.exp(b / c * T)}")
# Second adjoint solve. Terminal cost function is 2.0 * u(T).
adj_u[0][0] = 2.0
adj_u[0].assemble()
adj_p[0][0] = 0.0
adj_p[0].assemble()
# Uncommenting the three lines below is needed to call adjointSolve() again
# on a different cost function: the forward problem must be re-run so that
# the trajectory ends at the final time step.
#ts.setTime(0.0)
#ts.setTimeStep(dt)
#ts.solve(u)
ts.setCostGradients(adj_u, adj_p)
ts.adjointSolve()
print(f"Numerical derivative {adj_p[0].array[0]}")
if deriv == "c":
    print(f"Analytical derivative w.r.t c: {-2.0 * a * T * b / c**2 * np.exp(b / c * T)}")
else:
    print(f"Analytical derivative w.r.t b: {2.0 * a * T / c * np.exp(b / c * T)}")
del ode, J, Jp, u, f, ts, adj_u, adj_p