I'm porting some MATLAB code to Julia.
The optimization objective function evaluates the main cost function and
its gradient simultaneously: some of the interim results from the cost
calculation are plugged into the gradient calculation to avoid doing the
same work twice. Here is the actual function:
function SparseFilteringObj(W, X, N)
    # Reshape W into matrix form
    W = reshape(W, (N, size(X, 1)))

    # Feed forward
    F  = W * X                 # linear activation
    Fs = sqrt(F.^2 + 1e-8)     # soft-absolute activation
    NFs, L2Fs = l2row(Fs)      # normalize by rows
    Fhat, L2Fn = l2row(NFs')   # normalize by columns

    # Compute objective function
    Obj = sum(sum(Fhat, 2), 1)

    # Backprop through each feedforward step
    DeltaW = l2grad(NFs', Fhat, L2Fn, ones(size(Fhat)))
    DeltaW = l2grad(Fs, NFs, L2Fs, DeltaW')
    DeltaW = (DeltaW .* (F ./ Fs)) * X'
    DeltaW = DeltaW[:]

    return Obj, DeltaW
end
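For context, the two helpers are direct ports of the `l2row` / `l2grad` functions from the MATLAB supplementary material linked at the bottom. A rough Julia translation (my own sketch, written in current Julia syntax with `dims` keywords; verify it against the original before relying on it):

```julia
# Sketch of l2row / l2grad, translated from the linked MATLAB
# supplementary code. The 1e-8 epsilon matches the MATLAB version.
function l2row(X)
    L2 = sqrt.(sum(X .^ 2, dims = 2) .+ 1e-8)  # per-row L2 norms
    Y = X ./ L2                                 # row-normalized matrix
    return Y, L2
end

# Backprop through Y = X ./ L2: given upstream gradient D, return dX.
function l2grad(X, Y, L2, D)
    return D ./ L2 .- Y .* (sum(D .* X, dims = 2) ./ L2 .^ 2)
end
```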
This is my first time using Optim.jl. The interface seems to require the
objective to be split into a cost function f and an in-place gradient
function g!, but the documentation also says I can get better performance
by supplying a third function, fg!, that computes both simultaneously,
via DifferentiableFunction(f, g!, fg!). So, as I understand it, I have to
split them up and then recombine them through fg! to get the better
performance. Is this correct?
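For concreteness, here is the shape of that split with a toy quadratic standing in for the sparse-filtering objective (my sketch; the Optim.jl calls are commented out and reflect the API version described above, and x0 is a placeholder):

```julia
# Toy quadratic f(x) = sum(x.^2), just to show the three callables.
f(x) = sum(abs2, x)            # cost only

function g!(x, storage)        # gradient only, written in place
    storage .= 2 .* x
    return nothing
end

function fg!(x, storage)       # cost and gradient in one pass
    storage .= 2 .* x
    return sum(abs2, x)
end

# With the Optim.jl version described in the question (untested here):
# using Optim
# d = DifferentiableFunction(f, g!, fg!)
# res = optimize(d, x0, method = :l_bfgs)
```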
Any suggestions on how to split this in a way that avoids duplicating
calculations? Should I hoist the shared calculations out somehow?
It feels like I'm missing some easy solution. Any advice would be
appreciated.
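One common pattern, independent of any particular Optim.jl version, is to keep the combined function and wrap it in a small "last-point cache": f, g!, and fg! all call the same updater, which only re-evaluates when asked about a new point. A hedged sketch (the helper name cached_fg and the toy objective in the usage below are mine, not Optim.jl API):

```julia
# Wrap a combined objective obj(x) -> (cost, grad) so that the split
# functions f / g! / fg! share one evaluation per point.
function cached_fg(obj)
    last_x = nothing       # point of the cached evaluation
    last_f = 0.0           # cached cost
    last_g = Float64[]     # cached gradient
    function update!(x)
        if last_x === nothing || x != last_x
            last_f, last_g = obj(x)    # evaluate once per new point
            last_x = copy(x)
        end
        return nothing
    end
    f(x) = (update!(x); last_f)
    g!(x, storage) = (update!(x); copyto!(storage, last_g); nothing)
    fg!(x, storage) = (update!(x); copyto!(storage, last_g); last_f)
    return f, g!, fg!
end
```

Calling f and then g! at the same point evaluates the objective once; the gradient call just copies the cached result. Check the argument order of g! against the Optim.jl version you are using before wiring this in.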
Gist of the as-yet-incomplete port:
https://gist.github.com/Andy-P/5c88e524d46a3749ba5f
Original MATLAB code:
http://cs.stanford.edu/~jngiam/papers/NgiamKohChenBhaskarNg2011_Supplementary.pdf