Github user njayaram2 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/272#discussion_r192246910
--- Diff: doc/design/modules/neural-network.tex ---
@@ -117,6 +117,24 @@ \subsubsection{Backpropagation}
\[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1}} \left( \delta_{k+1}^t \cdot
u_{k}^{jt} \right) \cdot \phi'(\mathit{net}_{k}^j)}\]
where $k = 1,...,N-1$, and $j = 1,...,n_{k}$.
+\paragraph{Momentum updates.}
+Momentum\cite{momentum_ilya}\cite{momentum_cs231n} can help accelerate learning and avoid local minima when using gradient descent. We also support Nesterov's accelerated gradient due to its look-ahead characteristics. \\
+Here we need to introduce two new variables, namely velocity and momentum. The momentum must be in the range 0 to 1, where 0 means no momentum. The momentum value is responsible for damping the velocity and is analogous to the coefficient of friction. \\
+In classical momentum you first correct the velocity and then step with that velocity, whereas in Nesterov momentum you first step in the velocity direction and then make a correction to the velocity vector based on the new location. \\
+Classical momentum update
+\[\begin{aligned}
+  \mathit{v} \set \mathit{mu} \cdot \mathit{v} - \eta \cdot \mathit{gradient} \text{ (velocity update)} \\
+  \mathit{u} \set \mathit{u} + \mathit{v} \\
+\end{aligned}\]
+
+Nesterov momentum update
+\[\begin{aligned}
+  \mathit{u} \set \mathit{u} + \mathit{mu} \cdot \mathit{v} \text{ (Nesterov's initial coefficient update)} \\
+  \mathit{v} \set \mathit{mu} \cdot \mathit{v} - \eta \cdot \mathit{gradient} \text{ (velocity update)} \\
+  \mathit{u} \set \mathit{u} - \eta \cdot \mathit{gradient} \\
--- End diff --
Can we left-align these and the other momentum-related equations?
---
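
For readers following the diff, here is a minimal NumPy sketch of the two update rules as written in the proposed equations. Variable names u, v, mu, and eta mirror the symbols above; the grad callable, the parameter values, and the choice to evaluate the Nesterov gradient at the look-ahead point are illustrative assumptions for this sketch, not MADlib's actual implementation.

import numpy as np

def classical_momentum_step(u, v, grad, mu=0.9, eta=0.01):
    """One classical momentum step: correct the velocity first,
    then step with that velocity (illustrative sketch only)."""
    v = mu * v - eta * grad(u)   # velocity update
    u = u + v                    # step with the damped velocity
    return u, v

def nesterov_momentum_step(u, v, grad, mu=0.9, eta=0.01):
    """One Nesterov step: step in the velocity direction first,
    then correct using the gradient at the new location."""
    u = u + mu * v               # Nesterov's initial coefficient update (look-ahead step)
    g = grad(u)                  # gradient at the look-ahead point (assumption)
    v = mu * v - eta * g         # velocity update
    u = u - eta * g              # correction based on the new location
    return u, v

# Toy usage: minimize f(u) = ||u||^2, whose gradient is 2u.
if __name__ == "__main__":
    grad = lambda u: 2.0 * u
    u, v = np.array([5.0, -3.0]), np.zeros(2)
    for _ in range(100):
        u, v = nesterov_momentum_step(u, v, grad)
    print(u)  # should move close to the minimum at the origin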