Github user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/272#discussion_r192248589

    --- Diff: doc/design/modules/neural-network.tex ---
    @@ -117,6 +117,24 @@ \subsubsection{Backpropagation}
     \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1}} \left( \delta_{k+1}^t \cdot u_{k}^{jt} \right) \cdot \phi'(\mathit{net}_{k}^j)}\]
     where $k = 1,...,N-1$, and $j = 1,...,n_{k}$.

    +\paragraph{Momentum updates.}
    +Momentum\cite{momentum_ilya}\cite{momentum_cs231n} can help accelerate learning and avoid local minima when using gradient descent. We also support Nesterov's accelerated gradient due to its look-ahead characteristics. \\
    +Here we introduce two new variables, namely velocity and momentum. Momentum must be in the range 0 to 1, where 0 means no momentum. The momentum value is responsible for damping the velocity and is analogous to the coefficient of friction. \\
    +In classical momentum we first correct the velocity and then take a step with that velocity, whereas in Nesterov momentum we first step in the velocity direction and then make a correction to the velocity vector based on the new location. \\
    +Classical momentum update:
    +\[\begin{aligned}
    +    \mathit{v} \set \mathit{mu} * \mathit{v} - \eta * \mathit{gradient} \text{ (velocity update)} \\ % $\eta * \nabla f(u)$
    +    \mathit{u} \set \mathit{u} + \mathit{v} \\
    +\end{aligned}\]
    +
    +Nesterov momentum update:
    +\[\begin{aligned}
    +    \mathit{u} \set \mathit{u} + \mathit{mu} * \mathit{v} \text{ (Nesterov's initial coefficient update)} \\
    +    \mathit{v} \set \mathit{mu} * \mathit{v} - \eta * \mathit{gradient} \text{ (velocity update)} \\ % $\eta * \nabla f(u)$
    --- End diff --

Instead of `gradient`, can we replace it with the actual math notation? I think the gradient at this line is: `\frac{\partial f}{\partial u_{k-1}^{sj}}`

The gradient in the next line should be w.r.t `v` instead of `u`.
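For anyone following along, here is a minimal NumPy sketch (not MADlib code) of the two update rules quoted above, with `u` as the coefficients, `v` as the velocity, `mu` as the momentum, and `eta` as the learning rate. The toy quadratic objective, the `gradient` helper, and the default `mu`/`eta` values are made-up placeholders; the Nesterov step follows the common cs231n formulation, so it may not match the exact sequence the design doc settles on.

```python
import numpy as np

# Toy least-squares objective f(u) = 0.5 * ||A u - b||^2 (placeholder example).
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)

def gradient(u):
    # Gradient of the toy objective with respect to the coefficients u.
    return A.T @ (A @ u - b)

def classical_momentum_step(u, v, mu=0.9, eta=0.01):
    # Classical momentum: correct the velocity first, then step with it.
    v = mu * v - eta * gradient(u)
    u = u + v
    return u, v

def nesterov_momentum_step(u, v, mu=0.9, eta=0.01):
    # Nesterov momentum: step in the velocity direction first ("look ahead"),
    # then correct the velocity using the gradient at the new location.
    u_ahead = u + mu * v
    v = mu * v - eta * gradient(u_ahead)
    u = u + v
    return u, v

# Usage: run both variants for a few iterations from the same starting point.
u_cm = u_nag = np.zeros(5)
v_cm = v_nag = np.zeros(5)
for _ in range(100):
    u_cm, v_cm = classical_momentum_step(u_cm, v_cm)
    u_nag, v_nag = nesterov_momentum_step(u_nag, v_nag)
print("classical momentum loss:", 0.5 * np.sum((A @ u_cm - b) ** 2))
print("nesterov momentum loss: ", 0.5 * np.sum((A @ u_nag - b) ** 2))
```

The only difference between the two variants is where the gradient is evaluated, which is exactly why spelling out the gradient notation (rather than writing `gradient`) would make the doc clearer.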