Hi

On Wed, Jun 6, 2012 at 10:38 AM, xinfan meng <mxf3...@gmail.com> wrote:

> Hi, all. I post this question to the list, since it might be related to
> the MLP being developed.
>
> I found two versions of the error function for output layer of MLP are
> used in the literature.
>
>
>    1. \delta_o = (y-a) f'(z)
>    http://ufldl.stanford.edu/wiki/index.php/Backpropagation_Algorithm
>    2. \delta_o = (y-a)  http://www.idsia.ch/NNcourse/backprop.html
>
> Given that they all use the same sigmoid activation function and loss
> function, how can the error function be different? Also note that the error
> functions will ultimately lead to different propagating errors in the
> hidden layers.
>
>
I just skimmed through them, and there are a few differences between those
two pages:

* \delta_o does not mean the same thing on the two pages. On the second
one, it is just the derivative of the error function.
* The second page does not use a sigmoid as the output function; look at
the examples on the next page and you will see that
y_o = a + f tanh(x) + g tanh(x). The derivative of this output with
respect to the weights is y, as can be seen in the matrix-form update
\Delta W = \delta_l y_{l-1}.

I hope this answers your question. Sometimes the computations can be made
simpler because the error function and the output function form natural
pairs; see
http://www.willamette.edu/~gorr/classes/cs449/classify.html
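To make the pairing concrete, here is a small finite-difference sketch (my own illustration, not from either page; values of z and y are arbitrary). With a sigmoid output and squared error, the output-layer delta dL/dz carries the extra factor f'(z) = a(1-a); with the cross-entropy loss that is the sigmoid's natural pair, the factor cancels and the delta reduces to (a - y). Note the sign convention here is dL/dz = (a - y), i.e. the negative of the (y - a) form quoted above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z, y = 0.7, 1.0          # arbitrary pre-activation and target, for illustration
a = sigmoid(z)
eps = 1e-6               # step for the central finite difference

# Squared error: L = 0.5*(a - y)^2  ->  dL/dz = (a - y) * f'(z)
L_sq = lambda z: 0.5 * (sigmoid(z) - y) ** 2
num_sq = (L_sq(z + eps) - L_sq(z - eps)) / (2 * eps)
print(abs(num_sq - (a - y) * a * (1 - a)) < 1e-8)   # delta keeps the f'(z) factor

# Cross-entropy: L = -y*log(a) - (1-y)*log(1-a)  ->  dL/dz = a - y
L_ce = lambda z: -y * np.log(sigmoid(z)) - (1 - y) * np.log(1 - sigmoid(z))
num_ce = (L_ce(z + eps) - L_ce(z - eps)) / (2 * eps)
print(abs(num_ce - (a - y)) < 1e-8)                 # f'(z) cancels; delta is just a - y
```

So both formulas are correct; they simply correspond to different loss functions paired with the sigmoid.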

David
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
