Hi Francois,
This took me a good while to figure out! I'll see if I can give a condensed
version of the way I did it, which I think is equivalent to the procedure in
the paper. Following the code at the back of the DBN paper is how I finally
figured it out. The way I see it, there are 3 separate parts (after
pre-training as normal):
- Wake phase: data-driven - updates the generative parameters
- Associative phase: transition from data to model - updates the RBM parameters
- Sleep phase: model-driven - updates the recognition parameters
Pre-requisite step: untie the recognition weights (the forward-prop / prop-up
weights) from the generative weights (prop-down), and leave the RBM weights tied.
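To make that concrete, here's roughly what I mean in plain NumPy. The names,
layer sizes and the random "pre-trained" weights are just made up for
illustration; in practice pretrained_W would come out of your layer-wise
pre-training:

import numpy as np

rng = np.random.RandomState(0)

# Hypothetical pre-trained weights for a 784-500-250 DBN, standing in for
# what layer-wise RBM pre-training would give you.
sizes = [784, 500, 250]
pretrained_W = [0.01 * rng.randn(n_in, n_out)
                for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# Untie: give each directed layer separate recognition (prop-up) and
# generative (prop-down) copies of its pre-trained weights, and keep the
# top-level RBM's weights as a single tied matrix.
W_rec = [W.copy() for W in pretrained_W[:-1]]    # bottom-up weights
W_gen = [W.T.copy() for W in pretrained_W[:-1]]  # top-down weights
W_rbm = pretrained_W[-1]                         # tied weights of the top RBM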
1. Wake:
- Propagate values up to the layer just before the RBM at the top
(the second-last, or penultimate, layer if there is only 1 RBM at the top).
- *Before* going into the RBM, propagate these values back down to
get a reconstruction of the input.
- Cost = difference between the input and the reconstruction.
- Update the generative parameters with the derivative of this cost
(see the code sketch after step 3 below).
2. Associative:
- Do the usual update of the RBM, using n-step contrastive divergence etc.
3. Sleep:
- Use the values output by the n-step contrastive divergence in the
associative step as the values of the penultimate layer (pen_RBM).
- Propagate these values down to the input layer.
- Propagate them back up again to calculate the reconstructed
penultimate layer (pen_reconstruction).
- Cost = difference between pen_RBM and pen_reconstruction.
- Update the recognition parameters with the derivative of this cost.
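In case it helps to see the three phases together, below is a very rough
NumPy sketch of one such up-down step for a toy DBN with one 784-500 directed
layer and a 500-250 RBM on top, loosely following the pseudo-code at the back
of the DBN paper. All the names (W_rec, W_gen, W_rbm, up_down_step, ...) are
mine for illustration only, and I've left out momentum, weight decay and some
of the bias details the paper includes:

import numpy as np

rng = np.random.RandomState(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Sample binary states from Bernoulli probabilities.
    return (rng.uniform(size=p.shape) < p).astype(p.dtype)

# Toy DBN: a 784-500 directed layer with a 500-250 RBM on top.
n_vis, n_pen, n_top = 784, 500, 250
W_rec = 0.01 * rng.randn(n_vis, n_pen)   # recognition (prop-up) weights
b_rec = np.zeros(n_pen)
W_gen = 0.01 * rng.randn(n_pen, n_vis)   # generative (prop-down) weights
b_gen = np.zeros(n_vis)
W_rbm = 0.01 * rng.randn(n_pen, n_top)   # tied weights of the top RBM
b_pen = np.zeros(n_pen)                  # RBM "visible" (penultimate) bias
b_top = np.zeros(n_top)                  # RBM hidden (top) bias
lr = 0.01

def up_down_step(data):
    # One contrastive wake-sleep (up-down) step on a mini-batch.
    global W_rec, b_rec, W_gen, b_gen, W_rbm, b_pen, b_top

    # 1. Wake: data-driven, updates the generative parameters.
    wake_pen = sample(sigmoid(data @ W_rec + b_rec))        # prop up
    recon_vis = sigmoid(wake_pen @ W_gen + b_gen)           # prop back down
    W_gen += lr * wake_pen.T @ (data - recon_vis) / len(data)
    b_gen += lr * (data - recon_vis).mean(axis=0)

    # 2. Associative: the usual CD-1 update on the top-level RBM.
    pos_top = sigmoid(wake_pen @ W_rbm + b_top)
    neg_pen = sample(sigmoid(sample(pos_top) @ W_rbm.T + b_pen))
    neg_top = sigmoid(neg_pen @ W_rbm + b_top)
    W_rbm += lr * (wake_pen.T @ pos_top - neg_pen.T @ neg_top) / len(data)
    b_pen += lr * (wake_pen - neg_pen).mean(axis=0)
    b_top += lr * (pos_top - neg_top).mean(axis=0)

    # 3. Sleep: model-driven, updates the recognition parameters.
    sleep_pen = neg_pen                                     # pen_RBM
    sleep_vis = sample(sigmoid(sleep_pen @ W_gen + b_gen))  # prop down
    pen_recon = sigmoid(sleep_vis @ W_rec + b_rec)          # pen_reconstruction
    W_rec += lr * sleep_vis.T @ (sleep_pen - pen_recon) / len(data)
    b_rec += lr * (sleep_pen - pen_recon).mean(axis=0)

# Example: one step on a random binary "mini-batch" of 20 examples.
batch = (rng.uniform(size=(20, n_vis)) < 0.1).astype(float)
up_down_step(batch)

In a Theano version you would express each of those weight deltas as updates
to shared variables inside a theano.function, but the flow of the three
phases is the same.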
That's my understanding anyway. Hope it helps!
All the Best,
Jim
On Monday, 13 February 2017 12:54:19 UTC, Francois Lasson wrote:
>
> Hello Jim,
>
> I'm currently working on the same problem using Theano.
> Have you implemented the contrastive wake-sleep algorithm with this library,
> and if so, could you give me some guidance?
>
> Many thanks,
> François
>
> On Thursday, 14 July 2016 11:32:38 UTC+2, Jim O' Donoghue wrote:
>>
>> So I'm going to reply to my own question in case it helps anyone else
>> out. I had another look at the paper; I had forgotten about the
>> contrastive wake-sleep algorithm. That's what's used to train the network
>> completely unsupervised.
>>
>> On Tuesday, 12 July 2016 15:40:48 UTC+1, Jim O' Donoghue wrote:
>>>
>>> Hi There,
>>>
>>> Just wondering how you would fine-tune a DBN for a completely
>>> unsupervised task, i.e. a practical implementation of "fine-tune all the
>>> parameters of this deep architecture with respect to a proxy for the DBN
>>> log-likelihood".
>>>
>>> Would this be something like, for example, a negative log-likelihood
>>> between the original input and the reconstruction of the data when
>>> propagated entirely up and down the network? What makes the final layer an
>>> RBM and the rest just normal directed layers? Or would the only way to do
>>> this be to completely unroll the network and fine-tune it like a deep
>>> autoencoder (as in "Reducing the Dimensionality of Data with Neural
>>> Networks")?
>>>
>>> Many thanks,
>>> Jim
>>>
>>>
>>>
>>>