On Fri, Dec 4, 2009 at 8:07 PM, prasenjit mukherjee <[email protected]> wrote:
> I am indeed learning via the CD technique, but using only a single
> layer, where the computation of neurons toggles between the same set of
> visible and hidden layers. Guess I was too optimistic and was expecting
> results from the first RBM itself.

Have you read Semantic Hashing (Salakhutdinov and Hinton)?
http://www.cs.utoronto.ca/%7Ehinton/absps/sh.pdf (pdf link)

It gives a good explanation of why the multi-layered approach is so
necessary. One layer of an RBM is not much more than a simple
old-fashioned single-layer neural net, which, from what I remember, has
never been that good at this kind of thing. Not only are multiple layers
needed, but fine-tuning by old-fashioned gradient descent after the CD
pretraining is also necessary.

  -jake

> I believe in stacked RBMs you repeat the same thing for more than one
> layer: the outputs of the current hidden layer get passed on as the
> visible layer to a new RBM with a new hidden layer, and you again run
> contrastive divergence in that new RBM. You could have a different
> number of hidden neurons at each layer. It makes sense to keep reducing
> the number of hidden neurons at each subsequent layer.
>
> Anyway, I will see if I can try multiple layers and whether that gives
> any improvement.
>
> -Thanks,
> Prasen
>
> On Fri, Dec 4, 2009 at 11:25 PM, Jake Mannix <[email protected]> wrote:
> > Prasen,
> >
> > I thought the whole point of the RBM approach to autoencoders /
> > dimensionality reduction was the stacked approach, since you don't
> > need to reach full convergence layer by layer, but instead do the
> > layer-by-layer "contrastive divergence" technique which Hinton
> > advocates, and then do fine-tuning at the end? I wouldn't imagine
> > you'd get very good relevance from a single layer.
> >
> >   -jake
> >
> > On Fri, Dec 4, 2009 at 8:37 AM, prasenjit mukherjee <[email protected]> wrote:
> >
> >> I did try it out on some sample data where my visible layer was
> >> linear and the hidden layer was stochastic binary. Using a
> >> single-layer RBM didn't give me great results. I guess I should try
> >> the stacked-RBM approach.
> >>
> >> BTW, has anybody used a single-layer RBM on a doc × term probability
> >> matrix (i.e. a continuous visible layer with values in 0-1) for
> >> collaborative filtering?
> >>
> >> -Prasen
> >>
> >> On Thu, Dec 3, 2009 at 12:40 AM, Olivier Grisel <[email protected]> wrote:
> >> > 2009/12/2 Jake Mannix <[email protected]>:
> >> >> Prasen,
> >> >>
> >> >> I was just talking about this on here last week. Yes, RBM-based
> >> >> clustering can be viewed as a nonlinear SVD. I'm pretty interested
> >> >> in your findings on this. Do you have any RBM code you care to
> >> >> contribute to Mahout?
> >> >
> >> > Hi,
> >> >
> >> > I have some C + Python code for stacking autoencoders, which shares
> >> > similar features with DBNs (stacked RBMs), here:
> >> > http://bitbucket.org/ogrisel/libsgd/wiki/Home
> >> >
> >> > This is still very much a work in progress; I will let you know
> >> > when I have easy-to-run sample demos.
> >> >
> >> > However, this algorithm is not trivially map-reducible, but I plan
> >> > to investigate that in the coming weeks. It would be nice to have a
> >> > pure JVM version too. I am also planning to play with Clojure +
> >> > Incanter (with the parallelcolt library as a backend for linear
> >> > algebra) to make it easier to work with Hadoop.
> >> >
> >> > --
> >> > Olivier
> >> > http://twitter.com/ogrisel - http://code.oliviergrisel.name
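For reference, here is a minimal numpy sketch of the single-layer CD-1
training step the thread discusses, assuming a linear (Gaussian,
unit-variance) visible layer and stochastic binary hidden units, as in
Prasen's setup. All names and hyperparameters are illustrative; this is
not the Mahout or libsgd API.

    import numpy as np

    rng = np.random.default_rng(42)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_step(v0, W, b_vis, b_hid, lr=0.01):
        """One contrastive-divergence (CD-1) update on a mini-batch v0."""
        # Positive phase: sample binary hidden states driven by the data.
        p_h0 = sigmoid(v0 @ W + b_hid)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(v0.dtype)
        # Down pass: linear (Gaussian) visible units reconstruct as
        # their mean activation, with no sampling noise added.
        v1 = h0 @ W.T + b_vis
        # Negative phase: hidden probabilities from the reconstruction.
        p_h1 = sigmoid(v1 @ W + b_hid)
        # CD-1 gradient estimate: <v h>_data - <v h>_reconstruction.
        n = v0.shape[0]
        W += lr * (v0.T @ p_h0 - v1.T @ p_h1) / n
        b_vis += lr * (v0 - v1).mean(axis=0)
        b_hid += lr * (p_h0 - p_h1).mean(axis=0)
        return W, b_vis, b_hid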
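And a sketch of the greedy layer-by-layer stacking Prasen describes,
building on cd1_step, sigmoid, and rng from the snippet above. The layer
sizes and epoch counts are placeholder choices, and the backprop
fine-tuning step Jake mentions is only noted in a comment, not
implemented.

    def train_rbm(x, n_hidden, epochs=50, lr=0.01):
        """Train one RBM on x with full-batch CD-1 (toy settings)."""
        n_visible = x.shape[1]
        W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        b_vis = np.zeros(n_visible)
        b_hid = np.zeros(n_hidden)
        for _ in range(epochs):
            W, b_vis, b_hid = cd1_step(x, W, b_vis, b_hid, lr)
        return W, b_hid

    def pretrain_stack(data, layer_sizes=(500, 250, 64)):
        """Greedy pretraining: each layer's mean hidden activations
        become the visible data for the next, smaller RBM."""
        params, x = [], data
        for n_hidden in layer_sizes:
            W, b_hid = train_rbm(x, n_hidden)
            params.append((W, b_hid))
            # Deterministic up-pass feeds the next layer.
            x = sigmoid(x @ W + b_hid)
        return params

    # Hinton's full recipe then unrolls the stack into a deep
    # autoencoder and fine-tunes all weights with ordinary gradient
    # descent (backprop on reconstruction error), which this omits.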
