Thanks, this is helpful. I have only seen RBMs used for pretraining in supervised prediction, and I was wondering why they are not used for density estimation, especially since there is an algorithm (CD, contrastive divergence) that trains them in a reasonable amount of time. Hierarchical Bayes might be a better choice for a model with as many parameters as an RBM, but it would also involve MCMC iterations. Any thoughts?
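To make the density-estimation question concrete, here is a minimal sketch of how one might probe it with the existing sklearn class. It assumes toy binary data that I made up for illustration, and the hyperparameters are arbitrary. Note that BernoulliRBM is fit with persistent contrastive divergence (SML) rather than plain CD, and that score_samples returns a stochastic pseudo-likelihood proxy, not a normalized log-density, so it is not directly comparable to a mixture model's log-likelihood.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.RandomState(0)

# Toy binary data (an assumption for illustration): two prototype
# patterns over 4 visible units, with 5% of the bits flipped at random.
prototypes = np.array([[1, 1, 0, 0], [0, 0, 1, 1]])
X = prototypes[rng.randint(0, 2, size=500)]
flip = rng.rand(*X.shape) < 0.05
X = np.where(flip, 1 - X, X).astype(float)

# Binary-binary RBM; these hyperparameters are arbitrary, not tuned.
rbm = BernoulliRBM(n_components=4, learning_rate=0.05, n_iter=50,
                   random_state=0)
rbm.fit(X)

# Per-sample pseudo-likelihood (a proxy for log-likelihood, always <= 0).
pl = rbm.score_samples(X)
mean_pl = pl.mean()
print("mean pseudo-likelihood:", mean_pl)
```

Comparing mean_pl between differently sized RBMs is reasonable as a rough model-selection signal; comparing it against a GMM would need a proper likelihood estimate (e.g. via annealed importance sampling), which sklearn does not provide for RBMs as far as I know.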
On Mon, Jul 27, 2015 at 6:18 AM, Kyle Kastner <kastnerk...@gmail.com> wrote:

> RBMs are a factorization of a generally intractable problem - as you
> mention it is still O(n**2) but much better than the combinatorial brute
> force thing that the RBM factorization replaces. There might be faster RBM
> algorithms around but I don't know of any faster implementations that don't
> use GPU code. There might be specific RBMs for sparse data, but in general
> RBMs are designed for latent factor discovery in dense, low-ish dimensional
> (1000 - 10000 features) input data.
>
> The current sklearn code for RBMs is just binary-binary, as you mention.
> The Gaussian version (both binary-Gaussian and Gaussian-Gaussian) exists
> but is not implemented in the library. I have personally had a harder time
> training real-valued latent variable models, compared to binarized versions
> - if you can "binarize" your problem it is worth trying that as a first
> shot.
>
> One hack I have tried on other tasks is to use KMeans clustering to get
> binary codes (by mapping data points to the nearest cluster, then
> representing this with one-hot / LabelBinarizer format). Then the RBM will
> give cluster indices, which can be mapped back to cluster centers or made
> into "stochastic" units by sampling from a Gaussian / RBF centered at the
> cluster center, with some fixed variance you choose. This is kind of weird
> but worked as well as could be expected for my task.
>
> RBMs are still used to form latent variable models such as the RNN-RBM for
> timeseries modeling (here http://deeplearning.net/tutorial/rnnrbm.html),
> or in the spike and slab RBM / DBN for texture modeling (here
> https://ift6266h15.files.wordpress.com/2015/04/20_vae.pdf).
>
> On Mon, Jul 27, 2015 at 5:17 AM, Mika S <siddhupi...@gmail.com> wrote:
>
>> I am using scikit-learn's RBM implementation. There are two problems:
>>
>> 1. The running time is O(d^2), where d is the number of features. This
>>    becomes a problem when using high-dimensional sparse features.
>>    Consider features that come from feature hashing, for instance.
>> 2. It only allows binary visible features. Do I have to change the
>>    sklearn code to have non-binary units, or is there some trick that I
>>    am unaware of?
>>
>> I am expecting RBMs with 4 features to have a better fit than a mixture
>> of 2 Gaussians (that has a similar number of parameters). Has anyone seen
>> any experiments done on RBMs for unsupervised modeling other than
>> pretraining?
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
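
The KMeans hack described in the quoted reply could be sketched roughly as follows. This is only an illustration under assumptions of mine: the toy blob data, n_clusters, sigma, and the RBM hyperparameters are all hypothetical, and decoding the cluster index by taking the argmax of the visible-unit probabilities is my own choice, not something stated in the thread.

```python
import numpy as np
from scipy.special import expit
from sklearn.cluster import KMeans
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import LabelBinarizer

rng = np.random.RandomState(0)

# Toy real-valued data (illustrative only): three Gaussian blobs in 2-D.
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
X = np.vstack([c + rng.randn(200, 2) for c in true_centers])

# Step 1: map each point to its nearest cluster, then one-hot encode it.
n_clusters = 8  # hypothetical choice
km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
lb = LabelBinarizer().fit(np.arange(n_clusters))
codes = lb.transform(km.labels_).astype(float)  # (n_samples, n_clusters)

# Step 2: train the binary-binary RBM on the one-hot codes.
rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=30,
                   random_state=0)
rbm.fit(codes)

# Step 3: one up-down pass. Sample hidden units from P(h=1|v), compute
# P(v=1|h) from the fitted weights, and decode a cluster index by argmax
# (the argmax decode is an assumption, since the RBM's visible layer is
# not constrained to be one-hot).
h_prob = rbm.transform(codes[:10])
h = (rng.rand(*h_prob.shape) < h_prob).astype(float)
v_prob = expit(h @ rbm.components_ + rbm.intercept_visible_)
idx = v_prob.argmax(axis=1)

# Step 4: map indices back to cluster centers, made "stochastic" by adding
# Gaussian noise with a fixed, hand-chosen standard deviation.
sigma = 0.5  # hypothetical choice
samples = km.cluster_centers_[idx] + sigma * rng.randn(len(idx), X.shape[1])
print(samples.shape)
```

A softmax visible layer would be the more principled way to model one-hot codes, but BernoulliRBM does not offer one, hence the argmax workaround here.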