Have a look at Russ Salakhutdinov's thesis for work on density modelling.
The problem is that computing the partition function exactly is
intractable for all but very small models, so in practice you only get
unnormalized densities.
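
To make the partition function point concrete, here is a minimal NumPy
sketch (not sklearn code) for a toy binary-binary RBM. The weights are
random placeholders and the function names are made up for this
example. With 4 visible units the sum over 2**4 states is trivial; the
same enumeration is hopeless for thousands of features.

```python
import itertools
import numpy as np


def free_energy(v, W, b_vis, b_hid):
    """Free energy F(v) of a binary-binary RBM; p(v) is proportional to exp(-F(v)).

    W has shape (n_hidden, n_visible); v has shape (..., n_visible).
    """
    hidden_input = v @ W.T + b_hid
    # log(1 + exp(x)) computed stably via logaddexp(0, x)
    return -(v @ b_vis) - np.logaddexp(0.0, hidden_input).sum(axis=-1)


def log_partition_brute_force(W, b_vis, b_hid):
    """Exact log Z by enumerating all 2**n_visible visible states."""
    n_visible = W.shape[1]
    states = np.array(list(itertools.product([0.0, 1.0], repeat=n_visible)))
    neg_f = -free_energy(states, W, b_vis, b_hid)
    m = neg_f.max()  # log-sum-exp trick for numerical stability
    return m + np.log(np.exp(neg_f - m).sum())


# Toy model with random (untrained) parameters, for illustration only.
rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
W = rng.normal(scale=0.5, size=(n_hidden, n_visible))
b_vis = rng.normal(scale=0.1, size=n_visible)
b_hid = rng.normal(scale=0.1, size=n_hidden)

log_z = log_partition_brute_force(W, b_vis, b_hid)

# Normalized log-probabilities of every visible state.
states = np.array(list(itertools.product([0.0, 1.0], repeat=n_visible)))
log_p = -free_energy(states, W, b_vis, b_hid) - log_z
total_prob = np.exp(log_p).sum()
```

Without log_z you only have -free_energy(v), which ranks
configurations but is not a normalized log-density.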
On 07/27/2015 12:49 PM, Mika S wrote:
Thanks, this is helpful.
I have only seen RBMs used for pretraining ahead of supervised
prediction, and was wondering why they are not used for density
estimation, especially since there is an algorithm (CD) to train them
in a reasonable amount of time. Hierarchical Bayes might be a better
choice for models with as many parameters as an RBM, but it would also
involve MCMC iterations. Any thoughts?
On Mon, Jul 27, 2015 at 6:18 AM, Kyle Kastner <kastnerk...@gmail.com> wrote:
RBMs are a factorization of a generally intractable problem. As
you mention, the cost is still O(n**2), but that is much better
than the combinatorial brute-force computation that the RBM
factorization replaces. There might be faster RBM algorithms
around, but I don't know of any faster implementations that don't
use GPU code. There might be specific RBMs for sparse data, but in
general RBMs are designed for latent factor discovery in dense,
low-ish dimensional (1000 - 10000 features) input data.
The current sklearn code for RBMs is just binary-binary, as you
mention. The Gaussian versions (both binary-Gaussian and
Gaussian-Gaussian) exist but are not implemented in the library. I
have personally had a harder time training real-valued latent
variable models than binarized versions. If you can "binarize"
your problem, it is worth trying that as a first shot.
One hack I have tried on other tasks is to use KMeans clustering
to get binary codes (by mapping data points to the nearest
cluster, then representing this with one-hot / LabelBinarizer
format). Then the RBM will give cluster indices, which can be
mapped back to cluster centers or made into "stochastic" units by
sampling from a Gaussian / RBF centered at the cluster center,
with some fixed variance you choose. This is kind of weird but
worked as well as could be expected for my task.
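
A rough sketch of the KMeans hack described above, using sklearn's
KMeans and BernoulliRBM. The synthetic data, the number of clusters,
the RBM hyperparameters and the variance `sigma` are all arbitrary
choices for illustration, and the mean-field reconstruction step is
just one way of getting a cluster index back out of the RBM, not
necessarily what was used on the original task.

```python
import numpy as np
from scipy.special import expit
from sklearn.cluster import KMeans
from sklearn.neural_network import BernoulliRBM

rng = np.random.RandomState(0)

# Synthetic real-valued data: two blobs in 4 dimensions (placeholder data).
X = np.vstack([
    rng.normal(loc=-2.0, size=(100, 4)),
    rng.normal(loc=2.0, size=(100, 4)),
])

# 1. Map each point to its nearest cluster and one-hot encode the index.
n_clusters = 8
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
labels = kmeans.predict(X)
codes = np.eye(n_clusters)[labels]  # shape (n_samples, n_clusters), binary

# 2. Train a binary-binary RBM on the one-hot codes.
rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20,
                   random_state=0)
rbm.fit(codes)

# 3. Mean-field reconstruction of the visible units for a few points:
#    P(h=1|v) from transform(), then P(v=1|h) from the learned weights.
p_hidden = rbm.transform(codes[:5])
p_visible = expit(p_hidden @ rbm.components_ + rbm.intercept_visible_)

# 4. Take the most probable visible unit as the cluster index and map it
#    back to a cluster center.
cluster_idx = p_visible.argmax(axis=1)
centers = kmeans.cluster_centers_[cluster_idx]

# 5. Optional "stochastic" output: sample from an isotropic Gaussian
#    around the center with a fixed, hand-picked standard deviation.
sigma = 0.5
recon = centers + sigma * rng.normal(size=centers.shape)
```

The number of clusters sets the resolution of the binary code, so it
trades reconstruction detail against the size of the RBM's visible
layer.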
RBMs are still used to form latent variable models such as the
RNN-RBM for timeseries modeling (here
http://deeplearning.net/tutorial/rnnrbm.html), or in the spike and
slab RBM / DBN for texture modeling (here
https://ift6266h15.files.wordpress.com/2015/04/20_vae.pdf).
On Mon, Jul 27, 2015 at 5:17 AM, Mika S <siddhupi...@gmail.com> wrote:
I am using scikit-learn's RBM implementation. There are two
problems:
1. The running time is O(d^2), where d is the number of
   features. This becomes a problem when using high-dimensional
   sparse features, such as those that come from feature hashing.
2. It only allows binary visible features. Do I have to
   change the sklearn code to get non-binary units, or is there
   some trick that I am unaware of?
I am expecting RBMs with 4 features to have a better fit than
a mixture of 2 Gaussians (which has a similar number of
parameters). Has anyone seen any experiments done on RBMs for
unsupervised modeling other than pretraining?
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general