Have a look at Russ Salakhutdinov's thesis for work on density modelling.
The problem is that computing the partition function exactly is intractable, and therefore you can only get unnormalized densities.
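
To make the partition function issue concrete, here is a minimal sketch in plain NumPy, not taken from the thesis or from scikit-learn. It assumes a binary-binary RBM with hypothetical random weights, stored as (n_hidden, n_visible) to mirror the layout of sklearn's `components_`. The unnormalized log-density is one cheap free-energy evaluation, while the exact normalizer needs a sum over all 2**d visible states, which is only feasible for toy sizes.

```python
import itertools
import numpy as np


def free_energy(v, W, b_vis, b_hid):
    """Free energy F(v) of a binary-binary RBM; p(v) is proportional to exp(-F(v))."""
    # np.logaddexp(0, x) is a numerically stable log(1 + exp(x)).
    return -v @ b_vis - np.logaddexp(0, v @ W.T + b_hid).sum(axis=-1)


def log_partition_brute_force(W, b_vis, b_hid):
    """Exact log Z by enumerating all 2**d visible states (toy sizes only)."""
    d = W.shape[1]
    states = np.array(list(itertools.product([0, 1], repeat=d)), dtype=float)
    neg_f = -free_energy(states, W, b_vis, b_hid)
    m = neg_f.max()  # log-sum-exp trick for numerical stability
    return m + np.log(np.exp(neg_f - m).sum())


# Hypothetical random parameters, for illustration only.
rng = np.random.RandomState(0)
n_hidden, n_visible = 4, 10
W = rng.normal(scale=0.5, size=(n_hidden, n_visible))
b_vis = rng.normal(size=n_visible)
b_hid = rng.normal(size=n_hidden)

v = rng.randint(0, 2, size=n_visible).astype(float)
unnormalized_log_p = -free_energy(v, W, b_vis, b_hid)  # cheap: one pass over W
log_z = log_partition_brute_force(W, b_vis, b_hid)     # 2**10 terms here
log_p = unnormalized_log_p - log_z                     # normalized log-density
```

With 10 visible units the sum has 1024 terms; at 1000 visible units it would have 2**1000, which is why only the unnormalized quantity is available in practice.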

On 07/27/2015 12:49 PM, Mika S wrote:
Thanks, this is helpful.

I have only seen RBMs used for pretraining ahead of supervised prediction, and was wondering why they are not used for density estimation, especially since there is an algorithm (CD) that trains them in a reasonable amount of time. Hierarchical Bayes might be a better choice for a model with as many parameters as an RBM, but it would also involve MCMC iterations. Any thoughts?

On Mon, Jul 27, 2015 at 6:18 AM, Kyle Kastner <kastnerk...@gmail.com> wrote:

    RBMs are a factorization of a generally intractable problem - as
    you mention, it is still O(n**2), but that is much better than the
    combinatorial brute-force computation that the RBM factorization
    replaces. There might be faster RBM algorithms around, but I don't
    know of any faster implementations that don't use GPU code. There
    might be specific RBMs for sparse data, but in general RBMs are
    designed for latent factor discovery in dense, low-ish dimensional
    (1000 - 10000 features) input data.

    The current sklearn code for RBMs is just binary-binary, as you
    mention. The Gaussian version (both binary-Gaussian and
    Gaussian-Gaussian) exists in the literature but is not implemented
    in the library. I have personally had a harder time training
    real-valued latent variable models, compared to binarized
    versions - if you can "binarize" your problem, it is worth trying
    that as a first shot.
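
One possible way to "binarize" real-valued features before handing them to sklearn's `BernoulliRBM` is sketched below. The per-feature median threshold, the synthetic data, and the hyperparameter values are all assumptions made for illustration, not something the thread prescribes.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import Binarizer

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 20))  # synthetic real-valued data, illustration only

# Binarizer takes a single scalar threshold, so centre every column on its
# own median first; thresholding at 0 then splits each feature roughly in half.
X_centered = X - np.median(X, axis=0)
X_bin = Binarizer(threshold=0.0).fit_transform(X_centered)

# Hyperparameters are arbitrary placeholders, not tuned values.
rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=10,
                   random_state=0)
H = rbm.fit_transform(X_bin)  # hidden activation probabilities, shape (200, 8)
```

A median threshold discards most of the magnitude information; whether that is acceptable depends on the task.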

    One hack I have tried on other tasks is to use KMeans clustering
    to get binary codes (by mapping data points to the nearest
    cluster, then representing this with one-hot / LabelBinarizer
    format). Then the RBM will give cluster indices, which can be
    mapped back to cluster centers or made into "stochastic" units by
    sampling from a Gaussian / RBF centered at the cluster center,
    with some fixed variance you choose. This is kind of weird but
    worked as well as could be expected for my task.
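
The KMeans hack described above might look roughly like the following sketch. The synthetic data, the cluster count, the fixed `sigma`, and the mean-field reconstruction used to read a cluster index back out of the RBM are all assumptions for illustration; the original post does not say how the index was obtained.

```python
import numpy as np
from scipy.special import expit
from sklearn.cluster import KMeans
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import LabelBinarizer

rng = np.random.RandomState(0)
X = rng.normal(size=(300, 5))  # synthetic real-valued data, illustration only

# 1. Map each point to its nearest cluster and one-hot encode the index.
n_clusters = 16
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
binarizer = LabelBinarizer().fit(np.arange(n_clusters))
codes = binarizer.transform(kmeans.labels_)  # shape (300, 16), one 1 per row

# 2. Train a binary-binary RBM on the one-hot codes.
rbm = BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20,
                   random_state=0).fit(codes)

# 3. Mean-field reconstruction: hidden probabilities, then visible
#    probabilities, then the most probable cluster index per row.
h = rbm.transform(codes[:5])
p_v = expit(h @ rbm.components_ + rbm.intercept_visible_)
idx = p_v.argmax(axis=1)

# 4. Map indices back to cluster centers, or make them "stochastic" by
#    sampling from a Gaussian around each center with a fixed variance.
centers = kmeans.cluster_centers_[idx]
sigma = 0.1  # hand-picked, as in the description above
samples = centers + sigma * rng.normal(size=centers.shape)
```

Taking the argmax forces a single cluster per row; sampling the visible units instead (for example with `rbm.gibbs`) would not guarantee a one-hot result.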

    RBMs are still used to form latent variable models such as the
    RNN-RBM for timeseries modeling (here
    http://deeplearning.net/tutorial/rnnrbm.html), or in the spike and
    slab RBM / DBN for texture modeling (here
    https://ift6266h15.files.wordpress.com/2015/04/20_vae.pdf).





    On Mon, Jul 27, 2015 at 5:17 AM, Mika S <siddhupi...@gmail.com> wrote:

        I am using scikit-learn's RBM implementation. There are two
        problems:

        1.  The running time is O(d^2), where d is the number of
            features. This becomes a problem when using
            high-dimensional sparse features. Consider features that
            come from feature hashing, for instance.

        2.  It only allows for binary visible features. Do I have to
            change the sklearn code to have non-binary units, or is
            there some trick that I am unaware of?

        I am expecting an RBM with 4 features to fit better than a
        mixture of 2 Gaussians (which has a similar number of
        parameters). Has anyone seen any experiments done on RBMs for
        unsupervised modeling other than pretraining?



        
------------------------------------------------------------------------------

        _______________________________________________
        Scikit-learn-general mailing list
        Scikit-learn-general@lists.sourceforge.net
        <mailto:Scikit-learn-general@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



    