RBMs are still used for density estimation - it is just that, in my
experience, they are limited in what they can model. If you are interested
in density modeling, you should look into VAEs (linked right after the
texture modeling reference in my earlier message) - they are pretty cool
and seem to do density estimation pretty well.

Another trick you can try is to use some kind of tree model for
binarization - see @ogrisel's example here:
http://nbviewer.ipython.org/github/ogrisel/notebooks/blob/master/sklearn_demos/Income%20classification.ipynb
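
A minimal sketch of the tree binarization idea, not taken from the notebook
above. It assumes an unsupervised RandomTreesEmbedding as the tree model and
made-up toy data; the tree count, depth and RBM settings are arbitrary
choices for illustration. Each sample is encoded by the leaf it lands in for
every tree, which gives sparse binary features that BernoulliRBM can take
directly.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.neural_network import BernoulliRBM

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 4))  # toy real-valued data (assumption)

# Each tree maps a sample to one leaf; the leaves are one-hot encoded,
# so every row has exactly n_estimators ones.
embedder = RandomTreesEmbedding(n_estimators=5, max_depth=3, random_state=0)
X_binary = embedder.fit_transform(X)  # sparse matrix of 0/1 leaf indicators

# Fit a binary-binary RBM on the binarized features.
rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=10,
                   random_state=0)
hidden = rbm.fit_transform(X_binary)  # P(h=1 | v) for each sample
```

A supervised tree ensemble would work the same way if labels are available;
the unsupervised embedding is used here only because the question is about
density modeling.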

On Mon, Jul 27, 2015 at 1:49 PM, Mika S <siddhupi...@gmail.com> wrote:

> Thanks, this is helpful.
>
> I have only seen RBMs used in pretraining for supervised prediction and was
> wondering why they are not used for density estimation, especially since
> there is an algorithm (contrastive divergence, CD) that trains them in a
> reasonable amount of time. Hierarchical Bayes might be a better choice for
> models with a large number of parameters, like RBMs, but it would also
> involve MCMC iterations. Any thoughts?
>
> On Mon, Jul 27, 2015 at 6:18 AM, Kyle Kastner <kastnerk...@gmail.com>
> wrote:
>
>> RBMs are a factorization of a generally intractable problem - as you
>> mention, it is still O(n**2), but that is much better than the combinatorial
>> brute-force computation that the RBM factorization replaces. There might be
>> faster RBM algorithms around, but I don't know of any faster
>> implementations that don't use GPU code. There might be specific RBMs for
>> sparse data, but in general RBMs are designed for latent factor discovery
>> in dense, low-ish dimensional (1000 - 10000 features) input data.
>>
>> The current sklearn code for RBMs is just binary-binary, as you mention.
>> Gaussian versions (both binary-Gaussian and Gaussian-Gaussian) exist
>> but are not implemented in the library. I have personally had a harder time
>> training real-valued latent variable models than binarized versions -
>> if you can "binarize" your problem, it is worth trying that as a first
>> shot.
>>
>> One hack I have tried on other tasks is to use KMeans clustering to get
>> binary codes (by mapping data points to the nearest cluster, then
>> representing this with one-hot / LabelBinarizer format). Then the RBM will
>> give cluster indices, which can be mapped back to cluster centers or made
>> into "stochastic" units by sampling from a Gaussian / RBF centered at the
>> cluster center, with some fixed variance you choose. This is kind of weird
>> but worked as well as could be expected for my task.
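
A rough sketch of the KMeans hack described above, under several assumptions:
the toy data, the number of clusters, the RBM settings and the fixed sigma
are all made up, and the helper sample_cluster_indices is hypothetical (it
decodes the visible units to a single index with an argmax, which is one
possible choice, not necessarily what was done originally).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import LabelBinarizer

rng = np.random.RandomState(0)
# Toy real-valued data: three blobs in 2D (assumption).
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2))
               for c in (-2.0, 0.0, 2.0)])

# 1. Map each point to its nearest cluster and one-hot encode the index.
n_clusters = 8
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
binarizer = LabelBinarizer().fit(np.arange(n_clusters))
codes = binarizer.transform(kmeans.predict(X)).astype(float)

# 2. Train a binary-binary RBM on the one-hot codes.
rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20,
                   random_state=0).fit(codes)


def sample_cluster_indices(rbm, v, rng):
    """Hypothetical helper: sample hiddens given v, then pick the most
    probable visible unit as the cluster index."""
    p_h = rbm.transform(v)  # P(h=1 | v)
    h = (rng.uniform(size=p_h.shape) < p_h).astype(float)
    logits = h @ rbm.components_ + rbm.intercept_visible_
    return logits.argmax(axis=1)


# 3. Map indices back to cluster centers, then make them "stochastic" by
#    adding Gaussian noise with a fixed, hand-picked standard deviation.
idx = sample_cluster_indices(rbm, codes[:10], rng)
centers = kmeans.cluster_centers_[idx]
sigma = 0.3
samples = centers + sigma * rng.normal(size=centers.shape)
```

The argmax keeps each decoded sample tied to exactly one cluster; sampling
the visible units independently (as rbm.gibbs does) can turn on zero or
several units at once, so it does not give a single index directly.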
>>
>> RBMs are still used to form latent variable models such as the RNN-RBM
>> for timeseries modeling (here
>> http://deeplearning.net/tutorial/rnnrbm.html), or in the spike and slab
>> RBM / DBN for texture modeling (here
>> https://ift6266h15.files.wordpress.com/2015/04/20_vae.pdf).
>>
>> On Mon, Jul 27, 2015 at 5:17 AM, Mika S <siddhupi...@gmail.com> wrote:
>>
>>> I am using scikit-learn's RBM implementation. There are two problems:
>>>
>>>    1. The running time is O(d^2), where d is the number of features. This
>>>       becomes a problem when using high-dimensional sparse features.
>>>       Consider features that come from feature hashing, for instance.
>>>    2. It only allows binary visible features. Do I have to change the
>>>       sklearn code to have non-binary units, or is there some trick that
>>>       I am unaware of?
>>>
>>> I am expecting an RBM with 4 features to give a better fit than a mixture
>>> of 2 Gaussians (which has a similar number of parameters). Has anyone seen
>>> any experiments done on RBMs for unsupervised modeling other than
>>> pretraining?
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
