Dear all,

I published a new post
<http://xuewei4d.github.io/2015/06/05/gsoc-week2-vbgmm-and-gmm-api.html> on
my blog for the second week. In short, I have finished the derivation of VBGMM
and am now working on cleaning up the GMM API.

Thanks,
Wei

On Thu, May 28, 2015 at 12:12 PM, Andreas Mueller <t3k...@gmail.com> wrote:

>  Hi Wei Xue.
>
> I think 1) sounds like a good idea.
> For 3) I think we should deprecate params. Deprecating doesn't mean
> changing users' behavior. It means giving them time to adjust.
>
> For 4) I am unsure.
>
> The bottom of the user guide here:
> http://scikit-learn.org/dev/modules/mixture.html
> has a link to the derivation here:
> http://scikit-learn.org/dev/modules/dp-derivation.html
>
> Cheers,
> Andy
>
>
>
> On 05/27/2015 07:08 PM, Wei Xue wrote:
>
>  Hi Olivier, Loïc, Andreas and group,
>
>  I have been thinking over the API convention for GMM. The discussions on
> issues #2473 <https://github.com/scikit-learn/scikit-learn/issues/2473> and
> #4062 <https://github.com/scikit-learn/scikit-learn/issues/4062> point
> out the inconsistencies in ``score_sample`` and ``score``. So I drafted a
> new API for some of the functions in this IPython notebook
> <http://nbviewer.ipython.org/gist/xuewei4d/de5492d0320eed561b78/GMM_API.ipynb?flush_cache=true>.
> In summary,
>
>  1) create a density mixin class, which contains ``score`` and
> ``density``,
>
>  2) make ``score_sample`` return only the log probability of each data
> instance,
>
>  3) I am not sure whether we should deprecate ``params='wmc'``. @Andreas
> pointed out that ``params`` could cause strange GMM estimates, but it is not
> good to change users' behavior.
>
>  4) Rename GMM, VBGMM and DPGMM to GaussianMixture, VBGaussianMixture,
> and DPGaussianMixture? (DirichletProcessGaussianMixture is quite lengthy.)
> Any comments? Would you prefer to discuss this in a GitHub issue or here?
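
Points 1) and 2) above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class names DensityMixin and ToyGaussian1D are hypothetical, and having score return the mean (rather than the sum) of the per-sample log probabilities is an assumption made here, not something taken from the notebook.

```python
import math


class DensityMixin:
    """Hypothetical mixin: derives score and density from score_sample."""

    def score(self, X):
        # Assumed convention: average log probability over the samples.
        log_probs = self.score_sample(X)
        return sum(log_probs) / len(log_probs)

    def density(self, X):
        # Probability density of each sample, obtained by exponentiating
        # the per-sample log probabilities.
        return [math.exp(lp) for lp in self.score_sample(X)]


class ToyGaussian1D(DensityMixin):
    """Toy one-dimensional Gaussian, used only to exercise the mixin."""

    def __init__(self, mean=0.0, var=1.0):
        self.mean = mean
        self.var = var

    def score_sample(self, X):
        # Returns only the log probability of each data instance,
        # as proposed in point 2).
        return [
            -0.5 * (math.log(2.0 * math.pi * self.var)
                    + (x - self.mean) ** 2 / self.var)
            for x in X
        ]


model = ToyGaussian1D(mean=0.0, var=1.0)
log_probs = model.score_sample([0.0, 1.0])
densities = model.density([0.0, 1.0])
average_log_prob = model.score([0.0, 1.0])
```

With this layout an estimator only has to implement score_sample, and score and density stay consistent with it by construction.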
>
>  I don't quite understand how the current implementation of DPGMM and
> VBGMM works, and I couldn't find any documentation on the current
> implementation of DPGMM at all. But I have been working on the derivation of
> VBGMM for a while and have written 4 PDF pages full of equations. I expect
> about 10 pages for all four kinds of covariance matrix. Once I finish, I will
> upload it to my blog.
>
>
>  Thanks,
> Wei Xue
>
>
>
> On Tue, May 19, 2015 at 11:07 AM, Andreas Mueller <t3k...@gmail.com>
> wrote:
>
>>  Hey Wei Xue.
>> Thanks for posting the blog post!
>> I think you are right, for diag and tied you can just use gamma
>> distributions, which makes everything easier.
>> Olivier and Loic, it would be great if you could find the time to comment on
>> the blog post and future direction!
>>
>> Thanks!
>> Andy
>>
>>
>> On 05/18/2015 04:04 PM, Wei Xue wrote:
>>
>>   Dear Olivier, Loic and group,
>>
>>  I feel very excited to be selected as a GSoC student this year. Thank
>> you very much.
>>
>>  Following the timeline in my proposal, I have published the first post
>> <http://xuewei4d.github.io/gsoc/2015/05/08/gsoc-prelude.html>
>> introducing this project, i.e., 'Improve GMM module'.
>>
>>  My first step is to derive the update equations for VBGMM for the four
>> types of covariance matrix, namely sphere, diag, tied, and full. Following
>> the variational inference chapter of PRML (chapter 10), I have verified the
>> update equations 10.60-10.67 using a Gaussian-Wishart distribution as the
>> approximating distribution. The derivation involving the Wishart
>> distribution is cumbersome. :|
>>
>>  I am currently trying to get the equations for the other three
>> covariance types, 'sphere', 'diag', and 'tied', in VBGMM. After digging into
>> the Wishart distribution, I think that for 'full' covariance the
>> approximating distribution is a Gaussian-Wishart distribution, but for
>> 'sphere' and 'diag' covariance it is not. In those cases, the multivariate
>> Gaussian distribution can be decomposed into a product of several univariate
>> Gaussian distributions. Therefore, we should use multiple Gaussian-Gamma
>> distributions for the approximation. I am working on that. I am also going
>> to start thinking about the API convention for all three models. Among the
>> API-related issues I listed in my proposal, I think 4429
>> <https://github.com/scikit-learn/scikit-learn/issues/4429> and 4062
>> <https://github.com/scikit-learn/scikit-learn/issues/4062> need more
>> discussion.
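
The decomposition mentioned above for the 'diag' case can be checked numerically. The snippet below is a minimal sketch, not part of the project code: the function names and the sample values are made up for illustration. It evaluates the log density of a Gaussian with diagonal covariance once from the multivariate formula and once as a sum of univariate log densities.

```python
import math


def log_gaussian_1d(x, mean, var):
    # Log density of a univariate Gaussian N(x | mean, var).
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_gaussian_diag(x, mean, variances):
    # Log density of N(x | mean, diag(variances)) from the multivariate
    # formula; for a diagonal covariance, log|Sigma| is the sum of the
    # log variances and the quadratic form is a sum of scaled squares.
    d = len(x)
    log_det = sum(math.log(v) for v in variances)
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + quad)


# Illustrative values only.
x = [0.3, -1.2, 2.0]
mean = [0.0, -1.0, 1.5]
variances = [0.5, 2.0, 1.0]

joint = log_gaussian_diag(x, mean, variances)
factored = sum(
    log_gaussian_1d(xi, mi, vi) for xi, mi, vi in zip(x, mean, variances)
)

# The two values agree up to floating-point error.
assert abs(joint - factored) < 1e-12
```

Because the density factorizes over dimensions, each dimension has its own mean and precision, which is why a separate Gaussian-Gamma distribution per dimension is a natural choice of approximating distribution here.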
>>
>>  To answer the common question 'what is a good outcome?', I would say
>> that, in priority order, the three models should 1) be implemented
>> correctly (mathematically), 2) have clean APIs, 3) pass test cases
>> (especially for the last two models), and 4) be benchmarked and tuned for
>> speed against the existing implementation.
>>
>>  Any comments are welcome.
>>
>>  BTW, I will keep using this thread for all follow-up work.
>>
>>  Cheers,
>> Wei Xue
>>
>
>
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
