Re: [Scikit-learn-general] [GSoC2015 Improve GMM module]

Andreas Mueller Tue, 19 May 2015 08:09:07 -0700

Hey Wei Xue.
Thanks for posting the blog post!

I think you are right: for diag and tied you can just use gamma distributions, which makes everything easier. Olivier and Loic, it would be great if you found the time to comment on the blog post and future direction!


Thanks!
Andy

On 05/18/2015 04:04 PM, Wei Xue wrote:

Dear Olivier, Loic and group,
I feel very excited to be selected as a GSoC student this year. Thank you very much.
Following the timeline in my proposal, I have published the first post <http://xuewei4d.github.io/gsoc/2015/05/08/gsoc-prelude.html> introducing this project, i.e., 'Improve GMM module'.
My first step is to derive the updating functions for VBGMM for four types of covariance matrix, namely sphere, diag, tied, and full. Following PRML chapter 10 on variational inference, I have verified the updating functions 10.60-10.67 using a Gaussian-Wishart distribution as the approximating distribution. The derivation involving the Wishart distribution is cumbersome. :|
I am currently trying to get the equations for the other three covariance types, 'sphere', 'diag', and 'tied', in VBGMM. After digging into the Wishart distribution, I think that for 'full' covariance the approximate distribution is a Gaussian-Wishart distribution, but for 'sphere' and 'diag' covariance it is not. In this case, the multivariate Gaussian distribution can be decomposed into the product of several univariate Gaussian distributions. Therefore, we should use multiple Gaussian-Gamma distributions for the approximation. Working on that. Also, I am going to start thinking about the API convention for all three models. Among the API-related issues I listed in my proposal, I think 4429 <https://github.com/scikit-learn/scikit-learn/issues/4429> and 4062 <https://github.com/scikit-learn/scikit-learn/issues/4062> need more discussion.
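As a quick sanity check of the decomposition mentioned above, here is a minimal sketch (plain Python, standard library only) showing that the log density of a Gaussian with diagonal covariance equals the sum of univariate Gaussian log densities, one per dimension. The function names and the sample numbers are made up for illustration; this is not code from scikit-learn or from the blog post.

```python
import math


def univariate_gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian N(x | mean, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def diag_gaussian_logpdf(x, mean, variances):
    """Log density of N(x | mean, diag(variances)), computed from the
    multivariate formula: log-determinant plus Mahalanobis distance."""
    d = len(x)
    log_det = sum(math.log(v) for v in variances)
    mahalanobis = sum((xi - mi) ** 2 / vi
                      for xi, mi, vi in zip(x, mean, variances))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + mahalanobis)


# Hypothetical 3-dimensional example values (illustration only).
x = [0.5, -1.2, 2.0]
mean = [0.0, -1.0, 1.5]
variances = [1.0, 0.25, 4.0]

# 'diag' case: one variance per dimension.
joint = diag_gaussian_logpdf(x, mean, variances)
factored = sum(univariate_gaussian_logpdf(xi, mi, vi)
               for xi, mi, vi in zip(x, mean, variances))

# 'sphere' case: the same variance shared by every dimension.
shared_var = 0.5
joint_sphere = diag_gaussian_logpdf(x, mean, [shared_var] * len(x))
factored_sphere = sum(univariate_gaussian_logpdf(xi, mi, shared_var)
                      for xi, mi in zip(x, mean))

print(math.isclose(joint, factored))
print(math.isclose(joint_sphere, factored_sphere))
```

Because the likelihood factorizes per dimension in this way, each dimension's mean and precision can be treated as a univariate pair, which is the motivation for considering Gaussian-Gamma rather than Gaussian-Wishart approximations in these cases.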
To answer the common question 'what is a good outcome?', I would like to say that, in priority order, the three models should 1) be implemented correctly (in math), 2) have clean APIs, 3) pass test cases (especially for the last two models), and 4) be benchmarked and speed-tuned with respect to the existing implementation.
Any comment is welcome.

BTW, I will keep this thread for all the following work.

Cheers,
Wei Xue


------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

