Dear Olivier, Loic and group,
I am very excited to have been selected as a GSoC student this year. Thank
you very much.
Following the timeline in my proposal, I have published the first post
<http://xuewei4d.github.io/gsoc/2015/05/08/gsoc-prelude.html> introducing
this project, i.e., 'Improve GMM module'.
My first step is to derive the update equations for VBGMM for the four
types of covariance matrix, namely spherical, diag, tied, and full.
Following PRML chapter 10 (variational inference), I have verified the
update equations 10.60-10.67, which use a Gaussian-Wishart distribution as
the approximating distribution. The derivation involving the Wishart
distribution is cumbersome. :|
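To make the difference between the four covariance types concrete, here is
a minimal sketch that counts the free covariance parameters each type needs.
The helper names and the shapes are assumptions chosen for illustration
only; they are not taken from the scikit-learn API.

```python
def covariance_shape(covariance_type, n_components, n_features):
    """Array shape of the covariance parameters (hypothetical helper)."""
    shapes = {
        # one shared variance per component
        "spherical": (n_components,),
        # one variance per feature and per component
        "diag": (n_components, n_features),
        # a single full matrix shared by all components
        "tied": (n_features, n_features),
        # one full matrix per component
        "full": (n_components, n_features, n_features),
    }
    return shapes[covariance_type]


def n_free_covariance_params(covariance_type, n_components, n_features):
    """Number of free covariance parameters (hypothetical helper)."""
    # A symmetric d x d matrix has d * (d + 1) / 2 free entries.
    symmetric = n_features * (n_features + 1) // 2
    counts = {
        "spherical": n_components,
        "diag": n_components * n_features,
        "tied": symmetric,
        "full": n_components * symmetric,
    }
    return counts[covariance_type]


for name in ("spherical", "diag", "tied", "full"):
    print(name, covariance_shape(name, 3, 2), n_free_covariance_params(name, 3, 2))
```

Only 'tied' and 'full' carry off-diagonal entries, which is why they are the
cases where a Wishart-type distribution over a full matrix comes into play.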
I am currently trying to derive the equations for the other three
covariance types, 'spherical', 'diag', and 'tied', in VBGMM. After digging
into the Wishart distribution, I think that for 'full' covariance the
approximating distribution is a Gaussian-Wishart distribution, but for
'spherical' and 'diag' covariance it is not. In these cases, the
multivariate Gaussian distribution can be decomposed into a product of
several univariate Gaussian distributions. Therefore, we should use
multiple Gaussian-Gamma distributions for the approximation. I am working
on that. I am also going to start thinking about the API conventions for
all three models. Among the API-related issues I listed in my proposal, I
think 4429
<https://github.com/scikit-learn/scikit-learn/issues/4429> and 4062
<https://github.com/scikit-learn/scikit-learn/issues/4062> need more
discussion.
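As a quick numerical check of the factorization argument for diagonal
covariance, the sketch below compares the log-density of a multivariate
Gaussian with a diagonal covariance matrix against the sum of univariate
Gaussian log-densities. The function names and the example numbers are made
up for illustration; this is not code from the GMM module.

```python
import numpy as np


def full_gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian with a general covariance."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)


def univariate_gaussian_logpdf(x, mean, var):
    """Element-wise log-density of independent univariate Gaussians."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)


# Arbitrary example values (assumptions for the demonstration).
mean = np.array([0.0, 1.0, -2.0])
variances = np.array([0.5, 1.0, 2.0])
x = np.array([0.3, 0.7, -1.5])

joint = full_gaussian_logpdf(x, mean, np.diag(variances))
factored = univariate_gaussian_logpdf(x, mean, variances).sum()

# The two values agree up to floating-point error.
print(joint, factored)
```

Because the density factorizes over dimensions, a separate conjugate prior
can be placed on each mean/precision pair, which is the motivation for the
Gaussian-Gamma choice in the 'spherical' and 'diag' cases.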
To answer the common question 'what is a good outcome?', I would say that,
in priority order, the three models should 1) be implemented correctly
(mathematically), 2) have clean APIs, 3) pass test cases (especially
for the last two models), and 4) be benchmarked and tuned for speed against
the existing implementation.
Any comments are welcome.
BTW, I will keep using this thread for all the following work.
Cheers,
Wei Xue
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general