Hey Wei Xue.
Thanks for posting the blog post!
I think you are right: for diag and tied you can just use gamma
distributions, which makes everything easier.
Olivier and Loic, it would be great if you could find the time to comment on
the blog post and the future direction!
Thanks!
Andy
On 05/18/2015 04:04 PM, Wei Xue wrote:
Dear Olivier, Loic and group,
I feel very excited to be selected as a GSoC student this year. Thank
you very much.
Following the timeline in my proposal, I have published the first post
<http://xuewei4d.github.io/gsoc/2015/05/08/gsoc-prelude.html>
introducing this project, i.e., 'Improve GMM module'.
My first step is to derive the update equations for VBGMM for the four
types of covariance matrix, namely sphere, diag, tied, and full.
Following PRML chapter 10 (variational inference), I have verified the
update equations 10.60-10.67, which use a Gaussian-Wishart distribution
as the approximating distribution. The derivation involving the Wishart
distribution is cumbersome. :|
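For readers following along, here is a rough NumPy sketch of the
Gaussian-Wishart parameter updates for the 'full' case, written from my
reading of PRML 10.60-10.63 (please check against the book). The function
name, argument names, and the small constant that keeps the counts positive
are my own assumptions, not part of scikit-learn.

```python
import numpy as np

def update_gaussian_wishart(X, resp, beta0, m0, W0_inv, nu0):
    """Hypothetical helper: posterior parameters for each component.

    X: (n_samples, n_features); resp: (n_samples, n_components).
    Returns beta (K,), m (K, D), W_inv (K, D, D), nu (K,).
    """
    Nk = resp.sum(axis=0) + 1e-10           # effective counts, kept > 0
    xbar = resp.T @ X / Nk[:, None]         # weighted means, shape (K, D)
    beta = beta0 + Nk                       # cf. PRML 10.60
    m = (beta0 * m0 + Nk[:, None] * xbar) / beta[:, None]  # cf. 10.61
    nu = nu0 + Nk                           # cf. 10.63
    K, D = xbar.shape
    W_inv = np.empty((K, D, D))
    for k in range(K):
        diff = X - xbar[k]
        # N_k * S_k: responsibility-weighted scatter around xbar_k
        NkSk = (resp[:, k, None] * diff).T @ diff
        d0 = (xbar[k] - m0)[:, None]
        # cf. 10.62: prior scale + scatter + shrinkage toward the prior mean
        W_inv[k] = W0_inv + NkSk + (beta0 * Nk[k] / beta[k]) * (d0 @ d0.T)
    return beta, m, W_inv, nu

# Tiny example: one component that owns both points.
X = np.array([[0.0, 0.0], [2.0, 0.0]])
resp = np.ones((2, 1))
beta, m, W_inv, nu = update_gaussian_wishart(
    X, resp, beta0=1.0, m0=np.zeros(2), W0_inv=np.eye(2), nu0=2.0)
```

The loop over components is kept for readability; it is not tuned for
speed.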
I am currently trying to get the equations for the other three
covariance types, 'sphere', 'diag', and 'tied', in VBGMM. After digging
into the Wishart distribution, I think that for 'full' covariance the
approximating distribution is a Gaussian-Wishart distribution, but for
'sphere' and 'diag' covariance it is not. In these cases, the
multivariate Gaussian distribution can be decomposed into the
product of several univariate Gaussian distributions. Therefore, we
should use multiple Gaussian-Gamma distributions for the approximation.
Working on that. I am also going to start thinking about the API
conventions for all three models. Among the API-related issues I listed
in my proposal, I think 4429
<https://github.com/scikit-learn/scikit-learn/issues/4429> and 4062
<https://github.com/scikit-learn/scikit-learn/issues/4062> need more
discussion.
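As a quick numerical illustration of the 'diag' case discussed above, the
sketch below checks that a multivariate Gaussian with a diagonal covariance
has the same log density as the sum of independent univariate Gaussian log
densities, which is what motivates one Gaussian-Gamma factor per dimension.
The helper names and the random test values are only illustrative
assumptions.

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian with a general covariance."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)      # squared Mahalanobis distance
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha)

def univariate_logpdf(x, mean, var):
    """Elementwise log density of independent univariate Gaussians."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

rng = np.random.default_rng(0)
D = 4
mean = rng.normal(size=D)
variances = rng.uniform(0.5, 2.0, size=D)   # diagonal of the covariance
x = rng.normal(size=D)

joint = mvn_logpdf(x, mean, np.diag(variances))
factored = univariate_logpdf(x, mean, variances).sum()
print(np.isclose(joint, factored))
```

The same check fails for a covariance with non-zero off-diagonal entries,
which is why the 'full' case still needs the Wishart distribution.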
To answer a common question 'what is a good outcome?', I would like to
say that, in priority order, the three models should 1) be implemented
correctly (in math), 2) have clean APIs, 3) pass test cases
(especially for the last two models), 4) be benchmarked and speed-tuned
against the existing implementation.
Any comment is welcome.
BTW, I will keep using this thread for all follow-up work.
Cheers,
Wei Xue
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general