Hi Wei Xue.
I am also not very convinced by the core-set approach.
I'd rather focus on improving the API and fixing issues in the VBGMM and
DPGMM.
I was hoping that Murphy's book would have some more details on DPGMM, but I
haven't found any yet. He doesn't seem to cover variational inference
for Dirichlet processes.
So far I think your proposal looks solid.
It would be great if you could work on some pull requests to support
your application.
Best,
Andy
On 03/16/2015 04:23 PM, Wei Xue wrote:
Hi group,
I am a PhD student at Florida International University, US. I am
interested in the topic of improving GMM, and I have drafted a proposal for it:
https://github.com/xuewei4d/scikit-learn/wiki/GSoC-2015-Proposal:-Improve-GMM
Here are some questions I would like to discuss.
1. -1 for coreset. The
paper (http://las.ethz.ch/files/feldman11scalable-long.pdf) is new and
has fewer than 15 citations. Its target use cases are clusters and
streaming data, which (I think) are rare for scikit-learn.
2. So far, I have gone over the Approximate Inference chapter in
PRML (Bishop's machine learning book) and Blei's 2006 paper. But I
have not dug much into the code, so I don't have a detailed
reimplementation plan yet. Do I need to add more details to the 'Theory
and Implementation' part of the proposal?
3. Any feedback is welcome.
Thanks,
Wei Xue
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general