Hi Wei Xue.
I am also not very convinced by the core-set approach.
I'd rather focus on improving the API and fixing issues in VBGMM and DPGMM. I was hoping that Murphy's book would have more details on DPGMM, but I haven't found any yet. He doesn't seem to talk about variational inference in Dirichlet processes.

So far I think your proposal looks solid.
It would be great if you could work on some pull requests to support your application.

Best,
Andy


On 03/16/2015 04:23 PM, Wei Xue wrote:
Hi all,

I am a PhD student at Florida International University, US. I am interested in the "Improve GMM" topic and have drafted a proposal for it.
https://github.com/xuewei4d/scikit-learn/wiki/GSoC-2015-Proposal:-Improve-GMM

Here are some questions I would like to discuss.

1. -1 for coreset. The paper (http://las.ethz.ch/files/feldman11scalable-long.pdf) is recent and has fewer than 15 citations. Its target applications are clusters and streaming data, which (I think) are rare use cases for scikit-learn.

2. So far, I have gone over the Approximate Inference chapter in PRML (Bishop's machine learning book) and Blei's 2006 paper. But I have not dug much into the code, so I don't have a detailed reimplementation plan yet. Do I need to add more details to the 'Theory and Implementation' part of the proposal?

3. Any feedback is welcome.

Thanks,
Wei Xue


------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

