Hi Immanuel,

My gut feeling about your project is that it is an interesting proposal,
but ideally a GSoC project should be more ambitious than a single
algorithm. You could consider the full application problem that the
algorithm is trying to solve and contribute a few different algorithms for it.
This is what Vlad did last year, with different matrix
factorization/dictionary learning algorithms, and it was very successful.

Thanks a lot for your proposal,

Gaël

On Tue, Mar 20, 2012 at 08:51:58PM +0100, Immanuel wrote:
> Hello all,

> I have followed the mailing list and poked around in the source code for the
> last couple of weeks.
> Now I'm absolutely sure that I would enjoy working on scikit-learn as
> a GSoC project.

> I especially like the proposed online NMF project. Could you enlighten
> me on the following points?

> There was some discussion about the integration of some NMF code in
> scikit-learn. How will
> this influence the proposed online NMF project?

> @Vlad
> Looks like we have the same interests; I like the robust PCA project too.
> Do you already have
> a preference? I guess it makes little sense to compete against you ;).

> @Olivier
> I did some preliminary reading on the topic and found the following
> paper interesting:
> "Efficient Document Clustering via Online Nonnegative Matrix Factorizations"
> source: http://research.microsoft.com/apps/pubs/default.aspx?id=143211

> It claims:
> * to efficiently handle very large-scale and/or streaming datasets
> * to have low memory consumption
> Different algorithm versions are presented in the paper. I don't know
> which one would be the most attractive for scikit-learn.


> Finally, some words about me:
> I'm a student at RWTH Aachen University (Germany) enrolled in
> Computational
> Engineering Science. I am currently writing my diploma thesis (master's
> equivalent) on
> a bioinformatics topic using machine learning techniques. I took classes
> in machine learning,
> optimization, stats, data-based modelling, etc. I worked as a student
> research assistant, doing implementations
> for different projects.

> best,
> Immanuel Bayer

> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days Click Here 
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    Laboratoire de Neuro-Imagerie Assistee par Ordinateur
    NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info
