There was a discussion
<http://www.mail-archive.com/scikit-learn-general@lists.sourceforge.net/msg06931.html>
on metric learning a while ago, and several people expressed interest in
seeing it in sklearn (and in contributing to it). But it looks like that
attempt didn't get anywhere.

What about a project to add several metric learning algorithms to be used
with KNN (NCA, ITML, LMNN, etc.)? Another application is data
transformation: most of the methods learn some PSD matrix A, so we can
transform the data by multiplying it by A^{-1/2}.
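
To make the transformation idea concrete, here is a rough NumPy sketch (not
a proposed sklearn API). It assumes A is strictly positive definite, since
a merely PSD matrix with zero eigenvalues has no inverse, and it assumes the
convention where the learned squared distance is (x - y)^T A^{-1} (x - y),
so that the matching linear map is A^{-1/2}. Under the other common
convention, (x - y)^T A (x - y), the map would be A^{1/2} instead. The
helper name inv_sqrt_psd and the random matrix standing in for a learned A
are made up for illustration.

```python
import numpy as np


def inv_sqrt_psd(A, eps=1e-12):
    """Return A^{-1/2} for a symmetric, strictly positive-definite A.

    Hypothetical helper, not part of scikit-learn.
    """
    # eigh is meant for symmetric matrices and returns real eigenvalues.
    w, V = np.linalg.eigh(A)
    if np.any(w <= eps):
        raise ValueError("A must be strictly positive definite")
    # V diag(w^{-1/2}) V^T; dividing scales column j of V by sqrt(w[j]).
    return (V / np.sqrt(w)) @ V.T


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Stand-in for a matrix produced by some metric learner (assumption).
B = rng.normal(size=(3, 3))
A = B @ B.T + 3.0 * np.eye(3)

W = inv_sqrt_psd(A)
X_t = X @ W  # W is symmetric, so this applies A^{-1/2} to every row

# Euclidean distance after the transform equals the distance defined by A.
d = X[0] - X[1]
maha_sq = d @ np.linalg.inv(A) @ d
eucl_sq = np.sum((X_t[0] - X_t[1]) ** 2)
```

After this, any plain Euclidean KNN run on X_t behaves like KNN under the
learned metric on X, which is why the same matrix can serve both use cases.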

The starting point could be the ICML 2010 tutorial
<http://www.slideshare.net/zukun/metric-learning-icml2010-tutorial> by
Brian Kulis, which gives an overview of the field.

On Tue, Feb 24, 2015 at 5:48 PM, Andy <t3k...@gmail.com> wrote:

> Hey Everybody.
>
> Here is my somewhat consolidated list of ideas with minor comments.
> If anything is missing, please let me know. Also, I don't think people
> who want to mentor spoke up yet.
> I'll remove all people listed on the wiki as they were copy and pasted
> from last year, and I'd rather have actual confirmation.
>
> Topics:
> DPGMM / VBGMM:  need to be reimplemented using more standard variational
> updates. The GMM is actually fine atm (after a couple of pending PRs)
>
> spearmint : Using random forest (they actually use ours) for
> hyperparameter optimization. I need to mull this over but I think this
> should be easy enough and pretty helpful.
>
> Online low-rank matrix completion : this is from last year and I'm not
> sure if it is still desirable / don't know the state of the PR
>
> Multiple metric support : This is somewhat API heavy but I think
>
> PLS/CCA : They need love so very much, but I'm not sure we have a mentor
> (if there is one, please speak up!)
>
> Ensemble Clusters : Proposed by a possible student (Milton) but I think
> it is interesting.
>
> Semi-Supervised Learning : Meta-estimator for self-taught learning. Not
> sure if there is actually much demand for it, but would be nice.
>
> Additive models: Proposed by ragv, but I'm actually not that sold. We
> could include pyearth, but I'm not sure how valuable the other methods
> are. Including a significant number of algorithms just for completeness
> is not something I feel great about.
>
>
> That being said, ragv has put in a tremendous amount of great work and I
> feel we should definitely find a project for him (as he seems interested).
>
>
> Things that I think shouldn't be GSOC projects:
>
> GPs : Jan Hendrik is doing an awesome job there.
> MLP : Will be finished soon, either by me or possibly by ragv
> data-independent cross-validation : already a bunch of people working on
> that, I don't think we should make it GSOC.
>
> Feedback welcome.
>
> Andy
>
>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming The Go Parallel Website,
> sponsored
> by Intel and developed in partnership with Slashdot Media, is your hub for
> all
> things parallel software development, from weekly thought leadership blogs
> to
> news, videos, case studies, tutorials and more. Take a look and join the
> conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>