Hi everyone,

I don't know a lot about scikit-learn but perhaps I can help answer some 
of the questions about metric learning:

- As someone mentioned, any Mahalanobis distance metric can be used to 
linearly project the data into a new space (via a square root of the 
learned PSD matrix) in which the Euclidean distance is equivalent to 
the learned metric. This can be used as a transformer in scikit-learn.
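
For instance, a minimal sketch of that transformer step (my own 
illustration; it assumes numpy and an already-learned PSD matrix M, 
and uses an eigendecomposition-based square root, though a Cholesky 
factor would work just as well):

    import numpy as np

    def mahalanobis_transform(X, M):
        """Map X so that Euclidean distances in the new space equal
        the Mahalanobis distances induced by the PSD matrix M."""
        w, V = np.linalg.eigh(M)   # M = V diag(w) V.T (M symmetric PSD)
        w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
        L = V * np.sqrt(w)         # L = V diag(sqrt(w)), so L @ L.T == M
        return X @ L               # Euclidean distances on X @ L match
                                   # the Mahalanobis ones under M

Wrapped in a class with fit/transform, this is all such a transformer 
needs to do.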

- LMNN, NCA and ITML are indeed the most standard algorithms in metric 
learning and work well in practice (although they may not scale too 
well). Starting with these makes sense.

- Someone said it would be nice to have a more scalable method. I would 
recommend OASIS 
(http://www.jmlr.org/papers/volume11/chechik10a/chechik10a.pdf), which 
scales well to large datasets due to its simple online algorithm. Note 
that it learns a similarity function, not a distance. However, it can 
still be used to transform the data if the learned matrix is first 
projected onto the PSD cone; the dot product in the new space is then 
equivalent to the learned similarity (see the discussion in Section 6 
of the paper). Other popular online methods include LEGO 
(https://www.cs.utexas.edu/~pjain/pubs/online_nips.pdf) and RDML 
(http://www.cse.msu.edu/~rongjin/publications/nips10-dist-learn.pdf).
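
The PSD projection itself is simple: symmetrize the matrix and clip 
its negative eigenvalues. A sketch (mine, assuming numpy):

    import numpy as np

    def project_to_psd(W):
        """Project a square matrix W onto the PSD cone
        (Frobenius-norm projection, after symmetrizing)."""
        S = (W + W.T) / 2.0        # nearest symmetric matrix
        w, V = np.linalg.eigh(S)
        w = np.clip(w, 0.0, None)  # drop the negative eigenvalues
        return (V * w) @ V.T       # rebuild V diag(w) V.T

The projected matrix can then be factored as in the first point above, 
so that dot products in the transformed space match the learned 
similarity.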

- The KPCA trick is a convenient way to make a metric learning 
algorithm nonlinear. The theoretical justification does not hold for 
all algorithms, but in practice it is simply a preprocessing step 
applied to the data before running the metric learning algorithm, so 
it can be combined with any method. There are also methods that 
directly learn a nonlinear distance, for instance GB-LMNN 
(http://www-bcf.usc.edu/~feisha/pubs/chi2.pdf) or some approaches 
based on deep neural nets.
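
Concretely, the KPCA trick is just a two-step pipeline. A sketch 
(assuming the external metric-learn package for LMNN; any Mahalanobis 
learner with a fit/transform interface would slot in the same way):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import KernelPCA
    from sklearn.pipeline import make_pipeline
    from metric_learn import LMNN  # assumes metric-learn is installed

    X, y = load_iris(return_X_y=True)

    # Nonlinear embedding first, then a linear (Mahalanobis) metric
    # is learned in the embedded space.
    pipe = make_pipeline(
        KernelPCA(n_components=10, kernel="rbf"),
        LMNN(),
    )
    X_new = pipe.fit_transform(X, y)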

Aurélien

On 3/23/15 11:43 PM, Andreas Mueller wrote:
> Hi Artem.
> I thought that was you, but I wasn't sure.
> Great, I linked to your draft from the wiki overview page, otherwise it
> is hard to find.
> I haven't looked at it in detail yet, though.
>
> 1.1: no, generalizing K-Means is out of scope. Hierarchical should work
> with arbitrary metrics.
> 1.2: matrix-like Y should actually be fine with cross-validation. I
> think it would be nice if we could get some benefit by having a
> classification-like y, but I'm not opposed to also allowing matrix Y.
>
> 2. I'd have to look into it. I don't understand why KPCA wouldn't work.
> It should work for all metrics, right? Having something produce a
> similarity matrix is not ideal, but I think it could be made to work.
> I'd still call it ``transform`` probably, though. It would be a bit
> confusing because it uses the squared transform, but it would make it
> possible to build pipelines with clustering algorithms.
>
> Best,
> Andy
>
>
> On 03/23/2015 06:31 PM, Artem wrote:
>> Hi Andreas
>>
>> My GitHub name is Barmaley-exe. I put a draft
>> <https://github.com/scikit-learn/scikit-learn/wiki/%5BWIP%5D-GSoC-2015-Proposal:-Metric-Learning-module>
>> of my proposal on the wiki, but there are still several unanswered
>> questions:
>>
>>  1. One of the applications of metric learning I envision is
>>     "somewhat-supervised" clustering, where the user can seed in some
>>     knowledge and then use the resultant metric for clustering. To get
>>     it working, the following is needed:
>>      1. DistanceMetric-aware clustering. It turns out there are
>>         already methods that can do clustering on a similarity matrix,
>>         but should I generalize KMeans / hierarchical clustering?
>>      2. The general training scheme would require a matrix-like y
>>         (like the one proposed by Joel). What is the consensus on that?
>>  2. Though 2 of the 3 methods planned for implementation are
>>     kernelizable via KPCA, the last one (ITML) is not. So if I
>>     implemented it with a kernel trick, it would be impossible to
>>     transform the data space, and thus it would not work as a
>>     Transformer. This could be fixed by making it not a Transformer
>>     but an Estimator that predicts a similarity matrix. What do you
>>     think?
>>
>>
>> On Tue, Mar 24, 2015 at 1:09 AM, Andreas Mueller <t3k...@gmail.com> wrote:
>>
>>     Hi Artem.
>>     I think the overall feedback on your proposal was positive.
>>     Did you get the chance to write it up yet?
>>     Please submit your proposal on melange
>>     https://www.google-melange.com (deadline is this Friday)
>>     and mention / link it in our wiki:
>>     
>> https://github.com/scikit-learn/scikit-learn/wiki/Google-summer-of-code-%28GSOC%29-2015
>>
>>     Btw, what is your github name?
>>
>>     Andy
>>
>>     On 03/18/2015 08:39 AM, Artem wrote:
>>>     Hello everyone
>>>
>>>     Recently I mentioned metric learning as one of the possible
>>>     projects for this year's GSoC, and I would like to hear your
>>>     comments.
>>>
>>>     Metric learning, as the name suggests, is about learning
>>>     distance functions. Usually the learned metric is a Mahalanobis
>>>     metric, so the problem reduces to finding a PSD matrix A that
>>>     minimizes some functional.
>>>
>>>     Metric learning is usually done in a supervised way, that is, a
>>>     user tells which points should be closer and which should be more
>>>     distant. This can be expressed either in the form of "similar" /
>>>     "dissimilar" pairs, or as triplets like "A is closer to B than
>>>     to C".
>>>
>>>     Since metric learning is (mostly) about a PSD matrix A, one can
>>>     apply a Cholesky decomposition to it to obtain a matrix G that
>>>     transforms the data. This could lead to something like guided
>>>     clustering, where we first transform the data space according to
>>>     our prior knowledge of similarity.
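>>>
>>>     In code that is just (a sketch: A is the learned matrix, assumed
>>>     strictly positive definite here since a plain Cholesky
>>>     factorization fails on singular matrices, and X is the data):
>>>
>>>         import numpy as np
>>>
>>>         G = np.linalg.cholesky(A)  # A = G @ G.T, G lower-triangular
>>>         X_new = X @ G              # Euclidean distances on X_new
>>>                                    # equal Mahalanobis distances
>>>                                    # under A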
>>>
>>>     Metric learning seems to be quite an active field of research ([1
>>>     <http://www.icml2010.org/tutorials.html>], [2
>>>     <http://www.ariel.ac.il/sites/ofirpele/DFML_ECCV2010_tutorial/>],
>>>     [3 <http://nips.cc/Conferences/2011/Program/event.php?ID=2543>]).
>>>     There are two somewhat up-to-date surveys: [1
>>>     <http://web.cse.ohio-state.edu/%7Ekulis/pubs/ftml_metric_learning.pdf>]
>>>     and [2 <http://arxiv.org/abs/1306.6709>].
>>>
>>>     The top 3 most-cited methods (according to Google Scholar) seem
>>>     to be:
>>>
>>>       * MMC by Xing et al.
>>>         <http://papers.nips.cc/paper/2164-distance-metric-learning-with-application-to-clustering-with-side-information.pdf>
>>>         This is a pioneering work and, according to survey #2:
>>>
>>>             The algorithm used to solve (1) is a simple projected
>>>             gradient approach requiring the full eigenvalue
>>>             decomposition of M at each iteration. This is typically
>>>             intractable for medium and high-dimensional problems.
>>>
>>>       * Large Margin Nearest Neighbor (LMNN) by Weinberger et al.
>>>         <http://papers.nips.cc/paper/2795-distance-metric-learning-for-large-margin-nearest-neighbor-classification.pdf>
>>>         Survey #2 acknowledges this method as "one of the most
>>>         widely-used Mahalanobis distance learning methods":
>>>
>>>             LMNN generally performs very well in practice, although
>>>             it is sometimes prone to overfitting due to the absence
>>>             of regularization, especially in high dimension
>>>
>>>       * Information-theoretic metric learning (ITML) by Davis et al.
>>>         <http://dl.acm.org/citation.cfm?id=1273523> This one features
>>>         a special kind of regularizer called LogDet.
>>>       * There are many other methods; if you know of other methods
>>>         that work well, let me know.
>>>
>>>
>>>     So the project I'm proposing is to implement the 2nd or 3rd (or
>>>     both?) algorithm, along with a relevant transformer.
