> Do you think this interface would be useful enough?

One of the mentioned methods (LMNN) actually uses prior knowledge in exactly
the same way, by comparing labels for equality, though it was designed to
facilitate KNN.

The authors of the other one (ITML) explicitly mention in the paper that one
can construct those sets S and D from labels.
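
To make the label-based construction concrete, here is a minimal sketch of how
S and D could be built from a class vector y. The helper name
`pairs_from_labels` and the index-pair representation are assumptions made for
illustration; they are not taken from either paper.

```python
import numpy as np


def pairs_from_labels(y):
    """Return (S, D): index pairs with equal labels and with different labels.

    Hypothetical helper, only meant to illustrate the idea.
    """
    y = np.asarray(y)
    i, j = np.triu_indices(len(y), k=1)  # every unordered pair with i < j
    same = y[i] == y[j]
    S = np.column_stack((i[same], j[same]))    # "should be similar"
    D = np.column_stack((i[~same], j[~same]))  # "should be dissimilar"
    return S, D


y = np.array([0, 0, 1, 1])
S, D = pairs_from_labels(y)
print(S.tolist())  # [[0, 1], [2, 3]]
print(D.tolist())  # [[0, 2], [0, 3], [1, 2], [1, 3]]
```

The number of pairs grows quadratically with the number of samples, so a real
implementation would probably subsample them.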

> Do you think it would make sense to use such a transformer in a pipeline
> with a KNN classifier?
> I feel that training both on the same labels might be a bit of an issue
> with overfitting.

Pipelining looks like a good way to combine these methods, but overfitting
could indeed be a problem.
I'm not sure how severe it would be.
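
One rough way to gauge it would be to cross-validate the whole pipeline, so
that the metric is refit on each training fold and scored on held-out data.
Below is a sketch under loud assumptions: `WithinClassMetric` is a hypothetical
stand-in (the inverse pooled within-class covariance), not LMNN or ITML, and
the `reg` parameter is made up for numerical stability. It also shows the
square-root trick: with M PSD and L = M^(1/2), Euclidean distance on X @ L
equals the Mahalanobis distance under M.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


class WithinClassMetric(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for a metric learner (NOT LMNN or ITML)."""

    def __init__(self, reg=1e-6):
        self.reg = reg  # assumed ridge term so the covariance is invertible

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        # Centre every sample on its own class mean; labels are used only here.
        centred = np.vstack(
            [X[y == c] - X[y == c].mean(axis=0) for c in np.unique(y)]
        )
        cov = centred.T @ centred / len(X) + self.reg * np.eye(X.shape[1])
        M = np.linalg.inv(cov)  # the learned PSD metric matrix
        # Symmetric square root of M via its eigendecomposition.
        w, V = np.linalg.eigh(M)
        self.L_ = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
        return self

    def transform(self, X):
        # No labels needed: new data is simply mapped through L.
        return np.asarray(X, dtype=float) @ self.L_


X, y = load_iris(return_X_y=True)
pipe = make_pipeline(WithinClassMetric(), KNeighborsClassifier(n_neighbors=3))
scores = cross_val_score(pipe, X, y, cv=5)  # metric is refit inside each fold
print(scores.mean())
```

Comparing this score with that of a plain KNN on the same folds would give a
first idea of whether the learned metric helps or merely overfits.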

On Wed, Mar 18, 2015 at 10:07 PM, Andreas Mueller <t3k...@gmail.com> wrote:

>
> On 03/18/2015 02:53 PM, Artem wrote:
>
>  I mean that if we were solving classification, we would have y that
> tells us which class each example belongs to. So if we pass this
> classification's ground truth vector y to metric learning's fit, we can
> form S and D inside by saying that observations from the same class should
> be similar.
>
> Ah, I got it now.
>
>
>
>> Only being able to "transform" to a distance to the training set is a bit
>> limiting
>
> Sorry, I don't understand what you mean by this. Can you elaborate?
>
>
>  The metric does not memorize training samples; it finds a (linear unless
> kernelized) transformation that makes similar examples cluster together.
> Moreover, since the metric is completely determined by a PSD matrix, we can
> compute its square root and use it to transform new data without any
> supervision.
>
> Ah, I think I misunderstood your proposal for the transformer interface.
> Never mind.
>
>
> Do you think this interface would be useful enough? I can think of a
> couple of applications.
> It would definitely fit well into the current scikit-learn framework.
>
> Do you think it would make sense to use such a transformer in a pipeline
> with a KNN classifier?
> I feel that training both on the same labels might be a bit of an issue
> with overfitting.
>
>
>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming The Go Parallel Website,
> sponsored
> by Intel and developed in partnership with Slashdot Media, is your hub for
> all
> things parallel software development, from weekly thought leadership blogs
> to
> news, videos, case studies, tutorials and more. Take a look and join the
> conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
