Something like this:

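from sklearn.base import TransformerMixin
from sklearn.metrics.pairwise import euclidean_distances

class SimilarityTransformer(TransformerMixin):
    def fit(self, X, y=None):
        # Remember the training samples; transform() compares against them.
        self.X_ = X
        return self

    def transform(self, X):
        # Negated distances, so larger values mean "more similar".
        return -euclidean_distances(X, self.X_)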
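
To make the pipeline use case concrete, here is a rough, self-contained sketch of how such a transformer could feed a clusterer that accepts a precomputed [n_samples, n_samples] affinity matrix. The toy data, the step names, and the choice of AffinityPropagation are assumptions made for illustration only; in the metric learning setting, a learned transformer would sit in front of the similarity step.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import euclidean_distances
from sklearn.pipeline import Pipeline


class SimilarityTransformer(BaseEstimator, TransformerMixin):
    """Map X to negated Euclidean distances to the training samples."""

    def fit(self, X, y=None):
        # Keep the training samples; transform() measures against them.
        self.X_ = np.asarray(X)
        return self

    def transform(self, X):
        # Shape: [n_samples in X, n_samples seen during fit].
        return -euclidean_distances(X, self.X_)


# Toy data, made up for this sketch: two well-separated groups of points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Step names are arbitrary. AffinityPropagation is used here because it
# accepts negative similarities when affinity="precomputed".
pipe = Pipeline([
    ("similarity", SimilarityTransformer()),
    ("cluster", AffinityPropagation(affinity="precomputed", random_state=0)),
])
labels = pipe.fit_predict(X)

# The intermediate similarity matrix, computed directly for inspection.
S = SimilarityTransformer().fit(X).transform(X)
```

Since the values are negated distances, they are never positive, so this would not be a valid affinity for spectral clustering as is; there one would first apply something like an RBF kernel to the distances.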

On Thu, Mar 26, 2015 at 6:28 PM, Artem <barmaley....@gmail.com> wrote:

> Yes, the only need for such similarity learners is to use them in a
> pipeline. It's especially convenient if one wants to do non-linear metric
> learning using the Kernel PCA trick. Then it'd be just another step in the
> pipeline.
>
> What do you mean by a generic transformer? In order to be usable in a
> pipeline, it needs to be fit-able. Do you mean a wrapper like
> OneVsRestClassifier?
>
> The reason I included similarities is that I want to bring some
> supervision into clustering by introducing a meaningful metric. AFAIK, at
> the moment only `AgglomerativeClustering` works well with a custom metric,
> while Spectral Clustering and Affinity Propagation can work with a
> [n_samples, n_samples] affinity matrix.
>
> On Thu, Mar 26, 2015 at 12:08 PM, Mathieu Blondel <math...@mblondel.org>
> wrote:
>
>>
>>
>> On Thu, Mar 26, 2015 at 5:49 PM, Artem <barmaley....@gmail.com> wrote:
>>
>>> 1. Right, forgot to add that parameter. Well, I can apply an RBF kernel
>>> to get a similarity matrix from a distance matrix inside transform.
>>>
>>> 2. A usual transformer returns neither a distance nor a similarity; it
>>> transforms the input space so that the usual Euclidean distance acts
>>> like the learned Mahalanobis distance.
>>>
>>
>> I'd really try to avoid duplicating all classes. As you said, the
>> Euclidean distance can be used on the transformed data. So we can get a
>> similarity matrix in just two lines:
>>
>> X_transformed = LMNN().fit_transform(X, y)
>> S = -euclidean_distances(X_transformed)
>>
>> The only benefit I see of being able to transform to a similarity matrix
>> is for pipelines. This can be done as I said using a generic transformer X
>> -> S. However, I am not completely sure this is even needed since all our
>> algorithms work on X of shape [n_samples, n_features] by default.
>>
>> M.
>>
>>
>>
>
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
