On 03/21/2015 08:54 PM, Artem wrote:
Are there any objections to Joel's variant of y? It serves my needs,
but is quite different from what one can usually find in scikit-learn.
------
Another point I want to bring up is metric-aware KMeans. Currently it
works with Euclidean distance only, which is not a problem for a
Mahalanobis distance, but as (and if) we move towards kernel metrics,
it becomes impossible to transform the data in a way that the
Euclidean distance between the transformed points accurately reflects
the distance between the points in a space with the learned metric.
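To make the Mahalanobis case concrete, here is a minimal sketch. It
assumes the learned metric is a symmetric positive definite matrix M
(here just a random made-up one, not the output of any real learner).
Factoring M = L L^T and mapping x -> L^T x makes the squared Euclidean
distance equal the squared Mahalanobis distance, so plain KMeans can be
run on the mapped data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
X = rng.randn(100, 3)

# Hypothetical "learned" Mahalanobis matrix M (symmetric positive definite).
A = rng.randn(3, 3)
M = A.T @ A + 1e-3 * np.eye(3)

# Cholesky gives lower-triangular L with M = L L^T.
L = np.linalg.cholesky(M)

# Row-wise version of x -> L^T x.
X_t = X @ L

# Squared Mahalanobis distance between two points under M.
d = X[0] - X[1]
d_mahalanobis = d @ M @ d

# Squared Euclidean distance between the same points after the mapping.
d_euclidean = np.sum((X_t[0] - X_t[1]) ** 2)

# Ordinary (Euclidean) KMeans on the mapped data.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_t)
labels = km.labels_
```

For a kernel metric there is in general no such finite-dimensional
linear map, which is the problem described above.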
I think it'd be nice to have "non-linear" metrics, too. One of the
possible approaches (widely recognized among researchers on metric
learning) is to use KernelPCA before learning the metric. This would
work really well with sklearn's Pipelines.
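A rough sketch of what such a pipeline could look like. It assumes a
recent scikit-learn in which NeighborhoodComponentsAnalysis exists;
NCA is only a stand-in for whichever metric learner ends up being
used, and the dataset, kernel and parameter values are arbitrary
illustrations:

```python
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis
from sklearn.pipeline import make_pipeline

# Toy data that is not linearly separable.
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

pipe = make_pipeline(
    # Non-linear feature map (arbitrary kernel settings).
    KernelPCA(n_components=10, kernel="rbf", gamma=2.0),
    # Stand-in metric learner: learns a linear map in the KernelPCA space.
    NeighborhoodComponentsAnalysis(random_state=0),
    # Any distance-based estimator can consume the learned metric.
    KNeighborsClassifier(n_neighbors=3),
)
pipe.fit(X, y)
train_acc = pipe.score(X, y)
pred = pipe.predict(X[:5])
```

The metric learner itself stays linear; all the non-linearity comes
from the KernelPCA step in front of it.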
But not every method is justified for use with Kernel PCA. ITML, for
one, uses a special kind of regularization that breaks all theoretical
guarantees in that setting.
This can also be done using the Nystroem kernel approximation class,
which just transforms data into the subspace of the Hilbert space
spanned by the training examples (or a subset of these).
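As a sketch of the Nystroem route (kernel, gamma and the number of
components are arbitrary choices for illustration): the transformer
produces explicit features whose inner products approximate the
kernel, so any Euclidean estimator, KMeans included, can be applied to
them directly.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# Approximate RBF feature map built from a random subset of 50 training points.
nys = Nystroem(kernel="rbf", gamma=2.0, n_components=50, random_state=0)
Z = nys.fit_transform(X)

# Euclidean KMeans in the approximate kernel feature space.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(Z)
labels = km.labels_
```

With n_components equal to the number of training samples the basis is
the full training set; a smaller value uses a subset and trades
accuracy for speed.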
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general