On 21 September 2011 09:50, Gael Varoquaux <[email protected]> wrote:
> On Tue, Sep 20, 2011 at 04:25:58PM -0700, Jacob VanderPlas wrote:
> > I recently was contacted by someone interested in using manifold
> > learning methods on abstract metric spaces: that is, the training data
> > is a matrix of pairwise distances rather than a set of points. It would
> > be fairly straightforward to implement this for basic LLE and Isomap,
> > and could probably be done for the other manifold methods as well. Two
> > questions:
>
> Just a quick answer from someone who does too many things:
>
> - It is a general pattern that can be found with many other algorithms,
> therefore I think that it should be in the scikit
>
> - I don't know what the right interface is, but the problem pops up in
> many different places in the scikit, and we should give it some
> thought.
>
> my 2 cents,
>
> G
>
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure contains a
> definitive record of customers, application performance, security
> threats, fraudulent activity and more. Splunk takes this data and makes
> sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2dcopy1
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
I really like the metric='precomputed' concept: the same parameter accepts
either the name of an actual metric (euclidean, manhattan) or 'precomputed',
in which case the input is treated as a precomputed distance array. If an
algorithm can't support it for whatever reason*, it should throw an error.
The same interface works for kernels as well.

* k-means springs to mind: it's only 'proven' for Euclidean distance, which
means it should raise an error if anything else is passed to it. I have an
implementation that works solely from a distance matrix, but I don't know
whether it retains the properties of the base algorithm.
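
To make the idea concrete, here is a rough sketch of how I picture the
interface. It is not existing scikit code: the names
pairwise_distances_sketch, _METRICS and EuclideanOnlyEstimator are made up
for illustration, and it only assumes NumPy.

```python
import numpy as np

# Hypothetical registry of named metrics (illustrative, not a scikit API).
# Each function broadcasts over the leading axes and reduces the last one.
_METRICS = {
    "euclidean": lambda a, b: np.sqrt(((a - b) ** 2).sum(axis=-1)),
    "manhattan": lambda a, b: np.abs(a - b).sum(axis=-1),
}


def pairwise_distances_sketch(X, metric="euclidean"):
    """Return an (n, n) distance matrix.

    With metric='precomputed', X is taken to already be the distance
    matrix and is only validated; otherwise X is an (n, d) array of points.
    """
    X = np.asarray(X, dtype=float)
    if metric == "precomputed":
        if X.ndim != 2 or X.shape[0] != X.shape[1]:
            raise ValueError("precomputed distances must be a square matrix")
        return X
    if metric not in _METRICS:
        raise ValueError("unknown metric: %r" % (metric,))
    # Broadcast (n, 1, d) against (1, n, d) to get all pairs at once.
    return _METRICS[metric](X[:, None, :], X[None, :, :])


class EuclideanOnlyEstimator:
    """Toy stand-in for an algorithm (such as k-means) that should refuse
    anything other than Euclidean distance."""

    def __init__(self, metric="euclidean"):
        if metric != "euclidean":
            raise ValueError(
                "this estimator only supports metric='euclidean', got %r"
                % (metric,)
            )
        self.metric = metric
```

A caller with raw points passes metric='euclidean' or 'manhattan'; a caller
who only has distances passes the square matrix with metric='precomputed',
and an algorithm that can't cope raises at construction time.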
--
My public key can be found at: http://pgp.mit.edu/
Search for this email address and select the key from "2011-08-19" (key id:
54BA8735)
Older keys can be used, but please inform me beforehand (and update when
possible!)