2012/1/4 Mathias Verbeke <[email protected]>:
> Dear all,
>
> I just started working with Scikit Learn and I'm currently using the Nearest
> Neighbors module. The documentation states that it currently only
> supports the Euclidean distance metric, and I was wondering if it would be
> easy to extend it with other distance metrics? Since it uses
> scipy.sparse matrices as input, I was thinking about the distance metrics in
> scipy.spatial.distance.

AFAIK scipy.spatial.distance does not work on scipy.sparse matrices, only
on numpy arrays. The kNN classifier accepts sparse matrices only in the
"bruteforce" mode, since BallTree and kd-tree do not work with
scipy.sparse matrices either.

> Would that be possible, or were there certain
> considerations to only allow for Euclidean distance?

It would indeed be great to make this pluggable. This should be quite
easy for the brute force mode. For the ball tree mode it will
require diving into the Cython code and reading the reference paper to
check whether any assumption on the metric is made (or just
ask Jake :).
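
To illustrate why the brute force case should be easy, here is a minimal
sketch in plain Python (not the scikit-learn implementation). The names
brute_force_kneighbors, euclidean and manhattan are hypothetical and exist
only for this example; the point is that the search loop never looks inside
the metric, so any callable taking two vectors can be swapped in.

```python
import heapq
import math


def euclidean(a, b):
    # Standard L2 distance between two equal-length sequences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # L1 distance, used here only as an example of an alternative metric.
    return sum(abs(x - y) for x, y in zip(a, b))


def brute_force_kneighbors(X, query, k, metric=euclidean):
    """Return the k (distance, row_index) pairs of X closest to query.

    Hypothetical helper: it computes the distance to every row, so it makes
    no assumption about the metric beyond "smaller means closer".
    """
    scored = ((metric(row, query), i) for i, row in enumerate(X))
    return heapq.nsmallest(k, scored)


X = [[3.0, 0.0], [2.0, 2.0], [5.0, 5.0]]
query = [0.0, 0.0]

# The neighbor ordering depends on the metric that is plugged in.
print(brute_force_kneighbors(X, query, k=2))
print(brute_force_kneighbors(X, query, k=2, metric=manhattan))
```

With these toy points the two metrics rank the first two rows in opposite
order. A tree-based index is a different story, since it prunes branches
using properties of the metric, which is why the ball tree code would need
to be checked first.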

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
