2012/1/4 Mathias Verbeke <[email protected]>:
> Dear all,
>
> I just started working with Scikit Learn and I'm currently using the Nearest
> Neighbors module. The documentation states that it currently only
> supports the Euclidean distance metric, and I was wondering if it would be
> easy to extend it with other distance metrics? Since it uses the
> scipy.sparse matrices as input, I was thinking about the distance metrics in
> scipy.spatial.distance.

scipy.spatial.distance does not work on scipy.sparse matrices, only on
NumPy arrays, AFAIK. The kNN classifier only supports sparse matrices in
the "bruteforce" mode, as BallTree and kd-tree do not work with
scipy.sparse matrices either.

> Would that be possible, or were there certain
> considerations to only allow for Euclidean distance?

It would indeed be great to make this pluggable. This should be quite easy
for the brute force mode. For the ball tree mode, it will require diving
into the Cython code and reading the reference paper to check whether any
assumption about the metric is used (or just ask Jake :).

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
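
To illustrate why the brute force mode is the easy case: it only needs a way to compute the distance between the query and every training point, so the metric can be passed in as a plain callable. The following is a minimal pure-Python sketch under that assumption, not the scikit-learn implementation. The names `brute_force_knn_predict`, `euclidean` and `manhattan` are hypothetical, and the data are dense lists rather than scipy.sparse matrices.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def brute_force_knn_predict(X_train, y_train, x, k=3, metric=euclidean):
    """Classify x by majority vote among its k nearest training points.

    Hypothetical helper, not the scikit-learn API: `metric` is any callable
    that takes two vectors and returns a non-negative number.
    """
    # Brute force: compute the distance from x to every training point.
    distances = [(metric(row, x), label) for row, label in zip(X_train, y_train)]
    # Keep the k closest points; sort on distance only, so labels need not be orderable.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of the k nearest neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
y_train = ["a", "a", "a", "b", "b", "b"]

label_l2 = brute_force_knn_predict(X_train, y_train, [0.5, 0.5], k=3)
label_l1 = brute_force_knn_predict(X_train, y_train, [5.5, 5.5], k=3, metric=manhattan)
```

Swapping the metric here is a one-argument change because nothing else in the search depends on it. A tree-based index is different: its pruning typically relies on properties of the metric such as the triangle inequality, which is why the ball tree code would need to be checked before accepting arbitrary distance functions.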
