Emanuele, I should also note that a distinct advantage of cover trees is that, unlike ball tree, there is no need to compute the mean/median point of each node. This means that their storage can be much more compact, and they'd be very suitable to computing distances within sparse data. For that reason alone they'd be well worth pursuing. Jake
Emanuele Olivetti wrote: > On 01/04/2012 04:22 PM, Mathias Verbeke wrote: > >> Dear all, >> >> I just started working with Scikit Learn and I'm currently using the Nearest >> Neighbors >> module. In the documentation is stated that it currently only supports the >> Euclidean >> distance metric, and I was wondering if it would be easy to extend it with >> other >> distance metrics? Since it uses the scipy.sparse matrices as input, I was >> thinking about >> the distance metrics in scipy.distance.spatial. Would that be possible, or >> were there >> certain considerations to only allow for Euclidean distance? >> >> >> > > Hi, > > Recently I started to investigate cover trees [0] which are another > data structure like the ball tree or kd-tree with some interesting > properties. I stumbled upon PyCoverTree which is a pure python > implementation of cover trees which I am tweaking to make > it more usable (e.g. initially it worked only on Python >=v2.7) and > eventually faster: > > https://github.com/emanuele/PyCoverTree > > This algorithm might be of interest to you because it is like the BallTree > class but it accepts a generic distance functionm thus non-vectorial input > too. Of course it is (much) slower than the BallTree during insertion. > But if that is not a big problem, then queries are not that much slower :-) > > Time permitting I'd like to improve PyCoverTree and make it sklearn-compliant. > But do not expect too much in the short term ;-) > > Note: the current LICENSE.txt of PyCoverTree says "all rights reserved" > but I am in the process of getting an appropriate open source license > from the original authors. If there is interest I'll post news here. > > Best, > > Emanuele > > [0]: http://hunch.net/~jl/projects/cover_tree/cover_tree.html > > ------------------------------------------------------------------------------ > Ridiculously easy VDI. With Citrix VDI-in-a-Box, you don't need a complex > infrastructure or vast IT resources to deliver seamless, secure access to > virtual desktops. With this all-in-one solution, easily deploy virtual > desktops for less than the cost of PCs and save 60% on VDI infrastructure > costs. Try it free! http://p.sf.net/sfu/Citrix-VDIinabox > _______________________________________________ > Scikit-learn-general mailing list > [email protected] > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general > ------------------------------------------------------------------------------ Ridiculously easy VDI. With Citrix VDI-in-a-Box, you don't need a complex infrastructure or vast IT resources to deliver seamless, secure access to virtual desktops. With this all-in-one solution, easily deploy virtual desktops for less than the cost of PCs and save 60% on VDI infrastructure costs. Try it free! http://p.sf.net/sfu/Citrix-VDIinabox _______________________________________________ Scikit-learn-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
