On Tue, Jul 30, 2013 at 10:07 AM, Olivier Grisel
<[email protected]> wrote:

> According to your tests, sklearn's KDTree often seems faster at test time,
> which is the most important IMHO.
>
> Also, BallTree is mostly interesting because it lets you plug in custom
> metrics (not only the axis-aligned ones that KD-tree imposes).
>
> Furthermore, sklearn models are fast to pickle / unpickle.
>

Additionally, the results of the benchmarks will be highly dependent on the
structure of the data.  Data with a low intrinsic dimensionality is usually
handled better by Ball Tree, while dense data (even if it is in blobs) will
not be handled especially well by any method.  I delve into some of these
details in a blog post I wrote a while ago:
http://jakevdp.github.io/blog/2013/04/29/benchmarking-nearest-neighbor-searches-in-python/
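
As a rough illustration of this data dependence, here is a minimal sketch
that times KDTree and BallTree queries on two synthetic datasets: points
spread uniformly through a 10-dimensional cube, and points lying on a
2-dimensional plane embedded in 10 dimensions.  The helper names
(make_uniform, make_low_intrinsic, time_query), the sizes, and leaf_size=30
are assumptions chosen for illustration, not taken from the benchmarks
discussed in this thread, and the timings will vary with machine and
library version.

```python
import time

import numpy as np
from sklearn.neighbors import BallTree, KDTree


def make_uniform(n, d, rng):
    """Dense data: points spread uniformly through the d-dimensional unit cube."""
    return rng.random((n, d))


def make_low_intrinsic(n, d, rng, intrinsic_dim=2):
    """Points on a random intrinsic_dim-dimensional plane embedded in d dimensions."""
    latent = rng.random((n, intrinsic_dim))
    basis = rng.standard_normal((intrinsic_dim, d))
    return latent @ basis


def time_query(tree_cls, X, k=5):
    """Build a tree on X, then time a k-neighbors query of X against itself."""
    tree = tree_cls(X, leaf_size=30)
    t0 = time.perf_counter()
    dist, ind = tree.query(X, k=k)
    return time.perf_counter() - t0, dist, ind


# Hypothetical sizes, kept small so the sketch runs quickly.
rng = np.random.default_rng(0)
datasets = {
    "uniform": make_uniform(2000, 10, rng),
    "low intrinsic dim": make_low_intrinsic(2000, 10, rng),
}
for name, X in datasets.items():
    for tree_cls in (KDTree, BallTree):
        elapsed, _, _ = time_query(tree_cls, X)
        print(f"{name:>18}  {tree_cls.__name__:>8}  query: {elapsed:.4f}s")
```

A single run like this is only indicative; repeating the timings and
varying the number of points and dimensions gives a fairer picture.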

cKDTree builds faster because it uses a less sophisticated strategy: the
bounds of subnodes are fixed at the bounds of parent nodes.  This means
fewer calculations at build time, but for realistic structured data (though
not for many generated test sets) the slight additional build cost of the
more careful strategy leads to much faster queries.
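
To see the build/query trade-off on your own machine, a minimal sketch
along these lines times the two phases separately for scipy's cKDTree and
sklearn's KDTree.  The clustered test data, the timed helper, and the leaf
size of 30 are assumptions made for illustration.  The internals of either
library may differ from what is described here depending on the installed
version, so treat the numbers as a measurement, not as confirmation of the
explanation above.

```python
import time

import numpy as np
from scipy.spatial import cKDTree
from sklearn.neighbors import KDTree


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed seconds)."""
    t0 = time.perf_counter()
    out = fn(*args, **kwargs)
    return out, time.perf_counter() - t0


# Hypothetical structured data: 5000 points in 20 tight 3-D clusters.
rng = np.random.default_rng(0)
centers = rng.random((20, 3))
labels = rng.integers(0, 20, size=5000)
X = centers[labels] + 0.01 * rng.standard_normal((5000, 3))

# Build phase (note the different spelling of the leaf-size argument).
sk_tree, sk_build = timed(KDTree, X, leaf_size=30)
sp_tree, sp_build = timed(cKDTree, X, leafsize=30)

# Query phase: 5 nearest neighbors of every point.
(sk_dist, sk_ind), sk_query = timed(sk_tree.query, X, k=5)
(sp_dist, sp_ind), sp_query = timed(sp_tree.query, X, k=5)

print(f"sklearn KDTree  build: {sk_build:.4f}s  query: {sk_query:.4f}s")
print(f"scipy cKDTree   build: {sp_build:.4f}s  query: {sp_query:.4f}s")
```

Both trees should return the same neighbor distances, so any difference
between them is in speed only.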

Short answer: the speed of a nearest-neighbor search depends on lots of
little details about the data, including dimensionality, size, structure,
etc.  No one algorithm will be better than all others in all situations.
    Jake



>
> --
> Olivier
>
>
> ------------------------------------------------------------------------------
> Get your SQL database under version control now!
> Version control is standard for application code, but databases havent
> caught up. So what steps can you take to put your SQL databases under
> version control? Why should you start doing it? Read more to find out.
> http://pubads.g.doubleclick.net/gampad/clk?id=49501711&iu=/4140/ostg.clktrk
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
