2014-03-14 19:40 GMT+01:00 Daniel Vainsencher <[email protected]>:
> Hi Olivier,
>
> Have you looked at the LSH-Forest (and to a lesser extent, Multiprobe)
> paper? I've used it in practice, and it makes a significant improvement
> vs. basic LSH. I've proposed in recent emails to the list how to
> implement it based on sorting and binary search (in order to avoid the
> dependency/maintenance burden of range-queries).
No, I have not read those papers (yet). As Maheshakya said, it might very well be the method implemented in Annoy. Do you have sample Python code around?

> I think evaluating and tweaking vanilla LSH is a waste of time, when an
> obvious improvement that fixes a significant real world flaw (control of
> number of candidates seen) in the method is available.

If the obvious improvement does not come with a significant increase in code complexity (and maintenance burden), then I agree. Getting rid of hyperparameters that have to be selected manually or via grid search is a big usability improvement and can make the method much more practical to use.

--
Olivier
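
For anyone following the thread, here is a minimal sketch of the "sorting and binary search" idea mentioned above. It is not Daniel's code and not the LSH-Forest algorithm exactly as published. It is a simplified variant that assumes random-hyperplane hashes stored as bit strings, keeps one sorted list per "tree", and shortens the matched prefix until enough candidates are found. All names (`SortedHashIndex`, `prefix_matches`, `candidates`, `min_candidates`) are hypothetical and chosen only for illustration.

```python
import bisect
import random


def hash_bits(planes, x):
    """Sign-of-random-projection hash: one '0'/'1' character per hyperplane."""
    return "".join(
        "1" if sum(p_i * x_i for p_i, x_i in zip(p, x)) >= 0 else "0"
        for p in planes
    )


class SortedHashIndex:
    """One 'tree' of the forest, stored as a sorted list of bit strings."""

    def __init__(self, points, n_bits, rng):
        dim = len(points[0])
        # Hypothetical choice: Gaussian random hyperplanes.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.entries = sorted(
            (hash_bits(self.planes, p), i) for i, p in enumerate(points)
        )
        self.keys = [key for key, _ in self.entries]

    def prefix_matches(self, query_hash, length):
        """Indices of points whose hash shares the first `length` bits."""
        prefix = query_hash[:length]
        lo = bisect.bisect_left(self.keys, prefix)
        # '2' sorts after '0' and '1', so prefix + '2' is an upper bound
        # for every key that starts with `prefix`.
        hi = bisect.bisect_left(self.keys, prefix + "2")
        return [idx for _, idx in self.entries[lo:hi]]


def candidates(indexes, query, min_candidates):
    """Shorten the prefix until at least `min_candidates` points are found."""
    hashes = [hash_bits(ix.planes, query) for ix in indexes]
    found = set()
    for length in range(len(hashes[0]), -1, -1):
        found = set()
        for ix, h in zip(indexes, hashes):
            found.update(ix.prefix_matches(h, length))
        if len(found) >= min_candidates:
            break
    return found


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
    forest = [SortedHashIndex(data, n_bits=16, rng=rng) for _ in range(3)]
    print(len(candidates(forest, data[0], min_candidates=10)))
```

The only tuning knob at query time is the number of candidates wanted, rather than a fixed hash length, which is the "control of number of candidates seen" point raised in the quoted text. The candidates would still need to be ranked by true distance afterwards.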
