2012/1/23 Alexandre Gramfort <[email protected]>:
> I am not sure it is what you want but you could use:
>
> K = radius_neighbors_graph(X, radius, mode='distance')
> K.data **= 2
> K.data *= -gamma
> np.exp(K.data, out=K.data)
>
> no?
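For reference, here is a self-contained version of the quoted snippet. It is only a sketch: the wrapper name `sparse_rbf_affinity` is made up for illustration, and it assumes `radius_neighbors_graph` is imported from `sklearn.neighbors` and that `X` is a dense float array.

```python
import numpy as np
from sklearn.neighbors import radius_neighbors_graph


def sparse_rbf_affinity(X, radius, gamma):
    """Hypothetical helper: RBF affinities truncated to pairs within `radius`."""
    # Sparse matrix whose stored entries are the Euclidean distances
    # between samples that lie within `radius` of each other.
    K = radius_neighbors_graph(X, radius, mode='distance')
    # Turn each stored distance d into exp(-gamma * d ** 2), in place.
    K.data **= 2
    K.data *= -gamma
    np.exp(K.data, out=K.data)
    return K
```

Only the stored entries are transformed, so pairs farther apart than `radius` keep an implicit affinity of 0 and the matrix stays sparse.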
+1 for the dense case.

But ball tree does not work for high-dimensional sparse data. We would also need some truncated kernels (e.g. cosine similarity for positive data, or RBF in the general case), probably implemented in Cython, for the high-dimensional sparse case, where the dense output of shape (n_samples, n_neighbors) is preallocated in advance (and assumed to fit in memory, while a dense array of shape (n_samples, n_samples) or (n_samples, n_features) would not).

That would be very useful to make SpectralClustering work on text data. It should also help with the "over-convergence" issues I observe on the power iteration clustering branch when n_samples is too big.

Using LSH (or some variant of random projection) might indeed be interesting to quickly compute the approximate nearest neighbors graph of high-dimensional sparse data (but I think a Cython version for the exact truncated case would still be useful, at least as a control reference for the approximate case).

BTW, I am making some progress on the Random Projection branch: I have started integrating murmurhash to simulate random projection by a sparse matrix that is never materialized in memory. The example looks good too. It still needs some work on the hashing part and on the narrative doc.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
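To make the truncated-kernel idea for high-dimensional sparse data concrete, here is a rough NumPy/SciPy sketch of an exact top-k cosine similarity. It is a stand-in for illustration, not the Cython implementation discussed above: the function name `truncated_cosine_knn` and the `chunk_size` parameter are hypothetical, and it assumes `n_neighbors < n_samples` and that a dense block of shape (chunk_size, n_samples) fits in memory.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.preprocessing import normalize


def truncated_cosine_knn(X, n_neighbors, chunk_size=256):
    """Hypothetical helper: exact top-k cosine similarities for sparse rows."""
    # L2-normalize the rows so that dot products are cosine similarities.
    X = normalize(sp.csr_matrix(X, dtype=np.float64), norm='l2')
    n_samples = X.shape[0]
    # Dense outputs of shape (n_samples, n_neighbors), preallocated up front.
    sims = np.empty((n_samples, n_neighbors))
    idx = np.empty((n_samples, n_neighbors), dtype=np.intp)
    Xt = X.T.tocsr()
    for start in range(0, n_samples, chunk_size):
        stop = min(start + chunk_size, n_samples)
        # Densify only one block of rows at a time.
        block = (X[start:stop] @ Xt).toarray()
        # Exclude each sample from its own neighbor list.
        block[np.arange(stop - start), np.arange(start, stop)] = -np.inf
        # Unordered top-k per row, then sort those k by decreasing similarity.
        top = np.argpartition(-block, n_neighbors - 1, axis=1)[:, :n_neighbors]
        top_sims = np.take_along_axis(block, top, axis=1)
        order = np.argsort(-top_sims, axis=1)
        idx[start:stop] = np.take_along_axis(top, order, axis=1)
        sims[start:stop] = np.take_along_axis(top_sims, order, axis=1)
    return sims, idx
```

The returned `sims` and `idx` arrays could serve as the exact control reference when evaluating an approximate method such as LSH.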
