Hey there!

I am currently working with very large sparse vectors and need to compute the similarity between all pairs of them.
I have looked at the code available in scikit-learn and at the related literature, and came across this paper [1] and its implementation [2]. I was wondering whether this approach could be a useful addition to scikit-learn for working with very large feature files, where the pairwise similarity of vectors still has to be computed for various classifiers or other tasks. The goal would be to speed up that computation.

I am by no means an expert on this, but I wanted to ask for your opinion ;)

Regards,
Philipp

[1] http://www.bayardo.org/ps/www2007.pdf
[2] http://code.google.com/p/google-all-pairs-similarity-search/
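
P.S. To make the question a bit more concrete, here is a minimal sketch of the kind of computation I mean: thresholded all-pairs cosine similarity over sparse vectors, using an inverted index so that only vectors sharing a non-zero feature are ever compared. This is only an illustration under my own assumptions. The names `normalize` and `all_pairs_cosine` and the dict-based vector format are made up for this example; it is neither scikit-learn API nor the code from [2], and it leaves out the pruning optimisations that the paper [1] is actually about.

```python
from collections import defaultdict
from math import sqrt


def normalize(vec):
    """Scale a sparse vector (dict: feature -> weight) to unit length."""
    norm = sqrt(sum(w * w for w in vec.values()))
    return {f: w / norm for f, w in vec.items()} if norm > 0 else {}


def all_pairs_cosine(vectors, threshold):
    """Return {(i, j): cosine} for all i < j with cosine >= threshold.

    The inverted index (feature -> list of (vector id, weight)) is built
    incrementally, so each vector is only scored against earlier vectors
    that share at least one non-zero feature with it.
    """
    index = defaultdict(list)
    result = {}
    for j, raw in enumerate(vectors):
        vec = normalize(raw)
        # Accumulate partial dot products with earlier vectors.
        scores = defaultdict(float)
        for f, w in vec.items():
            for i, w_i in index[f]:
                scores[i] += w * w_i
        # Keep only the pairs that reach the similarity threshold.
        for i, s in scores.items():
            if s >= threshold:
                result[(i, j)] = s
        # Add the current vector to the index for later vectors.
        for f, w in vec.items():
            index[f].append((j, w))
    return result


# Hypothetical toy data: four sparse vectors as feature -> weight dicts.
docs = [
    {"a": 1.0, "b": 1.0},
    {"a": 1.0, "b": 1.0, "c": 0.1},
    {"x": 2.0, "y": 1.0},
    {"b": 1.0, "y": 3.0},
]
pairs = all_pairs_cosine(docs, threshold=0.5)
```

With a threshold of 0.5, only vectors 0 and 1 should come out as a similar pair here. For dense-enough data this still degrades towards the quadratic brute-force case, which is exactly where I would hope the ideas from [1] could help.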