Hey there!

I am currently working with very large sparse vectors and need to 
calculate the similarity between all pairs of them.

I have looked at the code available in scikit-learn and at the 
relevant literature, and came across this paper [1] and its 
implementation [2].

I was wondering whether this could be a useful addition to 
scikit-learn for very large feature files, where the pairwise similarity 
of vectors still has to be computed for various classifiers or other 
tasks. The goal would be to speed up that computation.
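
To make the problem concrete, here is a minimal sketch of a plain 
inverted-index baseline for all-pairs cosine similarity. It is not the 
optimised algorithm from [1] and not scikit-learn code. The names 
normalize and all_pairs, the dict-based vector format and the threshold 
parameter are assumptions made up for this illustration.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict: feature -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {f: w / norm for f, w in vec.items()} if norm else {}


def all_pairs(vectors, threshold):
    """Return [(i, j, sim)] for i < j with cosine similarity >= threshold."""
    index = defaultdict(list)  # feature -> [(vector id, weight)]
    results = []
    for j, raw in enumerate(vectors):
        vec = normalize(raw)
        scores = defaultdict(float)  # earlier vector id -> dot product so far
        # Only vectors sharing at least one feature with vec are ever touched.
        for f, w in vec.items():
            for i, w_i in index[f]:
                scores[i] += w * w_i
        results.extend((i, j, s) for i, s in sorted(scores.items()) if s >= threshold)
        # Index the current vector so that later vectors can match against it.
        for f, w in vec.items():
            index[f].append((j, w))
    return results


# Hypothetical toy data: vectors 0 and 1 are near-duplicates, 2 is unrelated.
docs = [
    {"a": 1.0, "b": 1.0},
    {"a": 1.0, "b": 1.0, "c": 0.1},
    {"d": 2.0},
]
pairs = all_pairs(docs, threshold=0.9)
```

On this toy input only the pair (0, 1) passes the threshold. Vector 2 
shares no feature with the others, so it is never compared at all, which 
is where the saving over a dense pairwise computation comes from.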

I am by no means an expert in this area, but I wanted to ask for 
your opinion ;)

Regards,
Philipp

[1] http://www.bayardo.org/ps/www2007.pdf
[2] http://code.google.com/p/google-all-pairs-similarity-search/

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general