On 02/02/2013 01:53 PM, Radim Rehurek wrote:
Hello scikitters,

I received a pull request with some modified scikit-learn code inside: https://github.com/piskvorky/gensim/blob/a73c84e21aecd3cc77ba2d752912f73b712bc60a/gensim/models/selectkbest.py

Since I'm not familiar with the scikit-learn code base, can someone please tell me whether these changes are worth being integrated into scikit-learn (instead of gensim)? I'm not very thrilled about maintaining code forked from another project. I'd rather link to scikit-learn "dynamically", by a normal import, than this.

Cheers,
Radim


Hi Radim.
Thanks for reaching out.
It would definitely be better from a maintenance point of view for both projects if the code weren't duplicated. I didn't know the memory usage of f_classif was a problem.
I don't think this code has been modified in a while.
If the modified code is as fast and gives equivalent results, we could certainly use the more memory-efficient version. Looking at the function, it seems it might be even better to implement it in Cython if time is an issue.

I am not so familiar with this particular function but we definitely want to have scalable implementations.
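
To make the memory question concrete, here is a rough, illustrative sketch of the statistic that f_classif reports (the one-way ANOVA F-value per feature), computed in a single pass from per-class running sums so that only a few small accumulators are held in memory. This is not the scikit-learn code and not the code from the gensim pull request; the function name anova_f_scores and its arguments are made up for this example, and the sum-of-squares formulation used here can lose precision on badly scaled data.

```python
from collections import defaultdict


def anova_f_scores(rows, labels):
    """One-way ANOVA F statistic per feature (hypothetical helper).

    rows   -- iterable of equal-length sequences of numbers, one per sample
    labels -- iterable of hashable class labels, aligned with rows
    """
    counts = defaultdict(int)   # class label -> number of samples
    class_sums = {}             # class label -> per-feature sum
    sq_total = None             # per-feature sum of squares over all samples
    n_features = None

    # Single pass: only the accumulators are kept, never the full data.
    for row, label in zip(rows, labels):
        if n_features is None:
            n_features = len(row)
            sq_total = [0.0] * n_features
        if label not in class_sums:
            class_sums[label] = [0.0] * n_features
        counts[label] += 1
        sums = class_sums[label]
        for j, x in enumerate(row):
            sums[j] += x
            sq_total[j] += x * x

    n_samples = sum(counts.values())
    n_classes = len(counts)
    if n_classes < 2 or n_samples <= n_classes:
        raise ValueError("need at least two classes and more samples than classes")

    scores = []
    for j in range(n_features):
        total = sum(class_sums[c][j] for c in class_sums)
        # Sum over classes of (class sum)^2 / (class size).
        per_class = sum(class_sums[c][j] ** 2 / counts[c] for c in class_sums)
        ss_between = per_class - total ** 2 / n_samples
        ss_within = sq_total[j] - per_class
        ms_between = ss_between / (n_classes - 1)
        ms_within = ss_within / (n_samples - n_classes)
        # Zero (or rounding-negative) within-class variance is reported as inf.
        scores.append(ms_between / ms_within if ms_within > 0 else float("inf"))
    return scores
```

Because rows and labels are consumed as plain iterables, they could be generators that stream samples from disk; whether that matches the memory problem in the pull request is an assumption on my part.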

Cheers,
Andy
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
