2011/12/3 Denis Kochedykov <[email protected]>:
> Hi all,
>
> I'm looking for an ML library for Python for our research team. I found
> a quite comprehensive one - Orange - and a relatively new one -
> scikits.learn.
> Orange definitely looks good given the number of methods implemented in
> it, its maturity and its GUI as a bonus.
> But I'm a bit confused - if you guys started a new library, maybe there
> is something wrong with Orange? Why do you need to re-implement what has
> already been done, instead of using that lib as a foundation and
> concentrating on adding cool new stuff or improving what exists?

Hi Denis,

In my opinion, here are the main reasons why scikit-learn cannot reuse
Orange:

- scikit-learn is a scikit (scientific python toolkit): it is meant to
  be used by the scipy community and to play by its tacit rules: the
  primary data structure is a plain old numpy array (or a scipy.sparse
  matrix), with no machine learning specific classes for samples,
  features, datasets...

- scikit-learn only depends on projects with non-viral open source
  licenses (python, numpy, scipy and joblib are all BSD-like): hence
  scikit-learn is BSD-like as well, to play fair in this permissive
  ecosystem (being able to copy and paste any function or module of the
  scikit-learn source code anywhere else is perfectly OK)

- scikit-learn focuses on implementing machine learning with as little
  framework code as possible and lets other, framework-oriented projects
  reuse some of the scikit-learn modules if they want to do so, e.g. to
  build a data mining GUI.

Other scikit-learn contributors might have their own reasons to
contribute to scikit-learn rather than Orange.

Also, on a more trivial note, I like working on github using
pull-request based reviews as the main inter-developer communication
medium for code contributions. svn is such a pain once you have tasted a
decentralized tool like git or hg.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
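
To make the first point of the reply concrete (a plain numpy array as the
primary data structure, with no dataset or sample classes), here is a
minimal sketch. It is not scikit-learn code: the helper names
`fit_centroids` and `predict_nearest`, the nearest-centroid rule and the
toy data are all assumptions made up for illustration.

```python
import numpy as np

# Samples are rows and features are columns: a plain 2-D array, no dataset class.
X = np.array([[0.0, 0.1],
              [0.2, 0.0],
              [1.0, 0.9],
              [0.9, 1.1]])
# Labels are a plain 1-D array aligned with the rows of X.
y = np.array([0, 0, 1, 1])


def fit_centroids(X, y):
    """Hypothetical helper: return the class labels and one mean vector per class."""
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict_nearest(X_new, classes, centroids):
    """Hypothetical helper: assign each row to the class of its closest centroid."""
    # Pairwise squared Euclidean distances, shape (n_new, n_classes).
    diffs = X_new[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    distances = (diffs ** 2).sum(axis=2)
    return classes[distances.argmin(axis=1)]


classes, centroids = fit_centroids(X, y)
predictions = predict_nearest(np.array([[0.1, 0.1], [1.0, 1.0]]), classes, centroids)
```

Because the inputs and outputs are ordinary arrays, they can be passed
straight to any other numpy or scipy routine without conversion, which is
the kind of interoperability the reply describes.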
