On Sun, Dec 04, 2011 at 09:16:56PM +0800, Denis Kochedykov wrote:
>
> Hi David,
>
> Thanks, very good points. That is
>
> 1. C++ rather than Python (in fact this, looks like a plus for me -
> performance, universality, etc)

I agree from the perspective of universality, but beware of the trap of making speed generalizations about languages. A lot of the speed-critical parts of sklearn are quite heavily optimized in Cython. I recall that their coordinate descent (for generalized linear models) implementation compares quite favourably against a widely used and cleverly written Fortran implementation. Sounds like Brian has found the decision tree implementation to be quite speedy as well.

Suffice it to say, it's possible to write quite fast Python code (and in my experience, almost always possible to achieve C-like speeds with a dash of Cython), and it's also possible to really drop the ball and write very slow C/C++ code.

David

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
