Re: [Scikit-learn-general] motivation for the lib, why re-implement existing stuff

Denis Kochedykov Sun, 04 Dec 2011 04:58:42 -0800

Hi Brian,

Thanks, all of these points are quite important for me (and for most users, I think).
The performance problems are surprising, considering Orange is mainly written in C++.


Denis.

On 04.12.2011 16:57, [email protected] wrote:
> Hi Denis,
>
> My main motivation is usability. In terms of development, though,
> I've only really worked on decision trees, so my comments are heavily
> influenced by that experience.
> Here are the three main reasons why I use scikit-learn:
>
> Simplicity (taking the cue from Olivier). If you've seen how difficult it is
> to convert your dataset into Orange's format, you will appreciate any package
> that operates directly on numpy arrays.
>
> Speed. The decision tree implementation of Orange takes about 25 seconds to 
> train on the Madelon dataset, whereas the optimised version of scikit-learn 
> takes well under a second. I can't really comment on other algorithms though.
>
> Readability. Algorithms implemented in scikit-learn are meant to be easily 
> understood, to the point where anyone with enough knowledge of the algorithm 
> should be able to go in and make changes if they wish. I like to think of it 
> as executable pseudocode.
>
> These are the main reasons why I use it, but the other ones mentioned
> (distributed code, licensing) are important too.
>
> Regards
> Brian

