Re: [Scikit-learn-general] motivation for the lib, why re-implement existing stuff

David Warde-Farley Sun, 04 Dec 2011 01:52:07 -0800

On Sun, Dec 04, 2011 at 02:49:30PM +0800, Denis Kochedykov wrote:
> Hi Olivier,
> 
> Thanks for comments!
> 
> So, summarizing, sklearn versus Orange is:
> - use plain arrays instead of classes for storing data-sets, features, etc
> - use BSD rather than GPL license
> - no framework, plain library of methods
> 
> If I got it right, it seems that creating sklearn was not a question of
> Orange's quality/usability, but more a question of a different development
> style/community.
> That is, for users who are not going to sell their software (which is not
> permitted by the GPL), there is not much difference?


The GPL does not prohibit you from selling your software. The main
stipulation is that anyone who receives the software in binary form must also
receive the source code (or a written offer to provide it), along with a copy
of the license.

*That* person is then free to redistribute the software under the terms of
the GPL, including giving it away for free. So while you are free to sell
software, those you sell it to are free to give it away, and so forth.

On top of that, the GPL is "viral": the moment you import a GPLed library or
copy and paste a snippet of GPLed code, your entire project becomes a
"derivative work" as far as the FSF is concerned (and possibly under the
copyright law of some countries; AFAIK it has never been litigated).

> Of course, convenience for developers and simplicity mean a more viable
> library in the long term.

When I last tried out Orange, it was very much a C++ library trying and
failing to masquerade as a Python library. The API was complicated and
verbose, it didn't build very easily, and it was prone to hard crashes that
brought the interpreter down in flames. I don't know if things have improved
since then (this was probably 2008ish).

I've since moved into mostly dabbling in the kinds of algorithms that neither
scikit-learn nor Orange implement, but when I do require the use of an
off-the-shelf algorithm, I greatly prefer scikit-learn's approach to APIs
because, as a seasoned NumPy user, there's very little else I need to grasp
in order to use it. I don't need to spend half a day piecing together
somebody's notions of how best to decompose a learning task into a 30-piece
C++ class hierarchy: I look up the class I'm interested in, look at the
docstring for __init__() and fit(), and I'm done.
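
To make that workflow concrete, here is a minimal sketch. The choice of
KNeighborsClassifier and the toy data are assumptions made purely for
illustration; nothing in this thread singles out that estimator. It only
shows the general pattern: data lives in plain NumPy arrays, hyperparameters
go to __init__(), and the arrays go to fit().

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Training data is a plain 2-D NumPy array (samples x features)
# plus a 1-D array of labels; no dataset wrapper classes are involved.
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])

# Hyperparameters are passed to __init__(); the data is passed to fit().
clf = KNeighborsClassifier(n_neighbors=1)
clf.fit(X, y)

# Predictions also come back as a plain NumPy array.
X_new = np.array([[0.0, 0.1], [1.0, 0.9]])
predictions = clf.predict(X_new)
print(predictions)
```

Swapping in a different estimator should mostly mean changing the import and
the constructor arguments; the fit() and predict() calls stay the same.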

David

