Dear SFC reviewers,
Here is the application for the scikit-learn project:
*** Why does your project want to join Conservancy? Specifically, what
benefits do you expect to take advantage of immediately and within a
few years?
The project wants to join the Conservancy for legal and financial backing.
Specifically, the lack of clearly established financial references has made
sponsoring of the project difficult. In the long run, being a member of the
Conservancy also guarantees more independence from funding sources, so that
those will not try to establish themselves as a "governing" organisation.
*** Please give a detailed description of the project.
Scikit-learn is a Python module integrating a wide range of state-of-the-art
machine learning algorithms for medium-scale problems (very simply put,
machine learning is a subfield of artificial intelligence concerned with the
creation of programs that can "learn" from data). This package focuses on
bringing machine learning to non-specialists using a general-purpose,
high-level programming language.
*** What FLOSS License(s) does your project use? Please include the
primary license, and list other licenses for code that is included.
(e.g., "The project as a whole is GPLv3-or-later, but about a dozen
files in the directory src/external/ are under the Apache-2.0
license")
The project is distributed under the terms of the Revised (3-clause) BSD
license. In addition, external data sources from statlib and the UCI
machine learning repository were integrated in sklearn/datasets/data.
These files are in the public domain.
*** Please give us your roadmap and plans for future development of the
project, including both code and community plans.
Code will continue to grow as we focus on better core machine learning
techniques: integrating more standard machine learning algorithms, and
better model selection strategies. Community-wise, the project has
completely outgrown its original contributors and is now actively driven
by a dozen contributors from different institutions, who have commit
rights and make the releases. The project will thus stick to a
community-driven governance model.
*** Please give us the main link to the project's primary website.
http://scikit-learn.org
*** Please give us a URL to a code repository we can clone and/or
checkout.
https://github.com/scikit-learn/scikit-learn
*** Have you ever had funds held by the project, or by any individual on
behalf of the project? How and for what did you spend those funds?
Are there funds remaining? If so, who is holding them now?
Funds were held by INRIA on behalf of the project. They were used for
organizing a code sprint (travel expenses) in Granada, after a Machine
Learning conference (NIPS) in Dec. 2011.
*** Do you have any ongoing fundraising programs for your project? How do
they operate, and how much funding is brought in through these
mechanisms currently?
We have no ongoing fundraising, but occasionally we consider raising
funds for travel on the order of a thousand euros.
*** Does your project owe funds to anyone?
No.
*** Has your project ever had legal trouble, been involved in legal
proceedings or received a letter accusing your project of patent,
copyright, trademark or other types of infringement?
No.
*** Please give a brief history of the project, focusing on how the
community developed and the general health of the community. Be sure
to include information on any forks or other disputes that have
occurred in the community.
This project was started in 2007 as a Google Summer of Code project by David
Cournapeau. Later that year, Matthieu Brucher started work on this project as
part of his PhD thesis.
In 2009 Fabian Pedregosa, Gael Varoquaux, Alexandre Gramfort and Vincent
Michel of INRIA took leadership of the project and made the first public
release on February 1st 2010. Importantly, INRIA paid a full-time
junior engineer for 2 years on the project (2009-2011). Since then,
releases have been made on a ~3 month cycle, and a thriving
international community has been leading the development. In two years,
more than 70 developers have contributed to scikit-learn, each release
having more than 20 contributors. INRIA has allocated another 2 years of
junior engineer funding for 2012-2014.
In 2011 and 2012, several Google Summer of Code projects were funded (1
in 2011, 3 in 2012).
*** Please explain how your project is governed. Who makes the decisions
in the project? How do you resolve disputes, particularly about
non-code issues?
The project is community-driven: all patches are reviewed through the GitHub
Pull Request interface, and all major decisions are discussed on the mailing
list. A large group of contributors has commit rights. When
disputes arise, there is a strong attempt to resolve them by consensus.
If no consensus can be reached, the view of the most senior contributor
involved prevails. So far, this situation has not really occurred, as
consensus has always been reached via compromises or voting on the
mailing list.
*** If your project runs on Linux-based systems, please list all the
distributions that include your project, and what "repository area"
the package appears in. If you aren't packaged for any major
distributions, please tell us why you believe your project hasn't been
packaged yet.
- Debian, main
- Ubuntu, main
- Mandriva, contrib
- ArchLinux
*** Does your project have any existing for-profit or non-profit
affiliations, funding relationships, or other agreements between the
project and/or key leaders of your project and other organizations?
Has the project had such affiliations in the past? Please list all
of them in detail and explain their nature. Even tangential
affiliations and relationships, or potential affiliations that you
plan to create should be included.
- INRIA is funding a full-time developer on scikit-learn.
- Google has funded 1 GSoC project in 2011, and is funding 3 GSoC projects in
2012.
*** Approximately how many users does your project have, and what items
lead you to believe your userbase is of a particular size (e.g., post
counts to your user mailing list)?
The user count is hard to establish. We believe that a lower bound is
a few thousand active users. We have 513 followers on GitHub.
The Ubuntu popularity contest shows that one out of 3000 Ubuntu installs
has scikit-learn. There are between 250 and 500 emails per month on the
mailing list. Scikit-learn is included in the Enthought Python
Distribution, which has tens of thousands of users in research labs,
industry and academia.
*** Please list the names, email addresses, and affiliations (e.g.,
employer) of key developers and major contributors. Include both
current and past contributors and developers. Please include date
ranges of when those developers/contributors were active.
Please make this list as extensive and complete as possible. You need
not include every last person who sent one patch, but please include
at least those who regularly sent patches or were/are regular
contributors. If your project has contributors who have been inactive
for more than five years, you need only to list such inactive
contributors if they made substantial contributions.
Alexandre Gramfort <[email protected]>, INRIA, 2009 - Now
Alexandre Passos <[email protected]>
Andreas Mueller <[email protected]>, University of Bonn, 2010 - Now
Bertrand Thirion <[email protected]>, INRIA, 2009 - Now
Brian Holt <[email protected]>
Clay Woolam <[email protected]>
Conrad Lee <[email protected]>
Dan Yamins <[email protected]>
David Cournapeau <[email protected]>, Enthought, 2007 - 2008
David Warde-Farley <[email protected]>, University of
Montreal, 2011 - Now
Edouard Duchesnay <[email protected]>, CEA, 2009 - 2010
Fabian Pedregosa <[email protected]>, INRIA, 2009 - Now
Gael Varoquaux <[email protected]>, INRIA, 2009 - Now
Gilles Louppe <[email protected]>, University of Liège, 2011 - Now
Jake Vanderplas <[email protected]>, University of Washington
James Bergstra <[email protected]>, MIT, 2010 - Now
Jaques Grobler <[email protected]>, INRIA, 2012 - Now
Jean Kossaifi <[email protected]>, Student, 2011
Kenneth C. Arnold <[email protected]>
Lars Buitinck <[email protected]>, University of Amsterdam, 2011 - Now
Mathieu Blondel <[email protected]>, Kobe University, 2010 - Now
Matthieu Brucher <[email protected]>, 2008 - 2010
Matthieu Perrot <[email protected]>
Nelle Varoquaux <[email protected]>, Mines ParisTech, 2011 - Now
Nicolas Pinto <[email protected]>
Olivier Grisel <[email protected]>, Nuxeo, 2010 - Now
Paolo Losi <[email protected]>
Peter Prettenhofer <[email protected]>
Pietro Berkes <[email protected]>
Robert Layton <[email protected]>
Ron Weiss <[email protected]>, Google
Satrajit Ghosh <[email protected]>, MIT, 2011 - Now
Shiqiao Du <[email protected]>
Thouis (Ray) Jones <[email protected]>, Institut Curie, 2011 - Now
Vincent Dubourg <[email protected]>
Vincent Michel <[email protected]>, Logilab, 2009 - Now
Virgile Fritsch <[email protected]>, INRIA, 2010 - Now
Vlad Niculae <[email protected]>, University of Bucharest, 2011 - Now
Xinfan Meng <[email protected]>
Yaroslav Halchenko <[email protected]>
*** Please include any other pertinent information not given above that
you feel we should review with your application.
A paper about scikit-learn was published in the peer-reviewed Journal of
Machine Learning Research:
F. Pedregosa et al. (2011). Scikit-learn: machine learning in Python. JMLR
12:2825-2830. http://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html
For reference here is an online copy of this application:
https://raw.github.com/scikit-learn/administrative/master/software_freedom_conservancy/application.txt
Regards,
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel