As part of my PhD research on code authorship, we calculated the Truck
Factor (TF) of some popular GitHub repositories.

As you probably know, the Truck (or Bus) Factor designates the minimum
number of developers that have to be hit by a truck (or quit) before a
project is incapacitated. In our work, we consider that a system is in
trouble if more than 50% of its files become orphaned (i.e., without a main
author).
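
To make the definition concrete, here is a minimal sketch of one way to
estimate a TF. It assumes a simple greedy procedure (repeatedly remove the
developer who is main author of the most files) and a hypothetical input,
a mapping from file path to its set of main authors. The function name,
the input format, and the tie-breaking rule are illustrative assumptions,
not necessarily the exact algorithm described in the preprint.

```python
from collections import Counter


def truck_factor(file_authors, threshold=0.5):
    """Greedy TF estimate (illustrative sketch, not the preprint's code).

    file_authors: hypothetical mapping of file path -> set of main authors.
    Returns the developers removed, in order, until more than `threshold`
    of the files are orphaned (have no main author left).
    """
    # Copy the sets so the caller's data is not modified.
    remaining = {path: set(authors) for path, authors in file_authors.items()}
    total = len(remaining)
    removed = []
    while total:
        orphans = sum(1 for authors in remaining.values() if not authors)
        if orphans / total > threshold:
            break
        # Count how many files each remaining developer is a main author of.
        counts = Counter(a for authors in remaining.values() for a in authors)
        if not counts:
            break
        # Pick the developer with the most files; ties go to the
        # alphabetically first name (an arbitrary, assumed rule).
        top = max(sorted(counts), key=counts.get)
        removed.append(top)
        for authors in remaining.values():
            authors.discard(top)
    return removed


# Made-up example data, for illustration only.
example = {
    "a.py": {"alice"},
    "b.py": {"alice"},
    "c.py": {"alice", "bob"},
    "d.py": {"bob"},
    "e.py": {"carol"},
}
print(len(truck_factor(example)))  # TF of the made-up example: 2
```

In the made-up example, removing alice orphans 2 of 5 files (40%), which
is not yet more than 50%; removing bob as well orphans 4 of 5 files, so
the estimated TF is 2.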

More details on our work are available in this preprint: https://peerj.com/preprints/1233

We calculated the TF for scikit-learn and obtained a value of 7.

The developers responsible for this TF are:

Fabian Pedregosa - author of 22% of the files
Gael Varoquaux - author of 13% of the files
Andreas Mueller - author of 12% of the files
Olivier Grisel - author of 10% of the files
Lars Buitinck - author of 10% of the files
Jake Vanderplas - author of 6% of the files
Vlad Niculae - author of 5% of the files

To validate our results, we would like to ask scikit-learn developers the
following three brief questions:

(a) Do you agree that the listed developers are the main developers of
scikit-learn?

(b) Do you agree that scikit-learn will be in trouble if the listed
developers leave the project (e.g., if they win the lottery, to be less
morbid)?

(c) Does scikit-learn have some characteristics that would attenuate the
loss of the listed developers (e.g., detailed documentation)?

Thanks in advance for your collaboration,

Guilherme Avelino
PhD Student
Applied Software Engineering Group (ASERG)
UFMG, Brazil
http://aserg.labsoft.dcc.ufmg.br/

-- 
Prof. Guilherme Amaral Avelino
Universidade Federal do Piauí
Departamento de Computação