Hi Daniel.
This hasn't been brought up before so there is no "official position".
I am generally in favor, though I'm not sure how doable it is.
We are generally pretty generous in accepting all kinds of inputs, and
many of our options can take several types: (None, int, float, string,
ndarray) is a relatively common set of accepted types for a single option.
As we still support Python 2.6, we would need to use type comments or
external stub files rather than inline annotations.
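
To make that concrete, here is a rough sketch of what a comment-based annotation for such an option could look like. The function `resolve_max_features` and the alias `MaxFeatures` are made up for illustration (they are not scikit-learn API), and a plain list stands in for an nd-array so the snippet runs without numpy. The `# type:` comment is the PEP 484 form that keeps the signature itself free of Python 3-only syntax.

```python
from typing import List, Optional, Union

# Hypothetical alias for an option that accepts several kinds of input.
# A list of floats stands in for an nd-array here.
MaxFeatures = Optional[Union[int, float, str, List[float]]]


def resolve_max_features(max_features, n_features):
    # type: (MaxFeatures, int) -> int
    """Turn a flexible option value into a concrete feature count."""
    if max_features is None:
        return n_features
    if isinstance(max_features, str):
        if max_features == "sqrt":
            return max(1, int(n_features ** 0.5))
        raise ValueError("unknown option: %r" % max_features)
    if isinstance(max_features, float):
        # A float is read as a fraction of the available features.
        return max(1, int(max_features * n_features))
    if isinstance(max_features, int):
        return max_features
    # List stand-in for an array: count the non-zero per-feature weights.
    return sum(1 for w in max_features if w != 0)
```

A checker would read the comment just like an inline annotation, so a call such as resolve_max_features({}, 10) would be flagged without running anything.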
As a user, you are probably most interested in the outputs, right? The
types returned by scikit-learn could probably be auto-generated.
I'm curious to see what others think.
I'd be surprised if anyone is willing to invest a large amount of time
on this, though if you guys want to contribute,
we might be able to work something out.
Andy
On 07/27/2016 03:17 PM, Daniel Moisset wrote:
Hi,
[If you're also on the numpy mailing list and get a similar version of
the message, I apologise for that]
I work at Machinalis, where we use a lot of scikit-learn (and the pydata
stack in general). Recently we've also been getting involved with
mypy, which is a tool to type check (not at runtime, think of it as a
linter) annotated Python code (the way of annotating Python types has
been recently standardized in PEP 484).
As part of that involvement we've started creating type annotations
for the Python libraries we use most, which include both numpy and
scikit-learn. Mypy provides a way to specify types with annotations in
separate files in case you don't have control over a library, so we
have created an initial proof of concept for numpy at [1], and we are
actively improving it. You can find some additional information about
it and some problems we've found on the way at this blogpost [2]. We
were planning to also start some work on scikit-learn (which has a
much larger surface area than numpy, so probably focusing on small
parts for now); we had to start with numpy anyway given that SKL
depends on it.
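
As a rough illustration of the separate-file approach described above, a stub holds signatures only, with "..." in place of every body. The class `ExampleClassifier`, its parameters, and the `Matrix` alias below are hypothetical and are not taken from scikit-learn or from the repository at [1]; a real stub would live in a `.pyi` file next to (or apart from) the library.

```python
from typing import List, Optional, Sequence

# Stand-in for a 2-D array type; a real stub would refer to an array class.
Matrix = Sequence[Sequence[float]]


class ExampleClassifier:
    # Signatures only: "..." replaces both default values and bodies.
    def __init__(self, n_neighbors: int = ..., weights: Optional[str] = ...) -> None: ...
    def fit(self, X: Matrix, y: Sequence[int]) -> "ExampleClassifier": ...
    def predict(self, X: Matrix) -> List[int]: ...
```

The library source stays untouched; the checker picks up the types from the stub alone.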
What I wanted to ask is if the people involved in the SKL project are
aware of PEP 484 annotations and if you have some interest in starting
to use them. The main benefit is that annotations serve as clear (and
automatically testable) documentation for users; secondary
benefits are that users discover bugs more quickly and that some IDEs
(like PyCharm) are starting to use this information for smart editor
features (autocompletion, online checking, refactoring tools);
eventually tools like Jupyter could take advantage of these
annotations too. And the cost of writing and including them
is relatively low.
We're doing the work anyway, but contributing our typespecs back could
make it easier for users to benefit from this, and for us to maintain
it and keep it in sync with future releases.
If you've never heard about PEP 484 or mypy (it happens a lot) I'll be
happy to clarify anything about it that might help you understand this
situation.
Thanks!
D.
[1] https://github.com/machinalis/mypy-data
[2] http://www.machinalis.com/blog/writing-type-stubs-for-numpy/
--
Daniel F. Moisset - UK Country Manager
www.machinalis.com <http://www.machinalis.com>
Skype: @dmoisset
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn