I certainly see the benefit, and think we would benefit also from finding test coverage holes wrt input type.
But I think without ndarray/sparse matrix type support, we're not going to be able to annotate most of our code in sufficient detail.

On 2 August 2016 at 23:34, Daniel Moisset <dmois...@machinalis.com> wrote:

> A couple of things I forgot to mention:
>
> * One relevant consequence is that, to add annotations on the code,
> scikit-learn should depend on the "typing"[1] module, which contains some of
> the basic names imported and used in annotations. It's a stdlib module in
> python 3.5, but the PyPI package backports it to python 2.7 and newer (I'm
> not sure how it works with Python 2.6, which might be an issue)
> * As an example of the kind of bugs that mypy can find, someone here
> already found a documentation bug in the sklearn.svm.SVC() initializer; the
> "kernel" parameter is described as "string"[2], when it's actually a
> "string or callable" (which can be read in the "small print" description of
> the argument). That kind of slip would be automatically prevented if
> declared as an annotation with mypy on the CI. Also, the signature of the
> callable would be clear directly, instead of having to look up
> additional documentation on kernel functions or dig into the source
>
> [1] https://pypi.python.org/pypi/typing
> [2] http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
>
> On Mon, Aug 1, 2016 at 5:15 PM, Daniel Moisset <dmois...@machinalis.com> wrote:
>
>> On Fri, Jul 29, 2016 at 8:57 PM, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:
>>
>>> Can you summarize once again in very simple terms what would be the big
>>> benefits?
>>
>> Benefits for regular scikit-learn users
>>
>> 1. Reliable information on method signatures in a standardized way
>> ("reliable" in the sense of "automatically verified")
>> 2. Better integration with tools supporting PEP-484 (editors,
>> documentation tools).
>> This is a small set now, but I expect it to grow (and
>> it's also a chicken-and-egg problem; support has to start somewhere)
>>
>> Benefits for scikit-learn users also using mypy and/or PEP-484 (probably
>> not a large set, but I know a few people :) )
>>
>> 0. Same as the rest of the users
>> 1. Early detection of errors in own code while writing code based on SKL
>> 2. Making own code more readable/explicit by annotating functions that
>> receive/return SKL types (and verifying those annotations)
>>
>> Benefits for scikit-learn developers
>>
>> 1. Some extra checks that changes keep internal consistency
>> 2. (Future) possible simplification of typing information in docstrings,
>> which would become redundant (this would require updating doc
>> generators)
>>
>> Regarding the cost for contributing, a scenario where you get a CI error
>> due to mypy would be because:
>>
>> * the change in the code somehow changed the existing accepted/returned
>> types, which is a change in the API and should actually be verified
>> * the change in the code extended the signature of an existing function
>> (what Andreas mentioned); in this situation it's similar to a PR that adds
>> an argument and doesn't update the docstring (only that this is
>> automatically caught).
>>
>> WRT the second issue, the error here might be confusing when using the
>> "one line" syntax because arguments may "misalign" with their signatures.
>> The multiline version (or the python3-only form) is safer in that sense (in
>> fact, adding an argument there will not produce a CI problem because it's
>> unannotated and assumed to be "any type").
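To illustrate the point above about the comment-based syntaxes: PEP 484 allows a "one line" signature comment, a per-argument (multiline) comment form, and the Python 3 only annotation form. A minimal sketch follows; the function names and the List-based types are made up for illustration and are not scikit-learn code.

```python
from typing import List, Optional


# "One line" comment syntax: every argument type sits in one comment and is
# matched to the arguments purely by position. Adding an argument without
# updating the comment makes the two lists misalign.
def fit_one_line(X, y, sample_weight=None):
    # type: (List[List[float]], List[int], Optional[List[float]]) -> None
    pass


# Multiline comment syntax: each argument carries its own comment, so a new
# unannotated argument does not shift the types of the others.
def fit_multiline(X,  # type: List[List[float]]
                  y,  # type: List[int]
                  sample_weight=None,  # type: Optional[List[float]]
                  verbose=False  # newly added and left unannotated
                  ):
    # type: (...) -> None
    pass


# Python 3 only form: annotations live in the signature itself; "verbose"
# is again left unannotated on purpose.
def fit_py3(X: List[List[float]],
            y: List[int],
            sample_weight: Optional[List[float]] = None,
            verbose=False) -> None:
    pass
```

At runtime the comment forms are ordinary comments, so only the Python 3 form populates `__annotations__`; the comments matter only to a checker such as mypy.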
>>
>> Adding new modules/methods without annotations wouldn't produce an
>> error, just an incompleteness in the annotations
>>
>> A possible source of problems like the one you mention is that the
>> implementation of the annotated methods will be checked, and sometimes
>> you'll get a warning about a local variable if mypy can't infer its type
>> (it happens sometimes when assigning an empty list to a local, where mypy
>> knows that it's a list but doesn't know the element type). But in that case
>> I think the message you get is very obvious.
>>
>> --
>> Daniel F. Moisset - UK Country Manager
>> www.machinalis.com
>> Skype: @dmoisset
>
> --
> Daniel F. Moisset - UK Country Manager
> www.machinalis.com
> Skype: @dmoisset
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
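As a rough sketch of two points from the quoted message (the "string or callable" kernel parameter and the empty-list inference warning), here is what the annotations might look like. All names (Matrix, KernelFn, describe_kernel, linear_kernel, collect_lengths) are hypothetical, and Matrix is a plain list-of-lists stand-in because no ndarray type is assumed here.

```python
from typing import Callable, List, Union

Matrix = List[List[float]]  # stand-in only; not an ndarray annotation
KernelFn = Callable[[Matrix, Matrix], Matrix]


def describe_kernel(kernel: Union[str, KernelFn] = "rbf") -> str:
    """Toy function (not sklearn.svm.SVC): takes a kernel name or a callable."""
    if callable(kernel):
        return "callable kernel"
    return "named kernel: " + kernel


def linear_kernel(a: Matrix, b: Matrix) -> Matrix:
    """Gram matrix of dot products between the rows of a and the rows of b."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


def collect_lengths(rows: Matrix) -> List[int]:
    # A bare "lengths = []" gives a checker a list with no element type to
    # work from; the explicit comment states the element type up front.
    lengths = []  # type: List[int]
    for row in rows:
        lengths.append(len(row))
    return lengths
```

With the Union in the signature, the "string or callable" contract is stated once, and the expected shape of the callable is readable without consulting the kernel documentation.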