That's probably one-time overhead so it is not a big issue.  In my opinion,
a bigger one is possible complexity. Annotations tend to introduce a lot of
cyclic dependencies in the Spark codebase. This can be addressed, but it
doesn't look great.


This is not true (anymore). With Python 3.6 you can add string annotations
-> 'DenseVector', and in the future with Python 3.7 this is fixed by having
postponed evaluation: https://www.python.org/dev/peps/pep-0563/
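
A minimal sketch of the string-annotation approach. The DenseVector class
below is a stand-in written for this example, not the PySpark class, and the
`scaled` method is made up:

```python
from typing import List


class DenseVector:
    """Stand-in for illustration only; not the PySpark class."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    # The name DenseVector is not bound yet while the class body executes,
    # so the return type is written as a string (a forward reference).
    def scaled(self, factor: float) -> "DenseVector":
        return DenseVector([v * factor for v in self.values])


# With Python 3.7+, putting `from __future__ import annotations` as the first
# statement of the module (PEP 563) lets you drop the quotes.
doubled = DenseVector([1.0, 2.0]).scaled(2.0)
```

The same trick applies to types that live in another module: the annotation
stays a string, so it does not need to be resolvable when the function is
defined.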

Merging stubs into the project structure, on the other hand, has almost no
overhead.


This feels awkward to me; it is like having the docstring in a separate
file. In my opinion you want to have the signatures and the functions
together for transparency and maintainability.

I think DBT is a very nice project where they use annotations very well:
https://github.com/fishtown-analytics/dbt/blob/dev/marian-anderson/core/dbt/graph/queue.py

Also, they left the types out of the docstrings, since they are available
in the annotations themselves.

In practice, the biggest advantage is actually support for completion, not
type checking (which works in simple cases).


Agreed.

Would you be interested in writing up the Outreachy proposal for work on
this?


I would be, and I am also happy to mentor. But I think we first need to
agree as a Spark community on whether we want to add the annotations to the
code, and to what extent.

At some point (in general when things are heavy in generics, which is the
case here), annotations become somewhat painful to write.


That's true, but that might also be a pointer that it is time to refactor
the function/code :)

For now, I tend to think adding type hints to the code makes it difficult
to backport or revert, and more difficult to discuss typing on its own,
especially considering typing is arguably still premature.


This feels a bit weird to me, since you want to keep this in sync, right?
Do you provide different stubs for different versions of Python? I had to
look up the literals: https://www.python.org/dev/peps/pep-0586/
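
For anyone else who had to look them up, a rough sketch of what a literal
type looks like. The JoinType alias and the describe_join function are made
up for this example, and typing.Literal needs Python 3.8+ (older versions
can take it from typing_extensions):

```python
from typing import Literal, get_args

# Hypothetical alias: only these three strings are valid values.
JoinType = Literal["inner", "left", "right"]


def describe_join(how: JoinType) -> str:
    # A type checker can flag a call such as describe_join("iner").
    # Annotations are not enforced at runtime, so validate explicitly.
    if how not in get_args(JoinType):
        raise ValueError(f"unsupported join type: {how!r}")
    return f"{how} join"
```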

Cheers, Fokko

On Wed, 22 Jul 2020 at 09:40, Maciej Szymkiewicz <
mszymkiew...@gmail.com> wrote:

>
> On 7/22/20 3:45 AM, Hyukjin Kwon wrote:
> > For now, I tend to think adding type hints to the codes make it
> > difficult to backport or revert and
> > more difficult to discuss about typing only especially considering
> > typing is arguably premature yet.
>
> About being premature ‒ since typing ecosystem evolves much faster than
> Spark it might be preferable to keep annotations as a separate project
> (preferably under ASF / Spark umbrella). It allows for faster iterations
> and supporting new features (for example Literals proved to be very
> useful), without waiting for the next Spark release.
>
> --
> Best regards,
> Maciej Szymkiewicz
>
> Web: https://zero323.net
> Keybase: https://keybase.io/zero323
> Gigs: https://www.codementor.io/@zero323
> PGP: A30CEF0C31A501EC
>
>
>
