Fokko commented on pull request #29180:
URL: https://github.com/apache/spark/pull/29180#issuecomment-663441753
> For example, Python type hinting is optional and still premature.
This is definitely not true anymore with Python 3.6. You could say this with
Python <=3.5, but it has evolved quite well over time. I haven't seen any
project that got worse by having type-hints.
> It would mean going through the incubation process. Which is doable, but
rather a lot of work and not something I would suggest someone else do if
I wanted them to talk to me again. That being said if we don't have a consensus
on bringing it in tree in some form, I'd be happy to serve as one of the
mentors for incubation (but I'd rather not have the types in a separate
project).
I think this should be possible, but it requires a lot of documents to be
signed. I'm not sure whether it requires a full incubation process, or whether
we can adopt it as part of the Spark repository. As an example from Airflow:
Google donated the Airflow Kubernetes operator, and it simply became a separate
repo under the Airflow project.
However, having it in a separate repo would make it more cumbersome to keep
everything in sync. For example, if you change a signature, you need to do this
in both repositories.
> Python we should support Python 3.6+ right? It is accepted to the latest
Python 3.8. If the fundamental way of typing changes in the latest Python
version, I would say it's still premature and evolving. I guess certainly we
don't want to manage multiple versions of stub files or have such typing in
codes with if-else of Python versions.
From PEP 544 itself: "Therefore, in this PEP we do not propose to replace the
nominal subtyping described by PEP 484 with structural subtyping completely.
Instead, protocol classes as specified in this PEP complement normal classes,
and users are free to choose where to apply a particular solution. See section
on rejected ideas at the end of this PEP for additional motivation."
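To illustrate what "complement" means there, here is a minimal sketch assuming Python 3.8+ (where `Protocol` lives in `typing`). The names `SupportsClose`, `Resource`, and `close_all` are made up for the example and are not from PySpark or the stubs; the point is only that a protocol class sits next to ordinary classes without replacing nominal typing:

```python
from typing import Iterable, Protocol  # Protocol is in typing from Python 3.8


class SupportsClose(Protocol):
    """Structural type: any object with a close() method matches."""

    def close(self) -> None: ...


class Resource:
    """Ordinary class; it never inherits from SupportsClose."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def close_all(items: Iterable[SupportsClose]) -> int:
    """Close every item and return how many were closed."""
    count = 0
    for item in items:
        item.close()
        count += 1
    return count
```

A type checker should accept `close_all([Resource()])` because `Resource` has a compatible `close()` method, even though there is no inheritance relationship between the two classes.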
Also, in the numpy annotations, we already have some awkward stuff:
```python
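import sys

# Literal and Protocol were added to typing in Python 3.8;
# older interpreters need the typing_extensions backport.
if sys.version_info >= (3, 8):
    from typing import Literal, Protocol
else:
    from typing_extensions import Literal, Protocol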
```
But I guess this is something that we have to accept. For the current
annotations, there is also a single version for all Python versions, and
I would suggest keeping it that way for maintainability's sake.
> Python we should support Python 3.6+ right? It is accepted to the latest
Python 3.8. If the fundamental way of typing changes in the latest Python
version, I would say it's still premature and evolving. I guess certainly we
don't want to manage multiple versions of stub files or have such typing in
codes with if-else of Python versions.
Exactly, I would keep one single version for all versions of Python that we
support :)
> Ok sounds like that's our path forward then if @Fokko , @zero323 and other
folks are K with that?
Sounds good to me. I'm in favor of having it inline, but having the
annotations in the .pyi files is a good first step :)
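For anyone following along, a rough sketch of the difference between the two styles. The function `repeat` and the constant `REPEAT_STUB` are hypothetical and only for illustration; they are not taken from PySpark or from the existing stubs:

```python
from typing import get_type_hints


# Inline style: the annotations live in the .py file, next to the implementation.
def repeat(value: str, times: int = 2) -> str:
    return value * times


# Stub style: the same signature as it might appear in a sibling .pyi file.
# It is kept as a string here because stub files are read by type checkers
# and never executed.
REPEAT_STUB = "def repeat(value: str, times: int = ...) -> str: ...\n"

# Inline annotations are also visible at runtime; stub-only annotations are not.
hints = get_type_hints(repeat)
```

With stubs, a signature change has to be made in both the `.py` and the `.pyi` file, which is the same syncing concern as with a separate repository, just on a smaller scale.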
> Yes, so I meant to port stubs files into python/pyspark. Maybe we could
discuss more about how we'll port in the ongoing dev mailing thread.
Sounds good. I also checked that we can track the type coverage using mypy,
so we can see what's covered and what isn't. We could generate a list of files
that lack coverage and ask the community to help increase it. I already know
a lot of people who want to spend some time on this, and this would also be
a good candidate for the Outreachy program.
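As a very rough idea of what such a list could be built from, here is a sketch that counts fully annotated functions in a piece of source code using only the standard `ast` module (Python 3.8+). This is not how mypy computes its reports, and `annotation_coverage` and `SAMPLE` are hypothetical names; for stub files the same function could in principle be pointed at the `.pyi` source instead:

```python
import ast
from typing import Tuple


def annotation_coverage(source: str) -> Tuple[int, int]:
    """Return (fully_annotated, total) function counts for one module's source."""
    annotated = 0
    total = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        total += 1
        args = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        # Skip self/cls, which are conventionally left unannotated.
        # Simplification: *args and **kwargs are not checked.
        params = [a for a in args if a.arg not in ("self", "cls")]
        has_return = node.returns is not None
        if has_return and all(a.annotation is not None for a in params):
            annotated += 1
    return annotated, total


SAMPLE = '''
def typed(x: int) -> int:
    return x

def untyped(x):
    return x
'''
```

Running this per file and sorting by the ratio would give a simple "needs help" list for contributors, although mypy's own report output is probably the better source of truth.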