[
https://issues.apache.org/jira/browse/ARROW-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577564#comment-17577564
]
Joris Van den Bossche commented on ARROW-17335:
-----------------------------------------------
Mypy doesn't use `.pyi` files when, e.g., running `mypy pyarrow`?
> [Python] Type checking support
> ------------------------------
>
> Key: ARROW-17335
> URL: https://issues.apache.org/jira/browse/ARROW-17335
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Python
> Reporter: Jorrick Sleijster
> Priority: Major
> Original Estimate: 10h
> Remaining Estimate: 10h
>
> h1. mypy and static type checking
> As of Python 3.6, it has been possible to add typing information to the
> code. This became immensely popular in a short period of time. Shortly after,
> the tool `mypy` arrived, and it has become the industry standard for static
> type checking in Python. It checks for invalid types very quickly, which
> makes it suitable as a pre-commit hook. It has surfaced many bugs that I did
> not see myself and has been a very valuable tool.
> h2. Now what does this mean for PyArrow?
> When you run mypy on code that uses PyArrow, you get error messages such as
> the following:
> ```
> some_util_using_pyarrow/hdfs_utils.py:5: error: Skipping analyzing "pyarrow":
> module is installed, but missing library stubs or py.typed marker
> some_util_using_pyarrow/hdfs_utils.py:9: error: Skipping analyzing "pyarrow":
> module is installed, but missing library stubs or py.typed marker
> some_util_using_pyarrow/hdfs_utils.py:11: error: Skipping analyzing
> "pyarrow.fs": module is installed, but missing library stubs or py.typed
> marker
> ```
> More information is available here:
> [https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-library-stubs-or-py-typed-marker]
> h2. You can solve this in three ways:
> # Ignore the message. This, however, sets all types from PyArrow to
> `Any`, so mypy cannot find user errors involving the PyArrow library.
> # Create a Python stub file. This used to be the standard; however, it is
> no longer a popular option, because stubs are extra files that sit next to
> the source code, while you can also annotate the code inline with type
> hints, which brings me to the third option.
> # Create a `py.typed` file and use inline type hints. This is the most
> popular option today because it requires no extra files (except for the
> `py.typed` file), keeps all the type hints with the code (where they now
> live in the documentation), and gives type hints (and in-IDE hinting of
> issues) not only to your users but also to the developers of the library
> themselves.
>
> My personal preference already shines through the options: it is option 3,
> as this has quickly become the industry standard since its introduction.
> h2. What should we do?
> I'd very much like to work on this; however, I don't want to waste time.
> Therefore, I am raising this ticket to see whether this has been considered
> before or whether we just haven't gotten to it yet.
> I'd like to open the discussion here:
> # Do you agree with option 3 for type hints?
> # Should we remove the type annotations from the documentation, given that
> they will be inside the functions? Or should we keep them and also specify
> them in the code, which would duplicate them?
>
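For readers less familiar with option 3 in the quoted description, here is a minimal sketch of what inline type hints look like. The helper `column_lengths` is hypothetical and is not a PyArrow function; the comment at the end shows roughly how the same signature would appear in a separate stub file under option 2. The `py.typed` marker itself is just an empty file shipped inside the package (PEP 561).

```python
from typing import List, get_type_hints


# Hypothetical helper for illustration only; not part of the PyArrow API.
def column_lengths(columns: List[List[int]]) -> List[int]:
    """Return the number of values in each column."""
    return [len(column) for column in columns]


# The annotations live on the function itself, so a type checker, an IDE,
# and even the runtime can read them without any extra stub file.
hints = get_type_hints(column_lengths)
lengths = column_lengths([[1, 2, 3], [4, 5]])

# Under option 2, a separate .pyi stub would instead carry only the signature:
#     def column_lengths(columns: List[List[int]]) -> List[int]: ...
```

With hints like these in place, a call such as `column_lengths("abc")` would be flagged by a static type checker before the code runs, which is the kind of user error that is missed when every type falls back to `Any`.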