[ https://issues.apache.org/jira/browse/ARROW-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matteo Santamaria updated ARROW-17901:
--------------------------------------
    Description: 
I understand that, in general, {{pyarrow}} does not support type hints.
However, I think it is still sensible to add a {{py.typed}} marker file to the
library. Let me demonstrate why:

 
{code:bash}
$ pip install mypy pyarrow {code}
 

 
{code:python}
# test.py
import pyarrow as pa
 
table = pa.Table()
 
reveal_type(table) {code}
 

 
{code:bash}
$ mypy test.py
test.py:1: error: Skipping analyzing "pyarrow": module is installed, but 
missing library stubs or py.typed marker
test.py:1: note: See 
https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
test.py:5: note: Revealed type is "Any"
Found 1 error in 1 file (checked 1 source file) {code}
 

Note that {{mypy}} identifies {{table}} as being an {{Any}} type, when
obviously it is a {{Table}}. If we include a {{py.typed}} file, {{mypy}}
should be able to make these trivial inferences. The motivating example is this:

 
{code:python}
@overload
def from_arrow(a: pa.Table) -> DataFrame:
    ...

@overload
def from_arrow(a: pa.Array | pa.ChunkedArray) -> Series:
    ...

def from_arrow(a: pa.Table | pa.Array | pa.ChunkedArray) -> DataFrame | Series:
    pass {code}
 

The problem is that, since {{pa.Table}}, {{pa.Array}}, and
{{pa.ChunkedArray}} are all determined to be {{Any}}, the overloads
effectively become

 
{code:python}
@overload
def from_arrow(a: Any) -> DataFrame:
    ...

@overload
def from_arrow(a: Any) -> Series:
    ... {code}
 

and {{mypy}} complains that overload 2 is covered entirely by overload 1.
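
For contrast, here is a minimal self-contained sketch of the same overload pattern where the argument types are ordinary classes that a type checker can see. All five class names are stand-ins invented for illustration; they are not pyarrow's classes or any DataFrame library's. With distinct concrete types the two overloads should no longer overlap, which is the behaviour I am hoping to get with the real {{pa.*}} types.

```python
from __future__ import annotations

from typing import overload


# Stand-ins for pa.Table, pa.Array and pa.ChunkedArray (hypothetical).
class Table: ...
class Array: ...
class ChunkedArray: ...


# Stand-ins for the return types (hypothetical).
class DataFrame: ...
class Series: ...


@overload
def from_arrow(a: Table) -> DataFrame: ...
@overload
def from_arrow(a: Array | ChunkedArray) -> Series: ...
def from_arrow(a: Table | Array | ChunkedArray) -> DataFrame | Series:
    # Runtime dispatch mirrors the two overload signatures above.
    if isinstance(a, Table):
        return DataFrame()
    if isinstance(a, (Array, ChunkedArray)):
        return Series()
    raise TypeError(f"unsupported type: {type(a).__name__}")
```

Because the checker sees three unrelated classes here rather than three aliases of {{Any}}, each call site can be matched to exactly one overload.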

 

I tried to test what adding a {{py.typed}} file would do, but I ran into 
compilation issues. I was hoping someone with a little more experience here 
could quickly test this out for me :)
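
One way to experiment without rebuilding anything might be to drop an empty {{py.typed}} file into an already installed package directory, since the PEP 561 marker is just a file that sits next to the package's {{__init__.py}}. The sketch below is an assumption about how such a local experiment could look, not something I have verified against {{pyarrow}}; the helper names {{package_dir}}, {{has_py_typed}} and {{add_py_typed}} are made up for this example.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the directory of an installed top-level package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        # Not installed, or a single-file module rather than a package.
        return None
    return Path(next(iter(spec.submodule_search_locations)))


def has_py_typed(name: str) -> bool:
    """True if the package directory contains the PEP 561 marker file."""
    directory = package_dir(name)
    return directory is not None and (directory / "py.typed").is_file()


def add_py_typed(name: str) -> Path:
    """Create an empty py.typed in the installed package (local experiment only)."""
    directory = package_dir(name)
    if directory is None:
        raise ModuleNotFoundError(f"{name!r} is not an installed package")
    marker = directory / "py.typed"
    marker.touch(exist_ok=True)
    return marker


if __name__ == "__main__":
    # Reports whether the locally installed pyarrow (if any) ships the marker.
    print(has_py_typed("pyarrow"))
```

After calling {{add_py_typed("pyarrow")}} in a throwaway virtual environment, rerunning {{mypy test.py}} would show what actually changes. Whether the revealed type improves is not established here, since {{mypy}} can only analyze the Python sources and stubs it finds in the package.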

  was:
I understand that, in general, `pyarrow` does not support type hints. However,
I think it is still sensible to add a `py.typed` marker file to the library.
Let me demonstrate why,

```
$ pip install mypy pyarrow
```

```python
# test.py
import pyarrow as pa

table = pa.Table()

reveal_type(table)
```

```
$ mypy test.py
test.py:1: error: Skipping analyzing "pyarrow": module is installed, but missing library stubs or py.typed marker
test.py:1: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
test.py:5: note: Revealed type is "Any"
Found 1 error in 1 file (checked 1 source file)
```

Note that `mypy` identifies `table` as being an `Any` type, when obviously it
is a `Table`. If we include a `py.typed` file, `mypy` will be able to make
these trivial inferences.

The motivating example is this,

```python
@overload
def from_arrow(a: pa.Table) -> DataFrame:
    ...


@overload
def from_arrow(a: pa.Array | pa.ChunkedArray) -> Series:
    ...


def from_arrow(a: pa.Table | pa.Array | pa.ChunkedArray) -> DataFrame | Series:
    pass
```

The problem is that all of `pa.Table`, `pa.Array`, and `pa.ChunkedArray` are
determined to be `Any`, so the overloads effectively become

```python
@overload
def from_arrow(a: Any) -> DataFrame:
    ...


@overload
def from_arrow(a: Any) -> Series:
    ...
```

and `mypy` complains that overload 2 is covered entirely by overload 1.

I tried to test what adding a `py.typed` file would do, but I ran into
compilation issues. I was hoping someone with a little more experience could
quickly test this out for me :)


> `pyarrow` missing `py.typed` marker
> -----------------------------------
>
>                 Key: ARROW-17901
>                 URL: https://issues.apache.org/jira/browse/ARROW-17901
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Matteo Santamaria
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
