Re: [D] A new home for pyarrow-stubs? [arrow]


GitHub user dangotbanned added a comment to the discussion: A new home for 
pyarrow-stubs?


> Pandas doesn't seem to have the kind of detailed type information that is 
> being proposed here.

I think a fairer comparison would be to their stubs:

https://github.com/pandas-dev/pandas-stubs/blob/ffa88e5b90c4088e1454ee2fff231db27899b0e1/pandas-stubs/core/series.pyi

But really I think `pyarrow` is closer to `numpy` in terms of API.
The complexity of the typing is because most things are polymorphic functions 
that can return very different types based on their inputs.
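
As a rough illustration of why that drives up the size of the stubs, here is a minimal sketch of how such a function gets typed with `typing.overload`. The names `IntArray`, `StrArray` and `make_array` are hypothetical stand-ins, not part of `pyarrow` or `pyarrow-stubs`.

```python
from typing import overload


# Hypothetical stand-ins for concrete array classes (not real pyarrow types).
class IntArray: ...


class StrArray: ...


# One overload per input type; a real library needs many of these per function.
@overload
def make_array(values: list[int]) -> IntArray: ...
@overload
def make_array(values: list[str]) -> StrArray: ...
def make_array(values):
    # Runtime dispatch on the first element, mirroring what the overloads promise.
    if values and isinstance(values[0], str):
        return StrArray()
    return IntArray()


ints = make_array([1, 2, 3])  # a type checker infers IntArray
strs = make_array(["a", "b"])  # a type checker infers StrArray
```

Every extra input type adds another overload, and the same pattern repeats for every polymorphic function.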


Even the comparison to `numpy` isn't 1-to-1.

`np.array` depends on its inputs, but it almost always returns an 
`np.ndarray`

https://github.com/numpy/numpy/blob/29caecb9d4761938aa80f9b1f01fe3b2e77a6044/numpy/_core/multiarray.pyi#L441-L500
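
A small sketch of that behaviour, assuming a default `numpy` install and default arguments to `np.array`:

```python
import numpy as np

ints = np.array([1, 2, 3])
strs = np.array(["a", "b"])
nested = np.array([[1.0, 2.0], [3.0, 4.0]])

# The dtype and shape differ, but the Python class is the same each time.
for arr in (ints, strs, nested):
    print(type(arr).__name__, arr.dtype, arr.shape)
```

The variation lives in `dtype` and `shape`, so the return annotation can stay a single class.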

I guess you could argue `pa.array` mostly returns a `pa.Array`, but really it 
returns one of many concrete types

https://github.com/zen-xu/pyarrow-stubs/blob/dfe07a8415e516fbd90c0a46b0e1dddf2292a6f3/pyarrow-stubs/__lib_pxi/array.pyi#L61-L531

A subset of those have additional methods, or are at least generic over 
some type

https://github.com/zen-xu/pyarrow-stubs/blob/dfe07a8415e516fbd90c0a46b0e1dddf2292a6f3/pyarrow-stubs/__lib_pxi/array.pyi#L2393-L4170

Then, if you want to use one of these `Array`s in `pyarrow.compute`, you 
quickly run into kernel errors if you assume any `Array` can be used with any 
function.
So the verbosity of the stubs helps with that pain point 🙂

GitHub link: 
https://github.com/apache/arrow/discussions/45919#discussioncomment-14294980

----
This is an automatically sent email for user@arrow.apache.org.
To unsubscribe, please send an email to: user-unsubscr...@arrow.apache.org
