[ https://issues.apache.org/jira/browse/ARROW-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835713#comment-16835713 ]
Antoine Pitrou commented on ARROW-5287:
---------------------------------------
Since a tuple could be meant as either a list or a struct, inference would be
ambiguous, so I'm not sure it's a good idea to support it. The working
inference case for list arrays is a list of lists:
{code:python}
>>> pa.array([[1,2,3],[4,5]])
<pyarrow.lib.ListArray object at 0x7f114319eb38>
[
  [
    1,
    2,
    3
  ],
  [
    4,
    5
  ]
]
{code}
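For anyone hitting this in the meantime, a minimal workaround sketch: either pass the type explicitly, or convert the tuples to lists first so the list-of-lists inference above applies. Note that `tuples_to_lists` is a hypothetical helper written for this illustration, not part of pyarrow, and the pyarrow import is made optional only so the snippet runs without it installed.

```python
try:
    import pyarrow as pa  # optional; only needed for the final conversion
except ImportError:
    pa = None


def tuples_to_lists(rows):
    """Hypothetical helper: turn each tuple into a list, leave other values (e.g. None) alone."""
    return [list(row) if isinstance(row, tuple) else row for row in rows]


rows = [(1, 2), (3, 4, 5)]
as_lists = tuples_to_lists(rows)

if pa is not None:
    # A list of lists is the inference case that already works, so no type is needed.
    inferred = pa.array(as_lists)
    # Alternatively, state the intent explicitly and skip the helper.
    explicit = pa.array(rows, type=pa.list_(pa.int64()))
```

Converting up front makes the "tuples mean lists" choice visible in the calling code instead of relying on inference to guess it.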
> [Python] automatic type inference for arrays of tuples
> ------------------------------------------------------
>
> Key: ARROW-5287
> URL: https://issues.apache.org/jira/browse/ARROW-5287
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
>
> Arrays of tuples can be converted to either ListArray or StructArray if you
> specify the type explicitly:
> {code}
> In [6]: pa.array([(1, 2), (3, 4, 5)], type=pa.list_(pa.int64()))
> Out[6]:
> <pyarrow.lib.ListArray object at 0x7f1b01a4d408>
> [
>   [
>     1,
>     2
>   ],
>   [
>     3,
>     4,
>     5
>   ]
> ]
> In [7]: pa.array([(1, 2), (3, 4)], type=pa.struct([('a', pa.int64()), ('b', pa.int64())]))
> Out[7]:
> <pyarrow.lib.StructArray object at 0x7f1b01a51b88>
> -- is_valid: all not null
> -- child 0 type: int64
>   [
>     1,
>     3
>   ]
> -- child 1 type: int64
>   [
>     2,
>     4
>   ]
> {code}
> But not when no type is specified:
> {code}
> In [8]: pa.array([(1, 2), (3, 4)])
> ---------------------------------------------------------------------------
> ArrowInvalid Traceback (most recent call last)
> <ipython-input-8-ab2d80c7486d> in <module>
> ----> 1 pa.array([(1, 2), (3, 4)])
> ~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()
> ~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib._sequence_to_array()
> ~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Could not convert (1, 2) with type tuple: did not recognize Python value type when inferring an Arrow data type
> {code}
> Do we want to do automatic type inference for tuples as well (defaulting to
> the ListArray case, just as arrays of Python lists are supported)?
> Or was there a specific reason not to support this by default?