[
https://issues.apache.org/jira/browse/ARROW-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185791#comment-17185791
]
Joris Van den Bossche commented on ARROW-5566:
----------------------------------------------
[~arw2019] thanks for checking!
Now, the failing example was used to illustrate a general need to refactor type
inference (combine NumPy and Python type unification). [~wesm] [~kszucs] is that
still a worthwhile goal (and thus a reason to keep this issue open)?
> [Python] Overhaul type unification from Python sequence in
> arrow::py::InferArrowType
> ------------------------------------------------------------------------------------
>
> Key: ARROW-5566
> URL: https://issues.apache.org/jira/browse/ARROW-5566
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Wes McKinney
> Priority: Major
>
> I'm working on ARROW-4324 and there's some technical debt lying in
> arrow/python/inference.cc because, in the case where NumPy scalars are mixed
> with non-NumPy Python scalar values, all hell breaks loose. In particular, the
> innocuous {{numpy.nan}} is a Python float, not a NumPy float64, so the
> sequence {{[np.float16(1.5), np.nan]}} can be converted incorrectly.
> Part of what's messy is that NumPy dtype unification is split from general
> type unification. This should all be combined together, with the NumPy types
> mapping onto an intermediate value (for unification purposes) that then maps
> ultimately onto an Arrow type.
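
To make the proposal in the description concrete, here is a minimal Python sketch of unifying NumPy and plain Python scalars through one intermediate value. It is not the code in arrow/python/inference.cc (which is C++). The names Kind, classify, unify and infer_type_name are hypothetical, and the promotion rule (widest kind wins, widest bit width wins, Python int/float count as 64-bit) is a simplifying assumption rather than Arrow's actual behaviour.

```python
import enum
from functools import reduce

import numpy as np


class Kind(enum.IntEnum):
    """Hypothetical intermediate kinds, ordered so that max() promotes."""
    NULL = 0
    BOOL = 1
    INT = 2
    FLOAT = 3


def classify(value):
    """Map a Python or NumPy scalar onto an intermediate (kind, bit width)."""
    if value is None:
        return (Kind.NULL, 0)
    # bool must be tested before int, because bool subclasses int.
    if isinstance(value, (bool, np.bool_)):
        return (Kind.BOOL, 1)
    if isinstance(value, np.integer):
        return (Kind.INT, value.dtype.itemsize * 8)
    if isinstance(value, int):
        return (Kind.INT, 64)  # assumption: Python int is treated as 64-bit
    if isinstance(value, np.floating):
        return (Kind.FLOAT, value.dtype.itemsize * 8)
    if isinstance(value, float):
        return (Kind.FLOAT, 64)  # np.nan lands here: it is a plain Python float
    raise TypeError(f"unsupported scalar: {type(value).__name__}")


def unify(a, b):
    """Combine two intermediate values (simplified rule, see lead-in)."""
    return (max(a[0], b[0]), max(a[1], b[1]))


def infer_type_name(values):
    """Unify a whole sequence, then map the result onto a type name."""
    kind, width = reduce(unify, map(classify, values), (Kind.NULL, 0))
    if kind is Kind.NULL:
        return "null"
    if kind is Kind.BOOL:
        return "bool"
    return f"{kind.name.lower()}{width}"


print(type(np.nan) is float)                       # True
print(infer_type_name([np.float16(1.5), np.nan]))  # float64
print(infer_type_name([np.int8(1), 2]))            # int64
```

Because both scalar families pass through the same classify step, there is a single unification path instead of separate NumPy and Python ones. Whether {{[np.float16(1.5), np.nan]}} should widen to float64 or stay float16 is a design decision that this sketch does not settle.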