[
https://issues.apache.org/jira/browse/ARROW-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341255#comment-16341255
]
ASF GitHub Bot commented on ARROW-1999:
---------------------------------------
jcrist opened a new pull request #1523: ARROW-1999: [Python] Type checking in
`from_numpy_dtype`
URL: https://github.com/apache/arrow/pull/1523
- Adds type checking to the C++ functions `NumPyDtypeToArrow` and
`GetTensorType` to ensure that `dtype` is actually a dtype object.
- Adds conversion of non-dtype objects in `pa.from_numpy_dtype`.
- Adds tests that check a wider variety of inputs to
`pa.from_numpy_dtype` and ensure that proper errors are raised.
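
For illustration only (this is not the code in the pull request): the idea behind
the changes listed above is to turn whatever the caller passes into a real
`numpy.dtype` instance before inspecting it, and to raise a clear error when that
is not possible. The helper name `normalize_dtype` is hypothetical and is not part
of pyarrow. The sketch assumes only NumPy's public `np.dtype` constructor.

```python
import numpy as np


def normalize_dtype(obj):
    """Return a numpy.dtype instance for obj (hypothetical helper).

    Accepts dtype instances, scalar types such as np.int64, and dtype
    strings such as "int32". Raises TypeError for anything NumPy cannot
    interpret as a dtype.
    """
    if isinstance(obj, np.dtype):
        # Already a dtype instance; nothing to convert.
        return obj
    try:
        # np.dtype() accepts scalar types and strings, and raises
        # TypeError for objects it does not understand.
        return np.dtype(obj)
    except TypeError as exc:
        raise TypeError(
            "cannot interpret {!r} as a NumPy dtype".format(obj)
        ) from exc


# A scalar type and the equivalent dtype instance normalize to the same dtype.
same = normalize_dtype(np.int64) == normalize_dtype(np.dtype("int64"))
```

With such a step in place, a call like the one in the quoted report,
`pa.from_numpy_dtype(np.int64)`, would see a dtype instance rather than the
scalar type object itself.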
----------------------------------------------------------------
> [Python] from_numpy_dtype returns wrong types
> ---------------------------------------------
>
> Key: ARROW-1999
> URL: https://issues.apache.org/jira/browse/ARROW-1999
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.8.0
> Environment: Windows 10 Build 15063.850
> Python: 3.6.3
> Numpy: 1.14.0
> Reporter: Victor Jimenez
> Assignee: Phillip Cloud
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> The following code shows multiple issues when using {{from_numpy_dtype}}:
> {code}
> import numpy as np
> import pyarrow as pa
> pa.from_numpy_dtype(np.unicode) # returns DataType(bool)
> pa.from_numpy_dtype(np.int) # returns DataType(bool)
> pa.from_numpy_dtype(np.int64) # Fails with the following message:
> #
> # ArrowNotImplementedError Traceback (most recent call last)
> # <ipython-input-14-ca0855a7dda8> in <module>()
> # ----> 1 pa.from_numpy_dtype(np.int64)
> # 2
> #
> # types.pxi in pyarrow.lib.from_numpy_dtype()
> #
> # error.pxi in pyarrow.lib.check_status()
> #
> # ArrowNotImplementedError: Unsupported numpy type 32760
> {code}
> Additionally, a potentially related issue is also seen when using
> {{to_pandas_dtype}}:
> {code}
> pa.DataType.to_pandas_dtype(pa.string()) # Returns numpy.object_
> # (shouldn't it be numpy.unicode?)
> {code}