[
https://issues.apache.org/jira/browse/ARROW-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16192258#comment-16192258
]
ASF GitHub Bot commented on ARROW-1633:
---------------------------------------
GitHub user wesm opened a pull request:
https://github.com/apache/arrow/pull/1167
ARROW-1633: [Python] Support NumPy string and unicode types in
pyarrow.array, Array.from_pandas
I suppose this could have been worse. If anyone has any better ideas about
alternatives to my onion router of UTF32 bytes -> PyUnicode -> PyBytes (UTF8)
-> StringBuilder, I'm open to them. Since we support gcc 4.8 we don't have the
option of using `std::codecvt`.
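
For readers following along, here is a rough pure-Python sketch of the conversion chain described above. It is not the code in the pull request. It assumes the NumPy unicode layout of fixed-width, NUL-padded UTF-32 slots, and it uses a plain offsets list plus a byte buffer as a stand-in for Arrow's StringBuilder. The names `utf32_slots_to_utf8` and `make_slots` are hypothetical helpers invented for this illustration.

```python
def utf32_slots_to_utf8(buf, itemsize, codec="utf-32-le"):
    """Convert fixed-width UTF-32 slots into (offsets, utf8_data).

    buf      -- raw bytes holding len(buf) // itemsize slots
    itemsize -- bytes per slot (4 * max characters per value)
    codec    -- assumed byte order of the slots (little-endian by default)
    """
    if itemsize <= 0 or itemsize % 4 or len(buf) % itemsize:
        raise ValueError("buffer is not a whole number of UTF-32 slots")

    offsets = [0]          # stand-in for the builder's offsets buffer
    data = bytearray()     # stand-in for the builder's value buffer
    for start in range(0, len(buf), itemsize):
        # UTF-32 bytes -> Python str (the "PyUnicode" step)
        text = buf[start:start + itemsize].decode(codec)
        # Values shorter than the slot are padded with NUL code points
        text = text.rstrip("\x00")
        # Python str -> UTF-8 bytes (the "PyBytes" step), then append
        data += text.encode("utf-8")
        offsets.append(len(data))
    return offsets, bytes(data)


def make_slots(values, width):
    """Build a test buffer: each value truncated/padded to `width` characters."""
    return b"".join(
        v[:width].encode("utf-32-le").ljust(width * 4, b"\x00") for v in values
    )


if __name__ == "__main__":
    raw = make_slots(["héllo", "", "日本"], 5)
    print(utf32_slots_to_utf8(raw, 20))
```

The sketch does not handle masks or strided input, both of which the commits below address.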
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wesm/arrow ARROW-1633
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/arrow/pull/1167.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1167
----
commit 9304216247dea43c2afed84f5d69575756424e6f
Author: Wes McKinney <[email protected]>
Date: 2017-10-04T23:44:10Z
Convert NumPy string (ascii) arrays to arrow binary arrays, handle mask and
strided versions
Change-Id: I4ea288ce5db1d6ab80262b865a4ca1f33cbf17e8
commit ca59ada36cbcc083d0c948ef336511d6407de3f0
Author: Wes McKinney <[email protected]>
Date: 2017-10-05T00:21:48Z
Implement NumPy unicode to Arrow utf8 conversion, deal with masks,
truncated values
Change-Id: I1ba2df9e3720bfa56efa9e7e6399e579eaa15471
----
> [Python] numpy "unicode" arrays not understood
> ----------------------------------------------
>
> Key: ARROW-1633
> URL: https://issues.apache.org/jira/browse/ARROW-1633
> Project: Apache Arrow
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Nick White
> Assignee: Wes McKinney
> Labels: pull-request-available
> Fix For: 0.8.0
>
>
> {code}
> import numpy as np
> import pyarrow as pa
> pa.StringArray.from_pandas(np.empty(1, np.unicode))
> {code}
> Throws:
> {noformat}
> ---------------------------------------------------------------------------
> ArrowNotImplementedError Traceback (most recent call last)
> <ipython-input-68-f9bc946f2c0a> in <module>()
> 1 import numpy as np
> ----> 2 pa.StringArray.from_pandas(np.empty(1, np.unicode))
> array.pxi in pyarrow.lib.Array.from_pandas()
> error.pxi in pyarrow.lib.check_status()
> ArrowNotImplementedError: Unsupported numpy type 19
> {noformat}
> np.object arrays work, though...