[
https://issues.apache.org/jira/browse/ARROW-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671614#comment-16671614
]
Maarten Breddels commented on ARROW-3685:
-----------------------------------------
I tried to make a PR, but it's opening a whole can of worms, so maybe this part
should be vaex-specific, or maybe go into the docs.
> [Python] Use fixed size binary for NumPy fixed-size string dtypes
> -----------------------------------------------------------------
>
> Key: ARROW-3685
> URL: https://issues.apache.org/jira/browse/ARROW-3685
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Affects Versions: 0.11.1
> Reporter: Maarten Breddels
> Priority: Major
>
> I'm working on getting support for arrow in vaex (an out-of-core dataframe
> library for Python) in this PR:
> [https://github.com/maartenbreddels/vaex/pull/116]
> I noticed that fixed-length binary arrays from numpy (say dtype='S42') are
> converted to a non-fixed-length array. Trying to convert that back to numpy
> will fail, since there is no such conversion.
> It makes more sense to convert dtype='S42' to an arrow array with
> pyarrow.binary(42) type, as I do in:
> https://github.com/maartenbreddels/vaex/blob/4b4facb64fea9f83593ce0f0b82fc26ddf96b506/packages/vaex-arrow/vaex_arrow/convert.py#L4
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)