[ https://issues.apache.org/jira/browse/ARROW-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326659#comment-16326659 ]
Licht Takeuchi commented on ARROW-1992:
---------------------------------------

I am working on this. I will make a PR after ARROW-1997 (https://issues.apache.org/jira/browse/ARROW-1997) is merged.

> [Python] to_pandas crashes when using strings_to_categoricals on empty string cols on 0.8.0
> -------------------------------------------------------------------------------------------
>
>                 Key: ARROW-1992
>                 URL: https://issues.apache.org/jira/browse/ARROW-1992
>             Project: Apache Arrow
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>         Environment: OS: Windows
> Python: PY36 x64
> Pandas: 0.22.0
> pyarrow: 0.8.0
>            Reporter: Victor Uriarte
>            Assignee: Licht Takeuchi
>            Priority: Major
>             Fix For: 0.9.0
>
>
> Python crashes when pyarrow converts a table back to pandas with
> `strings_to_categorical=True` and the table has a column that contains
> only zero-length strings. Example code below.
> The same test ran OK with pyarrow 0.7.1.
> {code:none}
> import pandas as pd
> import pyarrow as pa
> df = pd.DataFrame({
>     'Foo': ['A', 'A', 'B', 'B', 'C'],
>     'Bar': ['A1', 'A2', 'B2', 'D3', ''],
>     'Baz': ['', '', '', '', ''],
> })
> table = pa.Table.from_pandas(df)
> df = table.to_pandas(strings_to_categorical=False)  # Works
> print('Categoricals=False', len(df))
> df = table.to_pandas(strings_to_categorical=True)  # Crashes
> print('Categoricals=True', len(df))
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)