[ https://issues.apache.org/jira/browse/ARROW-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Challis closed ARROW-2406.
-------------------------------
    Resolution: Fixed
    Fix Version/s:     (was: 0.10.0)
                   0.9.0

> [Python] Segfault when creating PyArrow table from Pandas for empty string column when schema provided
> ------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-2406
>                 URL: https://issues.apache.org/jira/browse/ARROW-2406
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.8.0
>         Environment: Mac OS High Sierra
> Python 3.6.3
>            Reporter: Dave Challis
>            Priority: Major
>             Fix For: 0.9.0
>
>
> Minimal example to recreate:
> {code}
> import pandas as pd
> import pyarrow as pa
> df = pd.DataFrame({'a': []})
> df['a'] = df['a'].astype(str)
> schema = pa.schema([pa.field('a', pa.string())])
> pa.Table.from_pandas(df, schema=schema)
> {code}
>
> This causes the python interpreter to exit with "Segmentation fault: 11".
> The following examples all work without any issue:
> {code}
> # column 'a' is no longer empty
> df = pd.DataFrame({'a': ['foo']})
> df['a'] = df['a'].astype(str)
> schema = pa.schema([pa.field('a', pa.string())])
> pa.Table.from_pandas(df, schema=schema)
> {code}
> {code}
> # column 'a' is empty, but no schema is specified
> df = pd.DataFrame({'a': []})
> df['a'] = df['a'].astype(str)
> pa.Table.from_pandas(df)
> {code}
> {code}
> # column 'a' is empty, but no type 'str' specified in Pandas
> df = pd.DataFrame({'a': []})
> schema = pa.schema([pa.field('a', pa.string())])
> pa.Table.from_pandas(df, schema=schema)
> {code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)