[
https://issues.apache.org/jira/browse/ARROW-13784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406585#comment-17406585
]
Joris Van den Bossche commented on ARROW-13784:
-----------------------------------------------
Indeed, a schema cannot be created if no type information is available (the
types are extracted from the arrays). Instead of returning an empty table, I
think we should raise an error in this case, stating that the number of passed
arrays and column names don't match.
If you want to create an actual empty table with columns, you will need to pass
a list of zero-length arrays.
> [Python] Table.from_arrays should return a schema when array is empty but
> names is not
> --------------------------------------------------------------------------------------
>
> Key: ARROW-13784
> URL: https://issues.apache.org/jira/browse/ARROW-13784
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 4.0.1
> Reporter: Abderrahmane Jaidi
> Priority: Major
>
> The `Table.from_arrays` method returns an empty schema when supplying an
> empty arrays list but providing column names. As a result, the subsequent
> `to_pandas` method returns an empty data frame with no column names.
> ```
> import pyarrow as pa
> arrays = []
> cols_names = ["col1", "col2"]
> table = pa.Table.from_arrays(arrays=arrays, names=cols_names)
> table.schema  # returns nothing
> df = table.to_pandas()
> df.head()
> # Empty DataFrame
> # Columns: []
> # Index: []  # Expected column names to be visible here
> ```
> I assume that this is because a schema cannot be built without data types?