[
https://issues.apache.org/jira/browse/ARROW-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219886#comment-16219886
]
ASF GitHub Bot commented on ARROW-1732:
---------------------------------------
wesm opened a new pull request #1252: ARROW-1732: [Python] Permit creating
record batches with no columns, test pandas roundtrips
URL: https://github.com/apache/arrow/pull/1252
I ran into this rough edge today. Serialization code paths will invariably
need to send a DataFrame with no columns, and this needs to work even if
`preserve_index=False`.
> [Python] RecordBatch.from_pandas fails on DataFrame with no columns when
> preserve_index=False
> ---------------------------------------------------------------------------------------------
>
> Key: ARROW-1732
> URL: https://issues.apache.org/jira/browse/ARROW-1732
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Wes McKinney
> Assignee: Wes McKinney
> Labels: pull-request-available
> Fix For: 0.8.0
>
>
> I believe this should have well-defined behavior and not raise an error:
> {code}
> In [5]: pa.RecordBatch.from_pandas(pd.DataFrame({}), preserve_index=False)
> ---------------------------------------------------------------------------
> ValueError Traceback (most recent call last)
> <ipython-input-5-4dda72b47dbd> in <module>()
> ----> 1 pa.RecordBatch.from_pandas(pd.DataFrame({}), preserve_index=False)
> ~/code/arrow/python/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_pandas
> (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:39957)()
> 586 df, schema, preserve_index, nthreads=nthreads
> 587 )
> --> 588 return cls.from_arrays(arrays, names, metadata)
> 589
> 590 @staticmethod
> ~/code/arrow/python/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_arrays
> (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:40130)()
> 615
> 616 if not number_of_arrays:
> --> 617 raise ValueError('Record batch cannot contain no arrays (for now)')
> 618
> 619 num_rows = len(arrays[0])
> ValueError: Record batch cannot contain no arrays (for now)
> {code}
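
The traceback above shows why the zero-column case is awkward: `from_arrays` takes the row count from `len(arrays[0])`, which is undefined when there are no arrays. The following is a minimal pure-Python sketch of one way such a constructor could tolerate an empty column list. It is not the change made in pull request #1252 and does not use pyarrow. The function `from_arrays_sketch`, its `num_rows` parameter, and the dict it returns are all hypothetical names introduced for illustration.

```python
def from_arrays_sketch(arrays, names, num_rows=None):
    """Hypothetical stand-in for a record batch constructor (not pyarrow code).

    Accepts zero columns instead of raising, and only reads a row count
    from the columns when at least one column is present.
    """
    if len(arrays) != len(names):
        raise ValueError("arrays and names must have the same length")

    if arrays:
        # With at least one column, every column must agree on the length.
        lengths = {len(a) for a in arrays}
        if len(lengths) != 1:
            raise ValueError("all arrays must have the same length")
        inferred = lengths.pop()
        if num_rows is not None and num_rows != inferred:
            raise ValueError("num_rows does not match the array length")
        num_rows = inferred
    elif num_rows is None:
        # Assumption for this sketch: with no columns and no explicit
        # row count, fall back to an empty batch of zero rows.
        num_rows = 0

    return {"names": list(names), "columns": list(arrays), "num_rows": num_rows}


# The no-columns case from the report no longer raises in this sketch.
empty = from_arrays_sketch([], [])
print(empty["num_rows"], len(empty["columns"]))
```

The explicit `num_rows` argument is a design choice of the sketch: when `preserve_index=False` there is no index column left to carry the length, so a caller that wants to keep the row count of a column-less DataFrame has to pass it separately.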