[ 
https://issues.apache.org/jira/browse/ARROW-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835422#comment-16835422
 ] 

Joris Van den Bossche commented on ARROW-4432:
----------------------------------------------

With the latest master, the only remaining difference is the metadata (I suppose
that before the RangeIndex serialization change, the index column also differed):

{code}
In [3]: table = pa.Table.from_arrays([])

In [4]: df = table.to_pandas()

In [5]: df
Out[5]: 
Empty DataFrame
Columns: []
Index: []

In [6]: table_ = pa.Table.from_pandas(df)

In [7]: table_
Out[7]: 
pyarrow.Table

metadata
--------
{b'pandas': b'{"index_columns": [{"kind": "range", "name": null, "start": 0, "'
            b'stop": 0, "step": 1}], "column_indexes": [{"name": null, "field_'
            b'name": null, "pandas_type": "empty", "numpy_type": "object", "me'
            b'tadata": null}], "columns": [], "creator": {"library": "pyarrow"'
            b', "version": "0.13.1.dev130+gdd335952"}, "pandas_version": "0.24'
            b'.2"}'}

In [8]: table_ = table_.replace_schema_metadata(None)

In [9]: table == table_
Out[9]: True
{code}
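
To make the metadata difference easier to read, the `b'pandas'` value from Out[7] can be decoded with the standard library alone. This is only a minimal sketch: the byte string is copied from the output above, and the variable names (`pandas_metadata`, `meta`, `index_descriptor`, `data_columns`) are chosen here for illustration.

```python
import json

# Value of the b'pandas' schema metadata key, copied from Out[7] above.
pandas_metadata = (
    b'{"index_columns": [{"kind": "range", "name": null, "start": 0, "'
    b'stop": 0, "step": 1}], "column_indexes": [{"name": null, "field_'
    b'name": null, "pandas_type": "empty", "numpy_type": "object", "me'
    b'tadata": null}], "columns": [], "creator": {"library": "pyarrow"'
    b', "version": "0.13.1.dev130+gdd335952"}, "pandas_version": "0.24'
    b'.2"}'
)

# The metadata value is UTF-8 encoded JSON.
meta = json.loads(pandas_metadata.decode("utf-8"))

# The index is described in the metadata as an empty range
# (start 0, stop 0, step 1); the list of data columns is empty.
index_descriptor = meta["index_columns"][0]
data_columns = meta["columns"]

print(index_descriptor)
print(data_columns)
```

The decoded metadata lists no data columns and describes the index only as a range, which presumably is why the two tables compare equal once the schema metadata is dropped in In [8].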

> [Python][Hypothesis] Empty table - pandas roundtrip produces inequal tables
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-4432
>                 URL: https://issues.apache.org/jira/browse/ARROW-4432
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Krisztian Szucs
>            Priority: Major
>              Labels: hypothesis
>             Fix For: 0.14.0
>
>
> The following test case fails for empty tables:
> {code:python}
> import hypothesis as h
> import pyarrow as pa
> import pyarrow.tests.strategies as past
> @h.given(past.all_tables)
> def test_pandas_roundtrip(table):
>     df = table.to_pandas()
>     table_ = pa.Table.from_pandas(df)
>     assert table == table_
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
