[jira] [Commented] (ARROW-1895) [Python] Add field_name to pandas index metadata

ASF GitHub Bot (JIRA) Wed, 06 Dec 2017 14:16:16 -0800

    [ 
https://issues.apache.org/jira/browse/ARROW-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281016#comment-16281016
 ]


ASF GitHub Bot commented on ARROW-1895:
---------------------------------------

jorisvandenbossche commented on issue #1397: ARROW-1895: [Python] Add 
field_name to pandas index metadata
URL: https://github.com/apache/arrow/pull/1397#issuecomment-349792851
 
 
   One special case that I encountered in 
https://github.com/apache/arrow/pull/1386 is a DataFrame whose column name is 
`None` (produced by the IPC path when serializing a Series without a name). 
   This case is not yet handled here:
   
   ```
   In [6]: pa.Table.from_pandas(pd.DataFrame({None: [1,2,3]}))
   Out[6]: 
   pyarrow.Table
   None: int64
   __index_level_0__: int64
   metadata
   --------
   {b'pandas': b'{"index_columns": ["__index_level_0__"], "column_indexes": [{"na'
               b'me": null, "pandas_type": "mixed", "numpy_type": "object", "meta'
               b'data": null}], "columns": [{"name": null, "field_name": null, "p'
               b'andas_type": "int64", "numpy_type": "int64", "metadata": null}, '
               b'{"name": null, "field_name": "__index_level_0__", "pandas_type":'
               b' "int64", "numpy_type": "int64", "metadata": null}], "pandas_ver'
               b'sion": "0.22.0.dev0+260.g5da3759"}'}
   ```
   
   So for the data column, both `"name"` and `"field_name"` are null, while 
`"field_name"` should be the string `"None"`, matching the Arrow field name 
`None` shown in the schema above.
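
   A minimal sketch of the expected behaviour, for illustration only. The
helper `column_metadata` is hypothetical and is not pyarrow's actual
implementation; it assumes that `"name"` keeps the original pandas label
while `"field_name"` is always the string form of the Arrow field name:

```python
import json


def column_metadata(name, field_name, pandas_type, numpy_type):
    """Hypothetical helper: build one entry of the "columns" metadata list.

    Assumption: "name" preserves the pandas label as-is (None becomes JSON
    null), while "field_name" is coerced to a string so that it always
    matches a field name in the Arrow schema.
    """
    return {
        "name": name,
        "field_name": str(field_name),
        "pandas_type": pandas_type,
        "numpy_type": numpy_type,
        "metadata": None,
    }


# Data column whose pandas label is None.
data_col = column_metadata(None, None, "int64", "int64")

# Unnamed index stored under a generated field name.
index_col = column_metadata(None, "__index_level_0__", "int64", "int64")

print(json.dumps(data_col))
```

   With this sketch the data column serializes with `"name": null` and
`"field_name": "None"`, so it stays distinguishable from the unnamed index,
whose `"field_name"` is `"__index_level_0__"`.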



> [Python] Add field_name to pandas index metadata
> ------------------------------------------------
>
>                 Key: ARROW-1895
>                 URL: https://issues.apache.org/jira/browse/ARROW-1895
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.1
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>
> See the discussion here for details:
> https://github.com/pandas-dev/pandas/pull/18201
> In short we need a way to map index column names to field names in an arrow 
> Table.
> Additionally, we're depending on the index columns being written at the end 
> of the table and fixing this would allow us to read metadata written by other 
> systems (e.g., fastparquet) that don't make this assumption.


