[ 
https://issues.apache.org/jira/browse/ARROW-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281090#comment-16281090
 ] 

ASF GitHub Bot commented on ARROW-1895:
---------------------------------------

jorisvandenbossche commented on a change in pull request #1397: ARROW-1895: 
[Python] Add field_name to pandas index metadata
URL: https://github.com/apache/arrow/pull/1397#discussion_r155387915
 
 

 ##########
 File path: python/pyarrow/tests/test_convert_pandas.py
 ##########
 @@ -160,9 +160,28 @@ def test_integer_index_column(self):
         df = pd.DataFrame([(1, 'a'), (2, 'b'), (3, 'c')])
         _check_pandas_roundtrip(df, preserve_index=True)
 
+    def test_index_metadata_field_name(self):
+        df = pd.DataFrame(
+            [(1, 'a'), (2, 'b'), (3, 'c')],
+            index=pd.MultiIndex.from_arrays(
+                [['c', 'b', 'a'], [3, 2, 1]],
+                names=[None, 'foo']
+            )
+        )
+        t = pa.Table.from_pandas(df, preserve_index=True)
+        raw_metadata = t.schema.metadata
+
+        js = json.loads(raw_metadata[b'pandas'].decode('utf8'))
+
+        _, _, idx0, foo = js['columns']
+        idx0_name, foo_name = js['index_columns']
+        assert idx0_name == '__index_level_0__'
+        assert idx0['field_name'] == idx0_name
+
+        assert foo_name == '__index_level_1__'
+        assert foo['name'] == 'foo'
+
 
 Review comment:
   For completeness, I would also assert that `name` and `field_name` are 
equal for the other (non-index) columns. This follows from the current code, 
but it is not explicitly tested:
   
   ```
   col1, col2, idx0, foo = js['columns']
   
   assert col1['name'] == col1['field_name']
   assert col2['name'] == col2['field_name']
   ```
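
A slightly more general form of the same check, as a sketch only: it selects the non-index columns by looking them up in `index_columns` instead of unpacking by position. The metadata dict here is a hand-written, abbreviated stand-in (the data column names `"0"` and `"1"` are assumptions, and real metadata carries more keys per column), not actual output from `pa.Table.from_pandas`.

```python
import json

# Hand-written, abbreviated stand-in for the b'pandas' schema metadata.
# Assumption: only the 'name' and 'field_name' keys are shown per column.
raw = json.dumps({
    "index_columns": ["__index_level_0__", "__index_level_1__"],
    "columns": [
        {"name": "0", "field_name": "0"},
        {"name": "1", "field_name": "1"},
        {"name": None, "field_name": "__index_level_0__"},
        {"name": "foo", "field_name": "__index_level_1__"},
    ],
}).encode("utf8")

js = json.loads(raw.decode("utf8"))
index_fields = set(js["index_columns"])

# Keep only the columns that are not index levels.
data_columns = [c for c in js["columns"] if c["field_name"] not in index_fields]

# For ordinary data columns the pandas name and the Arrow field name coincide.
for col in data_columns:
    assert col["name"] == col["field_name"]
```

Selecting by `field_name` keeps the assertion valid even if the index columns are not written last.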

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> [Python] Add field_name to pandas index metadata
> ------------------------------------------------
>
>                 Key: ARROW-1895
>                 URL: https://issues.apache.org/jira/browse/ARROW-1895
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.1
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>              Labels: pull-request-available
>             Fix For: 0.8.0
>
>
> See the discussion here for details:
> https://github.com/pandas-dev/pandas/pull/18201
> In short, we need a way to map index column names to field names in an Arrow 
> Table.
> Additionally, we currently depend on the index columns being written at the 
> end of the table. Fixing this would allow us to read metadata written by other 
> systems (e.g., fastparquet) that don't make this assumption.
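
The mapping described above could look roughly like the following sketch. It assumes the metadata layout exercised in the test under review (each entry in `columns` has `name` and `field_name`, and `index_columns` lists field names). The helper name `index_names_by_field` and the sample dict are hypothetical, written by hand for illustration.

```python
def index_names_by_field(pandas_metadata):
    """Map each index field name to its pandas-level name (hypothetical helper)."""
    # Look up every column by its Arrow field name.
    by_field = {c["field_name"]: c["name"] for c in pandas_metadata["columns"]}
    # Resolve the index columns through that lookup, not by position.
    return {f: by_field[f] for f in pandas_metadata["index_columns"]}


# Hand-written sample with the index columns deliberately listed first,
# to show that the lookup does not rely on them being at the end.
sample = {
    "index_columns": ["__index_level_0__", "__index_level_1__"],
    "columns": [
        {"name": None, "field_name": "__index_level_0__"},
        {"name": "foo", "field_name": "__index_level_1__"},
        {"name": "0", "field_name": "0"},
        {"name": "1", "field_name": "1"},
    ],
}

index_names = index_names_by_field(sample)
```

Here the unnamed index level maps to `None` and the second level maps to `'foo'`, regardless of where the index entries sit in `columns`.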



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
