[ https://issues.apache.org/jira/browse/ARROW-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730047#comment-15730047 ]
Vincent Pham commented on ARROW-376:
------------------------------------
Hi, if this is not taken, I would love to contribute to Apache Arrow.
> Python: Convert non-range Pandas indices (optionally) to Arrow
> --------------------------------------------------------------
>
> Key: ARROW-376
> URL: https://issues.apache.org/jira/browse/ARROW-376
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Uwe L. Korn
> Priority: Minor
> Labels: newbie
>
> Currently the index of a Pandas DataFrame is ignored entirely in the Pandas
> to Arrow conversion. We should add an option to also convert the index to an
> Arrow column if it is not a simple range index.
> The condition for a simple index should be {{isinstance(df.index,
> pd.RangeIndex) and (df.index._start == 0) and (df.index._stop == len(df.index))
> and (df.index._step == 1)}}. In this case, we can always skip the index
> conversion. Otherwise, a new column in the Arrow table shall be created using
> the index's name as the name of the column. Additionally, there should be a
> metadata annotation on that column stating that it is derived from a Pandas
> Index, so that on roundtrips it is used again as the index of the DataFrame.
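
For illustration, the check described in the issue could be sketched on the pandas side roughly as follows. This is only a sketch under assumptions, not the Arrow implementation: the helper names is_simple_range_index and frame_with_index_column are hypothetical, the public start/stop/step attributes of RangeIndex (available in recent pandas releases) are used in place of the private _start/_stop/_step, only single-level indexes are handled, and the "index" fallback name for an unnamed index is an arbitrary choice. The metadata annotation needed for roundtrips is not covered.

```python
import pandas as pd


def is_simple_range_index(index):
    """Return True for the default 0..n-1 RangeIndex with step 1."""
    return (
        isinstance(index, pd.RangeIndex)
        and index.start == 0
        and index.stop == len(index)
        and index.step == 1
    )


def frame_with_index_column(df):
    """Hypothetical helper: materialize a non-trivial index as a column.

    Returns (frame, index_column_name). The name is None when the index
    is a simple range index and the conversion can be skipped.
    """
    if is_simple_range_index(df.index):
        return df, None
    # Assumption: an unnamed index falls back to the column name "index",
    # which is also what DataFrame.reset_index() uses by default.
    name = df.index.name if df.index.name is not None else "index"
    return df.reset_index(), name


if __name__ == "__main__":
    default = pd.DataFrame({"a": [1, 2, 3]})
    keyed = pd.DataFrame({"a": [1, 2]}, index=pd.Index(["x", "y"], name="key"))
    print(frame_with_index_column(default)[1])
    print(frame_with_index_column(keyed)[1])
```

The frame returned in the second case carries the former index as an ordinary column, which is what a Pandas to Arrow conversion would then see as a regular column.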
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)