[jira] [Updated] (ARROW-2428) [Python] Support ExtensionArrays in to_pandas conversion
[ https://issues.apache.org/jira/browse/ARROW-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe L. Korn updated ARROW-2428:
---
    Labels:   (was: beginner)

> [Python] Support ExtensionArrays in to_pandas conversion
>
> Key: ARROW-2428
> URL: https://issues.apache.org/jira/browse/ARROW-2428
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Uwe L. Korn
> Priority: Major
> Fix For: 1.0.0
>
> With the next release of Pandas, it will be possible to define custom column
> types that back a {{pandas.Series}}. Thus we will not be able to cover all
> possible column types in the {{to_pandas}} conversion by default, as we won't
> be aware of all extension arrays.
>
> To enable users to create {{ExtensionArray}} instances from Arrow columns in
> the {{to_pandas}} conversion, we should provide a hook in the {{to_pandas}}
> call where they can override the default conversion routines with ones
> that produce their {{ExtensionArray}} instances.
>
> This should avoid additional copies in the case where we would currently first
> convert the Arrow column into a default Pandas column (probably of object
> type) and the user would afterwards convert it to a more efficient
> {{ExtensionArray}}. This hook will be especially useful when you build
> {{ExtensionArrays}} whose storage is backed by Arrow.
>
> The meta-issue that tracks the implementation inside of Pandas is:
> https://github.com/pandas-dev/pandas/issues/19696

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (ARROW-2428) [Python] Support ExtensionArrays in to_pandas conversion
[ https://issues.apache.org/jira/browse/ARROW-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe L. Korn updated ARROW-2428:
---
    Labels: beginner  (was: )

> [Python] Support ExtensionArrays in to_pandas conversion
>
> Key: ARROW-2428
> URL: https://issues.apache.org/jira/browse/ARROW-2428
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Uwe L. Korn
> Priority: Major
> Labels: beginner
> Fix For: 1.0.0
>
> With the next release of Pandas, it will be possible to define custom column
> types that back a {{pandas.Series}}. Thus we will not be able to cover all
> possible column types in the {{to_pandas}} conversion by default, as we won't
> be aware of all extension arrays.
>
> To enable users to create {{ExtensionArray}} instances from Arrow columns in
> the {{to_pandas}} conversion, we should provide a hook in the {{to_pandas}}
> call where they can override the default conversion routines with ones
> that produce their {{ExtensionArray}} instances.
>
> This should avoid additional copies in the case where we would currently first
> convert the Arrow column into a default Pandas column (probably of object
> type) and the user would afterwards convert it to a more efficient
> {{ExtensionArray}}. This hook will be especially useful when you build
> {{ExtensionArrays}} whose storage is backed by Arrow.
>
> The meta-issue that tracks the implementation inside of Pandas is:
> https://github.com/pandas-dev/pandas/issues/19696
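
The hook described in the issue could look roughly like the sketch below. It is only an illustration of the idea, not the pyarrow or pandas API: it uses plain Python stand-ins so that it runs on its own, and every name in it ({{Column}}, {{WrappedArray}}, {{convert_column}}, the {{converters}} argument) is hypothetical. It assumes the hook is a mapping from a column's type name to a user-supplied callable, and that any type without an entry falls back to the default conversion.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Column:
    """Hypothetical stand-in for an Arrow column: a type name plus values."""
    type_name: str
    values: List[Any]


@dataclass
class WrappedArray:
    """Hypothetical stand-in for a user-defined ExtensionArray.

    It keeps a reference to the source column instead of copying its values.
    """
    storage: Column


# A converter takes a column and returns whatever container the user wants.
Converter = Callable[[Column], Any]


def default_conversion(column: Column) -> List[Any]:
    """Fallback path: materialise the values into a new list (one copy)."""
    return list(column.values)


def convert_column(column: Column,
                   converters: Optional[Dict[str, Converter]] = None) -> Any:
    """Convert a column, letting user-supplied converters override the default."""
    converter = (converters or {}).get(column.type_name, default_conversion)
    return converter(column)


col = Column("ip_address", ["10.0.0.1", "10.0.0.2"])

# Without a hook, the values are copied into the default container.
plain = convert_column(col)

# With a hook, the user's type wraps the original storage directly.
wrapped = convert_column(col, converters={"ip_address": WrappedArray})
```

In this sketch the hooked path hands the column straight to the user's type, so the intermediate default container, and the second conversion the user would otherwise have to do afterwards, never happens. That is the copy the issue wants to avoid.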