[ https://issues.apache.org/jira/browse/ARROW-17925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616608#comment-17616608 ]
Chang She commented on ARROW-17925:
-----------------------------------
My head hurts trying to keep it all straight. So we have:
- 3 "targets" for conversion (pylist, numpy, pandas)
- At least 6 different knobs that can be turned:
=> 4 different overridable mechanisms (as_py, to_pylist, to_numpy, to_pandas)
=> Storage fallback
=> pandas ExtensionDtype <> pa.ExtensionType
- Some of these are defined/performed in C++ and others in Python
It's hard to see how to give devs clear guidance on the order in which these apply.
> [Python] Use ExtensionScalar.as_py() as fallback in ExtensionArray to_pandas?
> -----------------------------------------------------------------------------
>
> Key: ARROW-17925
> URL: https://issues.apache.org/jira/browse/ARROW-17925
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
>
> This was raised in ARROW-17813 by [~changhiskhan]:
> {quote}*ExtensionArray => pandas*
> Just for discussion, I was curious whether you had any thoughts around using
> the extension scalar as a fallback mechanism. It's a lot simpler to define an
> ExtensionScalar with `as_py` than a pandas extension dtype. So if an
> ExtensionArray doesn't have an equivalent pandas dtype, would it make sense
> to convert it to just an object series whose elements are the result of
> `as_py`? {quote}
> and I also mentioned this in ARROW-17535:
> {quote}That actually brings up a question: if an ExtensionType defines an
> ExtensionScalar (but not an associated pandas dtype, or custom to_numpy
> conversion), should we use this scalar's {{as_py()}} for the
> to_numpy/to_pandas conversion as well for plain extension arrays? (not the
> nested case)
> Because currently, if you have an ExtensionArray like that (for example, the
> one from the docs:
> https://arrow.apache.org/docs/dev/python/extending_types.html#custom-scalar-conversion),
> we still use the storage type conversion for to_numpy/to_pandas, and only
> use the scalar's conversion in {{to_pylist}}.{quote}