[
https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905256#comment-16905256
]
Joris Van den Bossche edited comment on ARROW-5610 at 8/13/19 1:03 PM:
-----------------------------------------------------------------------
After looking at this some more, my idea to tackle this would be as follows:
Implement a "generic" ExtensionType subclass in C++ (ARROW-6179) that takes a
variable "extension_name" (and optionally metadata) as an argument, and thus
doesn't need to be subclassed to create a custom instance.
From Python, we can provide a class to interact with this (one the user can
further subclass or instantiate directly) to have an extension type whose name
and metadata are used in IPC.
This is mainly for the sending part, and it wouldn't be registered as an
extension type in C++. So on the receiving end, it would come back as an
"UnknownExtensionType" (something that already exists in the Python interface).
In this design, if we want it to come back as a specific ExtensionType
subclass (without having it registered in C++), we might need a separate
Python-specific registry.
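To make the registry idea a bit more concrete, here is a rough sketch in pure
Python. It does not use pyarrow at all: GenericExtensionType,
register_extension_type and resolve_extension_type are made-up names that only
mimic the proposal, and the "example.uuid" name is an arbitrary illustration.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass(frozen=True)
class GenericExtensionType:
    """Hypothetical stand-in for the generic type: just a name plus metadata."""
    extension_name: str
    metadata: bytes = b""


class UnknownExtensionType(GenericExtensionType):
    """Fallback used when no Python class is registered for a name."""


# Hypothetical Python-side registry: extension name -> Python subclass.
_registry: Dict[str, Type[GenericExtensionType]] = {}


def register_extension_type(name: str, cls: Type[GenericExtensionType]) -> None:
    """Associate an extension name with a Python class for the receiving side."""
    _registry[name] = cls


def resolve_extension_type(name: str, metadata: bytes = b"") -> GenericExtensionType:
    """Rebuild a type from the name and metadata that came over the wire."""
    cls = _registry.get(name, UnknownExtensionType)
    return cls(name, metadata)


class UuidType(GenericExtensionType):
    """Example user-defined subclass (illustrative only)."""


register_extension_type("example.uuid", UuidType)
```

With this, resolve_extension_type("example.uuid") would give back a UuidType
instance, while any unregistered name falls back to UnknownExtensionType.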
The above approach seems doable to me. It also seems simpler than some C++ ->
Python callbacks to have code in C++ interact with a custom Python class.
Thoughts on this?
> [Python] Define extension type API in Python to "receive" or "send" a foreign
> extension type
> --------------------------------------------------------------------------------------------
>
> Key: ARROW-5610
> URL: https://issues.apache.org/jira/browse/ARROW-5610
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Wes McKinney
> Priority: Major
> Fix For: 1.0.0
>
>
> In the work in ARROW-840, a static {{arrow.py_extension_type}} name is used.
> There will be cases where an extension type is coming from another
> programming language (e.g. Java), so it would be useful to be able to "plug
> in" a Python extension type subclass that will be used to deserialize the
> extension type coming over the wire. This has some different API requirements
> since the serialized representation of the type will not have knowledge of
> Python pickling, etc.