[ 
https://issues.apache.org/jira/browse/ARROW-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905428#comment-16905428
 ] 

lidavidm commented on ARROW-5610:
---------------------------------

[~jorisvandenbossche] the approach makes sense to me. I assume the generic 
ExtensionType would have a Python "vtable" for Python subclasses to implement 
the C++ methods, and that each Python subclass would somehow register a new 
instance of the C++ type (with corresponding Python method references) with the 
extension type registry? The registration method would need to support 
parameterized types as well (i.e. registering multiple instances of the same 
type with different parameters).

There's still the reference loop between C++ and Python. In this case, since 
you have no way of re-instantiating the Python instance if the weak reference 
is dropped, you'd need some other way to keep it alive, so you might have to 
maintain a Python-side registry to get around the reference loop. (Then, 
during interpreter shutdown, you would first drop all the C++ extension type 
instance references, then drop the Python references.)
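
A rough pure-Python model of that arrangement, assuming the C++ side only holds a weak reference to the Python instance while a Python-side registry holds the strong one. CppTypeStandIn, PyExtensionType and PythonSideRegistry are hypothetical names; the real C++ object is only simulated here.

```python
import gc
import weakref

# Hypothetical sketch: the C++ extension type instance is simulated in Python.


class PyExtensionType:
    """Stand-in for a Python subclass instance."""

    def __init__(self, name):
        self.name = name


class CppTypeStandIn:
    """Stand-in for the C++ generic extension type instance."""

    def __init__(self, py_type):
        # Weak reference only, so this side never keeps the Python object alive.
        self._py_ref = weakref.ref(py_type)

    def py_type(self):
        obj = self._py_ref()
        if obj is None:
            # The instance cannot be re-created from here once it is gone.
            raise ReferenceError("Python extension type was collected")
        return obj


class PythonSideRegistry:
    """Holds the strong references that keep the Python instances alive."""

    def __init__(self):
        self._py_types = {}
        self._cpp_types = {}

    def register(self, py_type):
        self._py_types[py_type.name] = py_type
        self._cpp_types[py_type.name] = CppTypeStandIn(py_type)
        return self._cpp_types[py_type.name]

    def shutdown(self):
        # Order from the comment: C++ type references first, then Python ones.
        self._cpp_types.clear()
        self._py_types.clear()


registry = PythonSideRegistry()
handle = registry.register(PyExtensionType("example.uuid"))
gc.collect()
# The registry's strong reference keeps the weak reference valid.
assert handle.py_type().name == "example.uuid"
```

In a real implementation the shutdown step would presumably be hooked into interpreter teardown (for instance via atexit); that part is left out of the sketch.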

I think then, on the C++ side, the generic extension type instance would get 
instantiated, but there would be no way to instantiate the corresponding Python 
class without a separate registry, as you mention. So the unknown extension 
type then comes into play. Alternatively, Python subclasses could be required 
to register a factory method that takes the extension type name and metadata.
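
The factory alternative could look roughly like the sketch below, assuming factories are looked up by extension type name and receive the name plus the serialized metadata as bytes. All names (register_factory, resolve, PointTypeSketch, UnknownExtensionTypeSketch, "example.point") and the metadata encoding are made up for illustration.

```python
from typing import Callable, Dict

# Hypothetical sketch: none of these names are pyarrow APIs.


class UnknownExtensionTypeSketch:
    """Fallback used when no Python factory is registered for a name."""

    def __init__(self, name: str, metadata: bytes):
        self.name = name
        self.metadata = metadata


class PointTypeSketch:
    """Example parameterized type; 'dims' is rebuilt from the metadata."""

    def __init__(self, dims: int):
        self.dims = dims


_FACTORIES: Dict[str, Callable[[str, bytes], object]] = {}


def register_factory(name: str, factory: Callable[[str, bytes], object]) -> None:
    """Associate an extension type name with a factory callable."""
    _FACTORIES[name] = factory


def resolve(name: str, metadata: bytes):
    """Build the Python type for an incoming extension type, if possible."""
    factory = _FACTORIES.get(name)
    if factory is None:
        # No Python class knows this name: fall back to the unknown type.
        return UnknownExtensionTypeSketch(name, metadata)
    return factory(name, metadata)


def _make_point(name: str, metadata: bytes) -> PointTypeSketch:
    # Assumed encoding for this sketch: metadata is the ASCII number of dims.
    return PointTypeSketch(dims=int(metadata.decode("ascii")))


register_factory("example.point", _make_point)
```

Because the factory gets the metadata, one registration per type name would cover all parameterizations, and no Python instance needs to be kept alive ahead of time.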

> [Python] Define extension type API in Python to "receive" or "send" a foreign 
> extension type
> --------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5610
>                 URL: https://issues.apache.org/jira/browse/ARROW-5610
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 1.0.0
>
>
> In the work for ARROW-840, a static {{arrow.py_extension_type}} name is used. 
> There will be cases where an extension type is coming from another 
> programming language (e.g. Java), so it would be useful to be able to "plug 
> in" a Python extension type subclass that will be used to deserialize the 
> extension type coming over the wire. This has some different API requirements 
> since the serialized representation of the type will not have knowledge of 
> Python pickling, etc. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
