On Mon, 28 Oct 2019 at 22:41, Wes McKinney <wesmck...@gmail.com> wrote:
> Adding dev@
>
> I don't believe we have APIs yet for plugging in user-defined Array
> subtypes. I assume you've read
>
> http://arrow.apache.org/docs/python/extending_types.html#defining-extension-types-user-defined-types
>
> There may be some JIRA issues already about this (defining subclasses
> of pa.Array with custom behavior) -- since Joris has been working on
> this I'm interested in more comments

Yes, there is https://issues.apache.org/jira/browse/ARROW-6176 for exactly
this issue.
What I proposed there is to allow one to subclass pyarrow.ExtensionArray
and to attach this subclass to an attribute on the custom ExtensionType
(e.g. __arrow_ext_array_class__, in line with the other __arrow_ext_..
methods). That should make it possible to achieve functionality similar
to what is available in Java, I think.

If that seems a good way to do this, we would certainly welcome a PR for
that (otherwise I can also look into it before 1.0).

Joris

> On Mon, Oct 28, 2019 at 3:56 PM Justin Polchlopek
> <jpolchlo...@azavea.com> wrote:
> >
> > Hi!
> >
> > I've been working through understanding extension types in Arrow. It's
> > a great feature, and I've had no problems getting things working in
> > Java/Scala; however, Python has been a bit of a different story. Not
> > that I am unable to create and register extension types in Python, but
> > rather that I can't seem to recreate the functionality provided by the
> > Java API's ExtensionTypeVector class.
> >
> > In Java, ExtensionType::getNewVector() provides a clear pathway from
> > the registered type to output a vector in something other than the
> > underlying vector type, and I am at a loss for how to get this same
> > functionality in Python. Am I missing something?
> >
> > Thanks for any hints.
> > -Justin
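
For illustration only, here is a rough pure-Python sketch of the lookup
proposed in ARROW-6176. It is a toy model, not pyarrow code: the
ExtensionArray and ExtensionType classes below are stand-ins, wrap_array()
is a hypothetical helper playing the role of Java's
ExtensionType::getNewVector(), and UuidType/UuidArray are made-up examples.
Only the attribute name __arrow_ext_array_class__ comes from the proposal,
and the final API may well look different.

```python
# Toy model of the ARROW-6176 proposal. These are NOT pyarrow classes.


class ExtensionArray:
    """Stand-in for pyarrow.ExtensionArray: storage values plus their type."""

    def __init__(self, ext_type, storage):
        self.type = ext_type
        self.storage = list(storage)

    def __len__(self):
        return len(self.storage)


class ExtensionType:
    """Stand-in for pyarrow.ExtensionType."""

    # Attribute name taken from the proposal. The default is the generic
    # array class, so types that do not override it behave as before.
    __arrow_ext_array_class__ = ExtensionArray

    def wrap_array(self, storage):
        # Hypothetical helper: look up the array class on the type and
        # use it to wrap the storage values.
        return self.__arrow_ext_array_class__(self, storage)


class UuidArray(ExtensionArray):
    """User-defined array subclass with type-specific behaviour."""

    def to_hex(self):
        # Render each 16-byte storage value as a hex string.
        return [value.hex() for value in self.storage]


class UuidType(ExtensionType):
    """User-defined type that points at its custom array class."""

    __arrow_ext_array_class__ = UuidArray


arr = UuidType().wrap_array([b"\x00" * 16, b"\xff" * 16])
print(type(arr).__name__, len(arr))
```

The point of the design is that the type, not the caller, decides which
array class is produced, so anything that reconstructs an array from a
registered type can hand back the custom subclass.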