jorisvandenbossche commented on PR #34483: URL: https://github.com/apache/arrow/pull/34483#issuecomment-1491702391
> I'm interested in whether these strategies would be a useful way of exposing pure C++ Extension Types in Python.

I am not a huge fan of the idea of creating classes on the fly. Also, does this give something more useful than just wrapping it in a base class, as is done now? (Right now this generated class also doesn't have any extension-type specific logic.)

> Some further context to motivate for this: It would be useful to efficiently convert nested FixedSizeListArray's into numpy arrays.

Yeah, so if we have a way to register a Python class to use as the type class for an extension type implemented in C++ (https://github.com/apache/arrow/issues/33997), you can override the `to_numpy()` method. However, that only works if you call this method directly on the array object. If you convert a table or a chunked array, it will still go through the C++ layer, which currently falls back on converting the storage array. So we might need to think about a more general mechanism here to tap into this conversion logic.

> This may also be relevant to #8510 which, from imperfect memory, uses FixedSizeListArray's to represent Tensors.

Yes, that one is close to being merged, and then we can expose this in Python, which might be a useful exercise to see how this goes and what we can learn from it (but of course it's an internal one, so we can hard-code support for it).
