kylebarron commented on issue #38010:
URL: https://github.com/apache/arrow/issues/38010#issuecomment-2010601912
I also just hit an instance where having the `pa.field` constructor consume
these objects would be helpful.
In particular, I was trying to read an arrow array with GeoArrow extension
metadata but manually persist the field metadata:
```
import pyarrow as pa

# `data` is any object that implements `__arrow_c_array__`
schema_capsule, array_capsule = data.__arrow_c_array__()


class SchemaHolder:
    schema_capsule: object

    def __init__(self, schema_capsule) -> None:
        self.schema_capsule = schema_capsule

    def __arrow_c_schema__(self):
        return self.schema_capsule


class ArrayHolder:
    schema_capsule: object
    array_capsule: object

    def __init__(self, schema_capsule, array_capsule) -> None:
        self.schema_capsule = schema_capsule
        self.array_capsule = array_capsule

    def __arrow_c_array__(self, requested_schema=None):
        return self.schema_capsule, self.array_capsule


# Here the pa.field constructor doesn't accept pycapsule objects
field = pa.field(SchemaHolder(schema_capsule))
array = pa.array(ArrayHolder(field.__arrow_c_schema__(), array_capsule))
schema = pa.schema([field.with_name("geometry")])
table = pa.Table.from_arrays([array], schema=schema)
```
Aside from this, the only way to maintain extension metadata is to ensure
that the extension types are registered with pyarrow, which is harder to
control because of its global scope.
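For context, the holder classes above work because the Arrow PyCapsule Interface is duck-typed: a consumer only needs to find the `__arrow_c_schema__` or `__arrow_c_array__` method on an object. Below is a minimal, pyarrow-free sketch of how a consumer could dispatch on those methods. The names `SupportsArrowSchema`, `SupportsArrowArray`, and `classify` are hypothetical, and plain `object()` instances stand in for real capsules; this is not how pyarrow itself is implemented.

```python
from typing import Any, Optional, Protocol, Tuple, runtime_checkable


@runtime_checkable
class SupportsArrowSchema(Protocol):
    """Hypothetical protocol: anything exposing `__arrow_c_schema__`."""

    def __arrow_c_schema__(self) -> Any: ...


@runtime_checkable
class SupportsArrowArray(Protocol):
    """Hypothetical protocol: anything exposing `__arrow_c_array__`."""

    def __arrow_c_array__(
        self, requested_schema: Optional[Any] = None
    ) -> Tuple[Any, Any]: ...


class SchemaHolder:
    def __init__(self, schema_capsule) -> None:
        self.schema_capsule = schema_capsule

    def __arrow_c_schema__(self):
        return self.schema_capsule


class ArrayHolder:
    def __init__(self, schema_capsule, array_capsule) -> None:
        self.schema_capsule = schema_capsule
        self.array_capsule = array_capsule

    def __arrow_c_array__(self, requested_schema=None):
        return self.schema_capsule, self.array_capsule


def classify(obj: Any) -> str:
    """Report which part of the interface `obj` exposes (method presence only)."""
    if isinstance(obj, SupportsArrowArray):
        return "array"
    if isinstance(obj, SupportsArrowSchema):
        return "schema"
    return "unsupported"


# Stand-ins for real PyCapsule objects (assumption: illustration only).
fake_schema_capsule = object()
fake_array_capsule = object()

schema_holder = SchemaHolder(fake_schema_capsule)
array_holder = ArrayHolder(fake_schema_capsule, fake_array_capsule)
```

A constructor such as `pa.field` could use a check of this kind to accept `SchemaHolder` directly, which is the behavior requested in this issue.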