paleolimbot commented on issue #35531: URL: https://github.com/apache/arrow/issues/35531#issuecomment-1551412155
Just a note that I think `__arrow_c_array__` and `__arrow_c_schema__` are rather essential (I'd build nanoarrow's Python support on top of them). I think it's fairly uncontroversial that their behaviour should align with `__arrow_c_array_stream__`.

A concrete example of somewhere that might implement `__arrow_c_schema__` is a GeoArrow type representation...currently they're stored as something more like an integer type ID because it's faster. Substrait types could also implement it, or maybe pandas dtypes. It would be rather useful if numpy/pandas.Series implemented `__arrow_c_array__`, no?

I don't know if it was mentioned in the discussion, but I think it's fairly important that the PyCapsule have a finalizer that calls the `release()` callback (if non-null), and that this be documented. I assume that's the point of using the PyCapsule, but I haven't discussed that with anybody except maybe in passing with Joris.

> Do we want to distinguish between an array and a tabular version?

Most ways that I know about to create an ArrowArray (`pa.array()`, `pa.record_batch()`, `arrow::as_arrow_array()`, etc.) also accept a `type` or `schema`. Above the level of "array or table", there are certainly objects whose "one true Arrow type" is ambiguous. You could do `__arrow_c_array__(self, schema=None)` and `__arrow_c_array_stream__(self, schema=None)`. That gets a little hard because then either the producer or the consumer has to do some sort of equality check or validation.

Did you envision that `__arrow_c_stream__()` could return things that are not tables? They certainly can and do outside pyarrow (I believe Rust2 supports it...nanoarrow in R does too). It's a fairly useful representation of a ChunkedArray since there's no other officially ABIified way to do that.
