jorisvandenbossche commented on issue #35531: URL: https://github.com/apache/arrow/issues/35531#issuecomment-1542393207
> Also, this proposal doesn't dwell on the consumer side. Would there be higher-level APIs to construct `Array` and `RecordBatch` from those capsules?

Yes, indeed, I didn't touch on that aspect yet. I think it could certainly be useful, but I thought to start with the producer side of things. And some consumers might already have an entry point that could be reused for this (for example, duckdb already implicitly reads from any object that is a pandas DataFrame, pyarrow Table, RecordBatch, Dataset/Scanner, RecordBatchReader, polars DataFrame, ..., and they could just extend this to any object implementing this protocol).

Making the parallel with DLPack again: they recommend that libraries implement a `from_dlpack` function as the consumer interface. We could make a similar recommendation here (for example `from_arrow`, although that might need to differentiate between stream/array/schema), but that's maybe less essential initially? (That's more about the user-facing API.)
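To illustrate, a hypothetical `from_arrow` consumer entry point could dispatch on which protocol method the producer object exposes. This is only a sketch: the dunder names (`__arrow_c_stream__`, `__arrow_c_array__`, `__arrow_c_schema__`) are assumed here for illustration, and the `import_stream`/`import_array`/`import_schema` callbacks stand in for a consuming library's own capsule-import routines; none of this is settled API from the proposal.

```python
def from_arrow(obj, *, import_stream, import_array, import_schema):
    """Hypothetical consumer-side entry point (sketch, not a real API).

    Dispatches on which Arrow protocol method the producer exposes and
    hands the resulting capsule(s) to the library's own import routine.
    """
    if hasattr(obj, "__arrow_c_stream__"):
        # Stream producer (e.g. a RecordBatchReader-like object)
        return import_stream(obj.__arrow_c_stream__())
    if hasattr(obj, "__arrow_c_array__"):
        # Single array / record batch producer
        return import_array(obj.__arrow_c_array__())
    if hasattr(obj, "__arrow_c_schema__"):
        # Schema-only producer
        return import_schema(obj.__arrow_c_schema__())
    raise TypeError(
        f"{type(obj).__name__} does not implement the Arrow protocol"
    )


# Toy producer standing in for a third-party library object;
# a real producer would return a PyCapsule, not a string.
class FakeStreamProducer:
    def __arrow_c_stream__(self):
        return "stream-capsule"


result = from_arrow(
    FakeStreamProducer(),
    import_stream=lambda capsule: ("imported", capsule),
    import_array=None,
    import_schema=None,
)
print(result)  # ('imported', 'stream-capsule')
```

This mirrors the `from_dlpack` pattern: one user-facing function, with the protocol-method check deciding which lower-level import path to take.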
