pitrou commented on code in PR #39199:
URL: https://github.com/apache/arrow/pull/39199#discussion_r1423949486

##########
docs/source/python/extending_types.rst:
##########
@@ -21,6 +21,38 @@
 Extending pyarrow
 =================
+Controlling conversion to (Py)Arrow with the PyCapsule Interface
+----------------------------------------------------------------
+
+The :ref:`Arrow C data interface <c-data-interface>` allows moving Arrow data between
+different implementations of Arrow. This is a generic, cross-language interface not
+specific to Python, but for Python libraries this interface is extended with a Python
+specific layer: :ref:`arrow-pycapsule-interface`.
+
+This Python interface ensures that different libraries that support the C Data interface
+can recognize each other objects and export Arrow data structures in a standard way.
+
+If you have a library providing data structures that hold Arrow-compatible data
+under the hood, you can implement the following dunder methods on those objects:

Review Comment:
   I would not use the term "dunder", which is an obscure bit of jargon

   ```suggestion
   under the hood, you can implement the following methods on those objects:
   ```

##########
docs/source/python/extending_types.rst:
##########
@@ -21,6 +21,38 @@
 Extending pyarrow
 =================
+Controlling conversion to (Py)Arrow with the PyCapsule Interface
+----------------------------------------------------------------
+
+The :ref:`Arrow C data interface <c-data-interface>` allows moving Arrow data between
+different implementations of Arrow. This is a generic, cross-language interface not
+specific to Python, but for Python libraries this interface is extended with a Python
+specific layer: :ref:`arrow-pycapsule-interface`.
+
+This Python interface ensures that different libraries that support the C Data interface
+can recognize each other objects and export Arrow data structures in a standard way.
+
+If you have a library providing data structures that hold Arrow-compatible data
+under the hood, you can implement the following dunder methods on those objects:
+
+- ``__arrow_c_schema__`` for schema or type-like objects.
+- ``__arrow_c_array__`` for arrays and record batches (contiguous tables).
+- ``__arrow_c_stream__`` for chunked tables or streams of data.
+
+Those methods return `PyCapsule <https://docs.python.org/3/c-api/capsule.html>`__
+objects, and more details on the exact semantics can be found in the
+:ref:`specification <arrow-pycapsule-interface>`.
+
+When your data structures have those dunder methods defined, the pyarrow constructors
+(such as :func:`pyarrow.array` or :func:`pyarrow.table`) will recognize those objects as
+supporting this protocol, and convert them to PyArrow data structures zero-copy. And the
+same can be true for any other library supporting this protocol on ingesting data.
+
+Similarly, if your library has functions that accept user-provided data, you can add
+support for this protocol by checking for the presence of those dunder methods, and

Review Comment:
   ```suggestion
   support for this protocol by checking for the presence of those methods, and
   ```

##########
docs/source/python/extending_types.rst:
##########
@@ -21,6 +21,38 @@
 Extending pyarrow
 =================
+Controlling conversion to (Py)Arrow with the PyCapsule Interface
+----------------------------------------------------------------
+
+The :ref:`Arrow C data interface <c-data-interface>` allows moving Arrow data between
+different implementations of Arrow. This is a generic, cross-language interface not
+specific to Python, but for Python libraries this interface is extended with a Python
+specific layer: :ref:`arrow-pycapsule-interface`.
+
+This Python interface ensures that different libraries that support the C Data interface
+can recognize each other objects and export Arrow data structures in a standard way.
+
+If you have a library providing data structures that hold Arrow-compatible data
+under the hood, you can implement the following dunder methods on those objects:
+
+- ``__arrow_c_schema__`` for schema or type-like objects.
+- ``__arrow_c_array__`` for arrays and record batches (contiguous tables).
+- ``__arrow_c_stream__`` for chunked tables or streams of data.
+
+Those methods return `PyCapsule <https://docs.python.org/3/c-api/capsule.html>`__
+objects, and more details on the exact semantics can be found in the
+:ref:`specification <arrow-pycapsule-interface>`.
+
+When your data structures have those dunder methods defined, the pyarrow constructors

Review Comment:
   ```suggestion
   When your data structures have those methods defined, the PyArrow constructors
   ```
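
The quoted hunk's paragraph on accepting user-provided data ("checking for the presence of those methods") could be illustrated with something like the sketch below. It uses only the standard library and the three method names listed in the diff; `arrow_export_method`, `ARROW_PYCAPSULE_METHODS`, and `FakeStreamProducer` are hypothetical names made up for this illustration, and the order in which the methods are probed is an assumption, not something the PR or the specification prescribes.

```python
# Hypothetical helper for illustration; not part of pyarrow or of the PR.
# The three names below are the protocol methods listed in the quoted diff.
ARROW_PYCAPSULE_METHODS = (
    "__arrow_c_stream__",  # chunked tables or streams of data
    "__arrow_c_array__",   # arrays and record batches
    "__arrow_c_schema__",  # schema or type-like objects
)


def arrow_export_method(obj):
    """Return the name of the first PyCapsule export method found on obj.

    Returns None when obj exposes none of the protocol methods. The probe
    order (stream, array, schema) is an arbitrary choice for this sketch.
    """
    for name in ARROW_PYCAPSULE_METHODS:
        if callable(getattr(obj, name, None)):
            return name
    return None


class FakeStreamProducer:
    """Stand-in producer: a real one would return a PyCapsule here."""

    def __arrow_c_stream__(self, requested_schema=None):
        raise NotImplementedError("illustration only")
```

A consuming function might call `arrow_export_method(data)` first and fall back to its existing conversion path when the result is `None`; what it does with the returned capsule is governed by the specification and is left out here.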
