kylebarron commented on code in PR #1031:
URL: 
https://github.com/apache/datafusion-python/pull/1031#discussion_r1971890201


##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function

Review Comment:
   nit: `it's` should be `its`



##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function
+Interface (FFI). The advantage of using the FFI is that it enables you to 
write data structures
+and functions that have a stable ABI. The allows you to use Rust code with C, 
Python, and
+other languages. In fact, the `PyO3 <https://pyo3.rs>`_ library uses the FFI 
to share data
+and functions between Python and Rust.
+
+The approach we are taking in the DataFusion in Python project is to 
incrementally expose
+more portions of the DataFusion project via FFI interfaces. This allows users 
to write Rust
+code that does **not** require the ``datafusion-python`` crate as a 
dependency, expose their
+code in Python via PyO3, and have it interact with the DataFusion Python 
package.
+
+Early adopters of this approach include `delta-rs 
<https://delta-io.github.io/delta-rs/>`_
+who has adapted their Table Provider for use in ```datafusion-python``` with 
only a few lines
+of code. Also, the DataFusion Python project uses the existing definitions from
+`Apache Arrow CStream Interface 
<https://arrow.apache.org/docs/format/CStreamInterface.html>`_
+to support importing **and** exporting tables. Any Python package that 
supports reading
+the Arrow C Stream interface can work with DataFusion Python out of the box! 
You can read
+more about working with Arrow sources in the :ref:`Data Sources 
<user_guide_data_sources>`
+page.
+
+To learn more about the Foreign Function Interface in Rust, the
+`Rustonomicon <https://doc.rust-lang.org/nomicon/ffi.html>`_ is a good 
resource.
+
+Inspiration from Arrow
+----------------------
+
+DataFusion is built upon `Apache Arrow <https://arrow.apache.org/>`_. The 
canonical Python
+Arrow implementation, `pyarrow 
<https://arrow.apache.org/docs/python/index.html>`_ provides
+an excellent way to share Arrow data between Python projects without 
performing any copy
+operations on the data. They do this by using a well defined set of 
interfaces. You can
+find the details about their stream interface
+`here <https://arrow.apache.org/docs/format/CStreamInterface.html>`_. The
+`Rust Arrow Implementation <https://github.com/apache/arrow-rs>`_ also 
supports these
+``C`` style definitions via the Foreign Function Interface.
+
+In addition to using these interfaces to transfer Arrow data between 
libraries, ``pyarrow``
+goes one step further to make sharing the interfaces easier in Python. They do 
this
+by exposing PyCapsules that contain the expected functionality.

Review Comment:
   It would be nice to note that this isn't just a pyarrow implementation 
detail; it's a full _specification_ for Python, of which pyarrow is just one 
implementor 
https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html



##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function
+Interface (FFI). The advantage of using the FFI is that it enables you to 
write data structures
+and functions that have a stable ABI. The allows you to use Rust code with C, 
Python, and
+other languages. In fact, the `PyO3 <https://pyo3.rs>`_ library uses the FFI 
to share data
+and functions between Python and Rust.
+
+The approach we are taking in the DataFusion in Python project is to 
incrementally expose
+more portions of the DataFusion project via FFI interfaces. This allows users 
to write Rust
+code that does **not** require the ``datafusion-python`` crate as a 
dependency, expose their
+code in Python via PyO3, and have it interact with the DataFusion Python 
package.
+
+Early adopters of this approach include `delta-rs 
<https://delta-io.github.io/delta-rs/>`_
+who has adapted their Table Provider for use in ```datafusion-python``` with 
only a few lines
+of code. Also, the DataFusion Python project uses the existing definitions from
+`Apache Arrow CStream Interface 
<https://arrow.apache.org/docs/format/CStreamInterface.html>`_
+to support importing **and** exporting tables. Any Python package that 
supports reading
+the Arrow C Stream interface can work with DataFusion Python out of the box! 
You can read
+more about working with Arrow sources in the :ref:`Data Sources 
<user_guide_data_sources>`
+page.
+
+To learn more about the Foreign Function Interface in Rust, the
+`Rustonomicon <https://doc.rust-lang.org/nomicon/ffi.html>`_ is a good 
resource.
+
+Inspiration from Arrow
+----------------------
+
+DataFusion is built upon `Apache Arrow <https://arrow.apache.org/>`_. The 
canonical Python
+Arrow implementation, `pyarrow 
<https://arrow.apache.org/docs/python/index.html>`_ provides
+an excellent way to share Arrow data between Python projects without 
performing any copy
+operations on the data. They do this by using a well defined set of 
interfaces. You can
+find the details about their stream interface
+`here <https://arrow.apache.org/docs/format/CStreamInterface.html>`_. The
+`Rust Arrow Implementation <https://github.com/apache/arrow-rs>`_ also 
supports these
+``C`` style definitions via the Foreign Function Interface.
+
+In addition to using these interfaces to transfer Arrow data between 
libraries, ``pyarrow``
+goes one step further to make sharing the interfaces easier in Python. They do 
this
+by exposing PyCapsules that contain the expected functionality.
+
+You can learn more about PyCapsules from the official
+`Python online documentation <https://docs.python.org/3/c-api/capsule.html>`_. 
PyCapsules
+have excellent support in PyO3 already. The
+`PyO3 online documentation 
<https://pyo3.rs/main/doc/pyo3/types/struct.pycapsule>`_ is a good source
+for more details on using PyCapsules in Rust.
+
+Two lessons we leverage from the Arrow project in DataFusion Python are:
+
+- We reuse the existing Arrow FFI functionality wherever possible.
+- We expose PyCapsules that contain a FFI stable struct.
+
+Implementation Details
+----------------------
+
+The bulk of the code necessary to perform our FFI operations is in the 
upstream 
+`DataFusion <https://datafusion.apache.org/>`_ core repository. You can review 
the code and
+documentation in the `datafusion-ffi`_ crate.
+
+Our FFI implementation is narrowly focused at sharing data and functions with 
Rust backed
+libraries. This allows us to use the `abi_stable crate 
<https://crates.io/crates/abi_stable>`_.
+This is an excellent crate that allows for easy conversion between Rust native 
types
+and FFI-safe alternatives. For example, if you needed to pass a 
``Vec<String>`` via FFI,
+you can simply convert it to a ``RVec<RString>`` in an intuitive manner. It 
also supports
+features like ``RResult`` and ``ROption`` that do not have an obvious 
translation to a
+C equivalent.
+
+The `datafusion-ffi`_ crate has been designed to make it easy to convert from 
DataFusion
+traits into their FFI counterparts. For example, if you have defined a custom
+`TableProvider 
<https://docs.rs/datafusion/45.0.0/datafusion/catalog/trait.TableProvider.html>`_
+and you want to create a sharable FFI counterpart, you could write:
+
+.. code-block:: rust
+
+    let my_provider = MyTableProvider::default();
+    let ffi_provider = FFI_TableProvider::new(Arc::new(my_provider), false, 
None);
+
+If you were interfacing with a library that provided the above 
``FFI_TableProvider`` and
+you needed to turn it back into an ``TableProvider``, you can turn it into a
+``ForeignTableProvider`` with implements the ``TableProvider`` trait.
+
+.. code-block:: rust
+
+    let foreign_provider: ForeignTableProvider = ffi_provider.into();
+
+If you review the code in `datafusion-ffi`_ you will find that each of the 
traits we share
+across the boundary has two portions, one with a ``FFI_`` prefix and one with 
a ``Foreign``
+prefix. This is used to distinguish which side of the FFI boundary that struct 
is
+designed to be used on. The structures with the ``FFI_`` prefix are to be used 
on the
+**provider** of the structure. In the example we're showing, this means the 
code that has
+written the underlying ``TableProvider`` implementation to access your custom 
data source.
+The structures with the ``Foreign`` prefix are to be used by the receiver. In 
this case,
+it is the ``datafusion-python`` library.
+
+In order to share these FFI structures, we need to wrap them in some kind of 
Python object
+that can be used to interface from one package to another. As described in the 
above
+section on our inspiration from Arrow, we use ``PyCapsule``. We can create a 
``PyCapsule``
+for our provider thusly:
+
+.. code-block:: rust
+
+    let name = CString::new("datafusion_table_provider")?;
+    let my_capsule = PyCapsule::new_bound(py, provider, Some(name))?;

Review Comment:
   I think you really want to use 
[`PyCapsule::new_with_destructor`](https://docs.rs/pyo3/latest/pyo3/types/struct.PyCapsule.html#method.new_with_destructor)
 so that you don't get memory leaks if the capsule isn't used.
   
   Or maybe `PyCapsule::new` will automatically call the `Drop` impl and you 
don't need to manually call `new_with_destructor`? I can't remember



##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function
+Interface (FFI). The advantage of using the FFI is that it enables you to 
write data structures
+and functions that have a stable ABI. The allows you to use Rust code with C, 
Python, and
+other languages. In fact, the `PyO3 <https://pyo3.rs>`_ library uses the FFI 
to share data
+and functions between Python and Rust.
+
+The approach we are taking in the DataFusion in Python project is to 
incrementally expose
+more portions of the DataFusion project via FFI interfaces. This allows users 
to write Rust
+code that does **not** require the ``datafusion-python`` crate as a 
dependency, expose their
+code in Python via PyO3, and have it interact with the DataFusion Python 
package.
+
+Early adopters of this approach include `delta-rs 
<https://delta-io.github.io/delta-rs/>`_
+who has adapted their Table Provider for use in ```datafusion-python``` with 
only a few lines
+of code. Also, the DataFusion Python project uses the existing definitions from
+`Apache Arrow CStream Interface 
<https://arrow.apache.org/docs/format/CStreamInterface.html>`_
+to support importing **and** exporting tables. Any Python package that 
supports reading
+the Arrow C Stream interface can work with DataFusion Python out of the box! 
You can read
+more about working with Arrow sources in the :ref:`Data Sources 
<user_guide_data_sources>`
+page.
+
+To learn more about the Foreign Function Interface in Rust, the
+`Rustonomicon <https://doc.rust-lang.org/nomicon/ffi.html>`_ is a good 
resource.
+
+Inspiration from Arrow
+----------------------
+
+DataFusion is built upon `Apache Arrow <https://arrow.apache.org/>`_. The 
canonical Python
+Arrow implementation, `pyarrow 
<https://arrow.apache.org/docs/python/index.html>`_ provides
+an excellent way to share Arrow data between Python projects without 
performing any copy
+operations on the data. They do this by using a well defined set of 
interfaces. You can
+find the details about their stream interface
+`here <https://arrow.apache.org/docs/format/CStreamInterface.html>`_. The
+`Rust Arrow Implementation <https://github.com/apache/arrow-rs>`_ also 
supports these
+``C`` style definitions via the Foreign Function Interface.
+
+In addition to using these interfaces to transfer Arrow data between 
libraries, ``pyarrow``
+goes one step further to make sharing the interfaces easier in Python. They do 
this
+by exposing PyCapsules that contain the expected functionality.
+
+You can learn more about PyCapsules from the official
+`Python online documentation <https://docs.python.org/3/c-api/capsule.html>`_. 
PyCapsules
+have excellent support in PyO3 already. The
+`PyO3 online documentation 
<https://pyo3.rs/main/doc/pyo3/types/struct.pycapsule>`_ is a good source
+for more details on using PyCapsules in Rust.
+
+Two lessons we leverage from the Arrow project in DataFusion Python are:
+
+- We reuse the existing Arrow FFI functionality wherever possible.
+- We expose PyCapsules that contain a FFI stable struct.
+
+Implementation Details
+----------------------
+
+The bulk of the code necessary to perform our FFI operations is in the 
upstream 
+`DataFusion <https://datafusion.apache.org/>`_ core repository. You can review 
the code and
+documentation in the `datafusion-ffi`_ crate.
+
+Our FFI implementation is narrowly focused at sharing data and functions with 
Rust backed
+libraries. This allows us to use the `abi_stable crate 
<https://crates.io/crates/abi_stable>`_.
+This is an excellent crate that allows for easy conversion between Rust native 
types
+and FFI-safe alternatives. For example, if you needed to pass a 
``Vec<String>`` via FFI,
+you can simply convert it to a ``RVec<RString>`` in an intuitive manner. It 
also supports
+features like ``RResult`` and ``ROption`` that do not have an obvious 
translation to a
+C equivalent.
+
+The `datafusion-ffi`_ crate has been designed to make it easy to convert from 
DataFusion
+traits into their FFI counterparts. For example, if you have defined a custom
+`TableProvider 
<https://docs.rs/datafusion/45.0.0/datafusion/catalog/trait.TableProvider.html>`_
+and you want to create a sharable FFI counterpart, you could write:
+
+.. code-block:: rust
+
+    let my_provider = MyTableProvider::default();
+    let ffi_provider = FFI_TableProvider::new(Arc::new(my_provider), false, 
None);
+
+If you were interfacing with a library that provided the above 
``FFI_TableProvider`` and
+you needed to turn it back into an ``TableProvider``, you can turn it into a

Review Comment:
   `into an` should be `into a`



##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function
+Interface (FFI). The advantage of using the FFI is that it enables you to 
write data structures
+and functions that have a stable ABI. The allows you to use Rust code with C, 
Python, and
+other languages. In fact, the `PyO3 <https://pyo3.rs>`_ library uses the FFI 
to share data
+and functions between Python and Rust.
+
+The approach we are taking in the DataFusion in Python project is to 
incrementally expose
+more portions of the DataFusion project via FFI interfaces. This allows users 
to write Rust
+code that does **not** require the ``datafusion-python`` crate as a 
dependency, expose their
+code in Python via PyO3, and have it interact with the DataFusion Python 
package.
+
+Early adopters of this approach include `delta-rs 
<https://delta-io.github.io/delta-rs/>`_
+who has adapted their Table Provider for use in ```datafusion-python``` with 
only a few lines
+of code. Also, the DataFusion Python project uses the existing definitions from
+`Apache Arrow CStream Interface 
<https://arrow.apache.org/docs/format/CStreamInterface.html>`_
+to support importing **and** exporting tables. Any Python package that 
supports reading
+the Arrow C Stream interface can work with DataFusion Python out of the box! 
You can read
+more about working with Arrow sources in the :ref:`Data Sources 
<user_guide_data_sources>`
+page.
+
+To learn more about the Foreign Function Interface in Rust, the
+`Rustonomicon <https://doc.rust-lang.org/nomicon/ffi.html>`_ is a good 
resource.
+
+Inspiration from Arrow
+----------------------
+
+DataFusion is built upon `Apache Arrow <https://arrow.apache.org/>`_. The 
canonical Python
+Arrow implementation, `pyarrow 
<https://arrow.apache.org/docs/python/index.html>`_ provides
+an excellent way to share Arrow data between Python projects without 
performing any copy
+operations on the data. They do this by using a well defined set of 
interfaces. You can
+find the details about their stream interface
+`here <https://arrow.apache.org/docs/format/CStreamInterface.html>`_. The
+`Rust Arrow Implementation <https://github.com/apache/arrow-rs>`_ also 
supports these
+``C`` style definitions via the Foreign Function Interface.
+
+In addition to using these interfaces to transfer Arrow data between 
libraries, ``pyarrow``
+goes one step further to make sharing the interfaces easier in Python. They do 
this
+by exposing PyCapsules that contain the expected functionality.

Review Comment:
   In particular, pyarrow didn't used to use capsules; they use to use raw 
integers in the `_export_to_c` and `_import_from_c` methods, where those 
integers refer to data pointers. 
   
   But this can be unsafe/lead to memory leaks if those pointers aren't always 
consumed, because then the Arrow release callback may never be called.
   
   The added benefit and safety of the capsule is specifically because the 
capsule allows you to **define a destructor**, so that the destructor will be 
called by the Python GC  if the capsule is never used. So you can't get memory 
leaks when using capsules.



##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function
+Interface (FFI). The advantage of using the FFI is that it enables you to 
write data structures
+and functions that have a stable ABI. The allows you to use Rust code with C, 
Python, and
+other languages. In fact, the `PyO3 <https://pyo3.rs>`_ library uses the FFI 
to share data
+and functions between Python and Rust.
+
+The approach we are taking in the DataFusion in Python project is to 
incrementally expose
+more portions of the DataFusion project via FFI interfaces. This allows users 
to write Rust
+code that does **not** require the ``datafusion-python`` crate as a 
dependency, expose their
+code in Python via PyO3, and have it interact with the DataFusion Python 
package.
+
+Early adopters of this approach include `delta-rs 
<https://delta-io.github.io/delta-rs/>`_
+who has adapted their Table Provider for use in ```datafusion-python``` with 
only a few lines
+of code. Also, the DataFusion Python project uses the existing definitions from
+`Apache Arrow CStream Interface 
<https://arrow.apache.org/docs/format/CStreamInterface.html>`_
+to support importing **and** exporting tables. Any Python package that 
supports reading
+the Arrow C Stream interface can work with DataFusion Python out of the box! 
You can read
+more about working with Arrow sources in the :ref:`Data Sources 
<user_guide_data_sources>`
+page.
+
+To learn more about the Foreign Function Interface in Rust, the
+`Rustonomicon <https://doc.rust-lang.org/nomicon/ffi.html>`_ is a good 
resource.
+
+Inspiration from Arrow
+----------------------
+
+DataFusion is built upon `Apache Arrow <https://arrow.apache.org/>`_. The 
canonical Python
+Arrow implementation, `pyarrow 
<https://arrow.apache.org/docs/python/index.html>`_ provides
+an excellent way to share Arrow data between Python projects without 
performing any copy
+operations on the data. They do this by using a well defined set of 
interfaces. You can
+find the details about their stream interface
+`here <https://arrow.apache.org/docs/format/CStreamInterface.html>`_. The
+`Rust Arrow Implementation <https://github.com/apache/arrow-rs>`_ also 
supports these
+``C`` style definitions via the Foreign Function Interface.
+
+In addition to using these interfaces to transfer Arrow data between 
libraries, ``pyarrow``
+goes one step further to make sharing the interfaces easier in Python. They do 
this
+by exposing PyCapsules that contain the expected functionality.
+
+You can learn more about PyCapsules from the official
+`Python online documentation <https://docs.python.org/3/c-api/capsule.html>`_. 
PyCapsules
+have excellent support in PyO3 already. The
+`PyO3 online documentation 
<https://pyo3.rs/main/doc/pyo3/types/struct.pycapsule>`_ is a good source
+for more details on using PyCapsules in Rust.
+
+Two lessons we leverage from the Arrow project in DataFusion Python are:
+
+- We reuse the existing Arrow FFI functionality wherever possible.
+- We expose PyCapsules that contain a FFI stable struct.
+
+Implementation Details
+----------------------
+
+The bulk of the code necessary to perform our FFI operations is in the 
upstream 
+`DataFusion <https://datafusion.apache.org/>`_ core repository. You can review 
the code and
+documentation in the `datafusion-ffi`_ crate.
+
+Our FFI implementation is narrowly focused at sharing data and functions with 
Rust backed
+libraries. This allows us to use the `abi_stable crate 
<https://crates.io/crates/abi_stable>`_.
+This is an excellent crate that allows for easy conversion between Rust native 
types
+and FFI-safe alternatives. For example, if you needed to pass a 
``Vec<String>`` via FFI,
+you can simply convert it to a ``RVec<RString>`` in an intuitive manner. It 
also supports
+features like ``RResult`` and ``ROption`` that do not have an obvious 
translation to a
+C equivalent.
+
+The `datafusion-ffi`_ crate has been designed to make it easy to convert from 
DataFusion
+traits into their FFI counterparts. For example, if you have defined a custom
+`TableProvider 
<https://docs.rs/datafusion/45.0.0/datafusion/catalog/trait.TableProvider.html>`_
+and you want to create a sharable FFI counterpart, you could write:
+
+.. code-block:: rust
+
+    let my_provider = MyTableProvider::default();
+    let ffi_provider = FFI_TableProvider::new(Arc::new(my_provider), false, 
None);
+
+If you were interfacing with a library that provided the above 
``FFI_TableProvider`` and
+you needed to turn it back into an ``TableProvider``, you can turn it into a
+``ForeignTableProvider`` with implements the ``TableProvider`` trait.
+
+.. code-block:: rust
+
+    let foreign_provider: ForeignTableProvider = ffi_provider.into();
+
+If you review the code in `datafusion-ffi`_ you will find that each of the 
traits we share
+across the boundary has two portions, one with a ``FFI_`` prefix and one with 
a ``Foreign``
+prefix. This is used to distinguish which side of the FFI boundary that struct 
is
+designed to be used on. The structures with the ``FFI_`` prefix are to be used 
on the
+**provider** of the structure. In the example we're showing, this means the 
code that has
+written the underlying ``TableProvider`` implementation to access your custom 
data source.
+The structures with the ``Foreign`` prefix are to be used by the receiver. In 
this case,
+it is the ``datafusion-python`` library.
+
+In order to share these FFI structures, we need to wrap them in some kind of 
Python object
+that can be used to interface from one package to another. As described in the 
above
+section on our inspiration from Arrow, we use ``PyCapsule``. We can create a 
``PyCapsule``
+for our provider thusly:
+
+.. code-block:: rust
+
+    let name = CString::new("datafusion_table_provider")?;
+    let my_capsule = PyCapsule::new_bound(py, provider, Some(name))?;
+
+On the receiving side, turn this pycapsule object into the 
``FFI_TableProvider``, which
+can then be turned into a ``ForeignTableProvider`` the associated code is:
+
+.. code-block:: rust
+
+    let capsule = capsule.downcast::<PyCapsule>()?;
+    let provider = unsafe { capsule.reference::<FFI_TableProvider>() };

Review Comment:
   IMO you should always check the name of the capsule before opening it. 
https://github.com/kylebarron/arro3/blob/a041de6bae703a2c8fa3e3b31cd863d4801101d2/pyo3-arrow/src/ffi/from_python/utils.rs#L48
   
   Your FFI interface should also define a name that all implementations should 
attach to the pycapsule.



##########
docs/source/contributor-guide/ffi.rst:
##########
@@ -0,0 +1,212 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Python Extensions
+=================
+
+The DataFusion in Python project is designed to allow users to extend its 
functionality in a few core
+areas. Ideally many users would like to package their extensions as a Python 
package and easily
+integrate that package with this project. This page serves to describe some of 
the challenges we face
+when doing these integrations and the approach our project uses.
+
+The Primary Issue
+-----------------
+
+Suppose you wish to use DataFusion and you have a custom data source that can 
produce tables that
+can then be queried against, similar to how you can register a :ref:`CSV 
<io_csv>` or
+:ref:`Parquet <io_parquet>` file. In DataFusion terminology, you likely want 
to implement a 
+:ref:`Custom Table Provider <io_custom_table_provider>`. In an effort to make 
your data source
+as performant as possible and to utilize the features of DataFusion, you may 
decide to write
+your source in Rust and then expose it through `PyO3 <https://pyo3.rs>`_ as a 
Python library.
+
+At first glance, it may appear the best way to do this is to add the 
``datafusion-python``
+crate as a dependency, provide a ``PyTable``, and then to register it with the 
+``SessionContext``. Unfortunately, this will not work.
+
+When you produce your code as a Python library and it needs to interact with 
the DataFusion
+library, at the lowest level they communicate through an Application Binary 
Interface (ABI).
+The acronym sounds similar to API (Application Programming Interface), but it 
is distinctly
+different.
+
+The ABI sets the standard for how these libraries can share data and functions 
between each
+other. One of the key differences between Rust and other programming languages 
is that Rust
+does not have a stable ABI. What this means in practice is that if you compile 
a Rust library
+with one version of the ``rustc`` compiler and I compile another library to 
interface with it
+but I use a different version of the compiler, there is no guarantee the 
interface will be
+the same.
+
+In practice, this means that a Python library built with ``datafusion-python`` 
as a Rust
+dependency will generally **not** be compatible with the DataFusion Python 
package, even
+if they reference the same version of ``datafusion-python``. If you attempt to 
do this, it may
+work on your local computer if you have built both packages with the same 
optimizations.
+This can sometimes lead to a false expectation that the code will work, but it 
frequently
+breaks the moment you try to use your package against the released packages.
+
+You can find more information about the Rust ABI in their
+`online documentation <https://doc.rust-lang.org/reference/abi.html>`_.
+
+The FFI Approach
+----------------
+
+Rust supports interacting with other programming languages through it's 
Foreign Function
+Interface (FFI). The advantage of using the FFI is that it enables you to 
write data structures
+and functions that have a stable ABI. The allows you to use Rust code with C, 
Python, and
+other languages. In fact, the `PyO3 <https://pyo3.rs>`_ library uses the FFI 
to share data
+and functions between Python and Rust.
+
+The approach we are taking in the DataFusion in Python project is to 
incrementally expose
+more portions of the DataFusion project via FFI interfaces. This allows users 
to write Rust
+code that does **not** require the ``datafusion-python`` crate as a 
dependency, expose their
+code in Python via PyO3, and have it interact with the DataFusion Python 
package.
+
+Early adopters of this approach include `delta-rs 
<https://delta-io.github.io/delta-rs/>`_
+who has adapted their Table Provider for use in ```datafusion-python``` with 
only a few lines
+of code. Also, the DataFusion Python project uses the existing definitions from
+`Apache Arrow CStream Interface 
<https://arrow.apache.org/docs/format/CStreamInterface.html>`_
+to support importing **and** exporting tables. Any Python package that 
supports reading
+the Arrow C Stream interface can work with DataFusion Python out of the box! 
You can read
+more about working with Arrow sources in the :ref:`Data Sources 
<user_guide_data_sources>`
+page.
+
+To learn more about the Foreign Function Interface in Rust, the
+`Rustonomicon <https://doc.rust-lang.org/nomicon/ffi.html>`_ is a good 
resource.
+
+Inspiration from Arrow
+----------------------
+
+DataFusion is built upon `Apache Arrow <https://arrow.apache.org/>`_. The 
canonical Python
+Arrow implementation, `pyarrow 
<https://arrow.apache.org/docs/python/index.html>`_ provides
+an excellent way to share Arrow data between Python projects without 
performing any copy
+operations on the data. They do this by using a well defined set of 
interfaces. You can
+find the details about their stream interface
+`here <https://arrow.apache.org/docs/format/CStreamInterface.html>`_. The
+`Rust Arrow Implementation <https://github.com/apache/arrow-rs>`_ also 
supports these
+``C`` style definitions via the Foreign Function Interface.
+
+In addition to using these interfaces to transfer Arrow data between 
libraries, ``pyarrow``
+goes one step further to make sharing the interfaces easier in Python. They do 
this
+by exposing PyCapsules that contain the expected functionality.
+
+You can learn more about PyCapsules from the official
+`Python online documentation <https://docs.python.org/3/c-api/capsule.html>`_. 
PyCapsules
+have excellent support in PyO3 already. The
+`PyO3 online documentation 
<https://pyo3.rs/main/doc/pyo3/types/struct.pycapsule>`_ is a good source
+for more details on using PyCapsules in Rust.
+
+Two lessons we leverage from the Arrow project in DataFusion Python are:
+
+- We reuse the existing Arrow FFI functionality wherever possible.
+- We expose PyCapsules that contain a FFI stable struct.
+
+Implementation Details
+----------------------
+
+The bulk of the code necessary to perform our FFI operations is in the 
upstream 
+`DataFusion <https://datafusion.apache.org/>`_ core repository. You can review 
the code and
+documentation in the `datafusion-ffi`_ crate.
+
+Our FFI implementation is narrowly focused at sharing data and functions with 
Rust backed
+libraries. This allows us to use the `abi_stable crate 
<https://crates.io/crates/abi_stable>`_.
+This is an excellent crate that allows for easy conversion between Rust native 
types
+and FFI-safe alternatives. For example, if you needed to pass a 
``Vec<String>`` via FFI,
+you can simply convert it to a ``RVec<RString>`` in an intuitive manner. It 
also supports
+features like ``RResult`` and ``ROption`` that do not have an obvious 
translation to a
+C equivalent.
+
+The `datafusion-ffi`_ crate has been designed to make it easy to convert from 
DataFusion
+traits into their FFI counterparts. For example, if you have defined a custom
+`TableProvider 
<https://docs.rs/datafusion/45.0.0/datafusion/catalog/trait.TableProvider.html>`_
+and you want to create a sharable FFI counterpart, you could write:
+
+.. code-block:: rust
+
+    let my_provider = MyTableProvider::default();
+    let ffi_provider = FFI_TableProvider::new(Arc::new(my_provider), false, 
None);
+
+If you were interfacing with a library that provided the above 
``FFI_TableProvider`` and
+you needed to turn it back into an ``TableProvider``, you can turn it into a
+``ForeignTableProvider`` with implements the ``TableProvider`` trait.
+
+.. code-block:: rust
+
+    let foreign_provider: ForeignTableProvider = ffi_provider.into();
+
+If you review the code in `datafusion-ffi`_ you will find that each of the 
traits we share
+across the boundary has two portions, one with a ``FFI_`` prefix and one with 
a ``Foreign``
+prefix. This is used to distinguish which side of the FFI boundary that struct 
is
+designed to be used on. The structures with the ``FFI_`` prefix are to be used 
on the
+**provider** of the structure. In the example we're showing, this means the 
code that has
+written the underlying ``TableProvider`` implementation to access your custom 
data source.
+The structures with the ``Foreign`` prefix are to be used by the receiver. In 
this case,
+it is the ``datafusion-python`` library.
+
+In order to share these FFI structures, we need to wrap them in some kind of 
Python object
+that can be used to interface from one package to another. As described in the 
above
+section on our inspiration from Arrow, we use ``PyCapsule``. We can create a 
``PyCapsule``
+for our provider thusly:
+
+.. code-block:: rust
+
+    let name = CString::new("datafusion_table_provider")?;
+    let my_capsule = PyCapsule::new_bound(py, provider, Some(name))?;
+
+On the receiving side, turn this pycapsule object into the 
``FFI_TableProvider``, which
+can then be turned into a ``ForeignTableProvider`` the associated code is:
+
+.. code-block:: rust
+
+    let capsule = capsule.downcast::<PyCapsule>()?;
+    let provider = unsafe { capsule.reference::<FFI_TableProvider>() };
+
+By convention the ``datafusion-python`` library expects a Python object that 
has a
+``TableProvider`` PyCapsule to have this capsule accessible by calling a 
function named
+``__datafusion_table_provider__``. You can see a complete working example of 
how to
+share a ``TableProvider`` from one python library to DataFusion Python in the
+`repository examples folder 
<https://github.com/apache/datafusion-python/tree/main/examples/ffi-table-provider>`_.
+
+This section has been written using ``TableProvider`` as an example. It is the 
first
+extension that has been written using this approach and the most thoroughly 
implemented.
+As we continue to expose more of the DataFusion features, we intend to follow 
this same
+design pattern.
+
+Alternative Approach
+--------------------
+
+Suppose you needed to expose some other features of DataFusion and you could 
not wait
+for the upstream repository to implement the FFI approach we describe. In this 
case
+you decide to create your dependency on the ``datafusion-python`` crate 
instead.
+
+As we discussed, this is not guaranteed to work across different compiler 
versions and
+optimization levels. If you wish to go down this route, there are two 
approaches we
+have identified you can use.

Review Comment:
   IMO I wouldn't even suggest this because this is explicitly recommended 
against by the pyo3 maintainers: https://github.com/PyO3/pyo3/issues/1444



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to