pitrou commented on a change in pull request #11679:
URL: https://github.com/apache/arrow/pull/11679#discussion_r786142278



##########
File path: docs/source/python/integration.rst
##########
@@ -0,0 +1,39 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. _integration:
+
+********************
+PyArrow Integrations
+********************
+
+Arrow is designed to be both a framework and an interchange format.

Review comment:
       Add a link to the format definition?

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can

Review comment:
       ```suggestion
   We could save such a function in a ``addthree.R`` file so that we can
   ```

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions
+from Python using the 
+`rpy2 <https://rpy2.github.io/doc/latest/html/index.html>`_ library which
+enables a R runtime within the Python interpreter.
+
+``rpy2`` can be installed using ``pip`` like most python libraries
+
+.. code-block:: bash
+
+    $ pip install rpy2
+
+The most basic thing we can do with our ``addthree`` function is to
+invoke it from Python with a number and see how it will return the result.
+
+To do so we can create an ``addthree.py`` file which uses ``rpy2`` to
+import the ``addthree`` function from ``addthree.R`` file and invoke it:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+
+    # Load the addthree.R file
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    # Get a reference to the addthree function
+    addthree = robjects.r["addthree"]
+
+    # Invoke the function
+    r = addthree(3)
+
+    # Access the returned value
+    value = r[0]
+    print(value)
+
+Running the ``addthree.py`` file will show how our Python code is able
+to access the ``R`` function and print the expected result:
+
+.. code-block:: bash
+
+    $ python addthree.py 
+    6
+
+If instead of passing around basic data types we want to pass around
+Arrow Arrays, we can do so relying on the
+`rpy2-arrow <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_ 
+module which implements ``rpy2`` support for Arrow types.
+
+``rpy2-arrow`` can be installed through ``pip``:
+
+.. code-block:: bash
+
+    $ pip install rpy2-arrow
+
+``rpy2-arrow`` implements converters from PyArrow objects to R Arrow objects,
+this is done without occurring into any data copy cost as it relies on the
+C Data interface.
+
+To pass to ``addthree`` a PyArrow array our ``addthree.py`` needs to be modified
+to enable ``rpy2-arrow`` converters and then pass the PyArrow array:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+    from rpy2_arrow.pyarrow_rarrow import (rarrow_to_py_array,
+                                           converter as arrowconverter)
+    from rpy2.robjects.conversion import localconverter
+
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    addthree = robjects.r["addthree"]
+
+    import pyarrow
+
+    array = pyarrow.array((1, 2, 3))
+
+    # Enable rpy2-arrow converter so that R can receive the array.
+    with localconverter(arrowconverter):
+        r_result = addthree(array)
+
+    # The result of the R function will be an R Environment
+    # we can convert the Environment back to a pyarrow Array
+    # using the rarrow_to_py_array function
+    py_result = rarrow_to_py_array(r_result)
+    print("RESULT", type(py_result), py_result)
+
+Running the newly modified ``addthree.py`` should now properly execute
+the R function and print the resulting PyArrow Array:
+
+.. code-block:: bash
+
+    $ python addthree.py
+    RESULT <class 'pyarrow.lib.Int64Array'> [
+      4,
+      5,
+      6
+    ]
+
+For additional information you can refer to
+`rpy2 Documentation <https://rpy2.github.io/doc/latest/html/index.html>`_
+and `rpy2-arrow Documentation <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_
+
+Invoking Python functions from R
+--------------------------------
+
+Exposing Python functions to R can be done through the ``reticulate``
+library. For example if we want to invoke :func:`pyarrow.compute.add` from
+R on an Array created in R we can do so importing ``pyarrow`` in R
+through ``reticulate``.
+
+A basic ``addthree.R`` script that invokes ``add`` to add ``3`` to
+an R array would look like:
+
+.. code-block:: R
+
+    # Load arrow and reticulate libraries
+    library(arrow)
+    library(reticulate)
+
+    # Create a new array in R
+    a <- Array$create(c(1, 2, 3))
+
+    # Make pyarrow.compute available to R
+    pc <- import("pyarrow.compute")
+
+    # Invoke pyarrow.compute.add with the array and 3
+    # This will add 3 to all elements of the array and return a new Array
+    result <- pc$add(a, 3)
+
+    # Print the result to confirm it's what we expect
+    print(result)
+
+Invoking the ``addthree.R`` script will print the outcome of adding
+``3`` to all the elements of the original ``Array$create(c(1, 2, 3))`` array:
+
+.. code-block:: bash
+
+    $ R --silent -f addthree.R 
+    Array
+    <double>
+    [
+      4,
+      5,
+      6
+    ]
+
+For additional information you can refer to
+`Reticulate Documentation <https://rstudio.github.io/reticulate/>`_
+and to the `R Arrow documentation <https://arrow.apache.org/docs/r/articles/python.html#using>`_
+
+R to Python communication using C Data Interface
+------------------------------------------------

Review comment:
       ```suggestion
   R to Python communication using the C Data Interface
   ----------------------------------------------------
   ```

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions

Review comment:
       ```suggestion
   Once the ``addthree.R`` file is created we can invoke any of its functions
   ```

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions
+from Python using the 
+`rpy2 <https://rpy2.github.io/doc/latest/html/index.html>`_ library which
+enables a R runtime within the Python interpreter.
+
+``rpy2`` can be installed using ``pip`` like most python libraries
+
+.. code-block:: bash
+
+    $ pip install rpy2
+
+The most basic thing we can do with our ``addthree`` function is to
+invoke it from Python with a number and see how it will return the result.
+
+To do so we can create an ``addthree.py`` file which uses ``rpy2`` to
+import the ``addthree`` function from ``addthree.R`` file and invoke it:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+
+    # Load the addthree.R file
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    # Get a reference to the addthree function
+    addthree = robjects.r["addthree"]
+
+    # Invoke the function
+    r = addthree(3)
+
+    # Access the returned value
+    value = r[0]
+    print(value)
+
+Running the ``addthree.py`` file will show how our Python code is able
+to access the ``R`` function and print the expected result:
+
+.. code-block:: bash
+
+    $ python addthree.py 
+    6
+
+If instead of passing around basic data types we want to pass around
+Arrow Arrays, we can do so relying on the
+`rpy2-arrow <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_ 
+module which implements ``rpy2`` support for Arrow types.
+
+``rpy2-arrow`` can be installed through ``pip``:
+
+.. code-block:: bash
+
+    $ pip install rpy2-arrow
+
+``rpy2-arrow`` implements converters from PyArrow objects to R Arrow objects,
+this is done without occurring into any data copy cost as it relies on the
+C Data interface.
+
+To pass to ``addthree`` a PyArrow array our ``addthree.py`` needs to be modified
+to enable ``rpy2-arrow`` converters and then pass the PyArrow array:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+    from rpy2_arrow.pyarrow_rarrow import (rarrow_to_py_array,
+                                           converter as arrowconverter)
+    from rpy2.robjects.conversion import localconverter
+
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    addthree = robjects.r["addthree"]
+
+    import pyarrow
+
+    array = pyarrow.array((1, 2, 3))
+
+    # Enable rpy2-arrow converter so that R can receive the array.
+    with localconverter(arrowconverter):
+        r_result = addthree(array)
+
+    # The result of the R function will be an R Environment
+    # we can convert the Environment back to a pyarrow Array
+    # using the rarrow_to_py_array function
+    py_result = rarrow_to_py_array(r_result)
+    print("RESULT", type(py_result), py_result)
+
+Running the newly modified ``addthree.py`` should now properly execute
+the R function and print the resulting PyArrow Array:
+
+.. code-block:: bash
+
+    $ python addthree.py
+    RESULT <class 'pyarrow.lib.Int64Array'> [
+      4,
+      5,
+      6
+    ]
+
+For additional information you can refer to
+`rpy2 Documentation <https://rpy2.github.io/doc/latest/html/index.html>`_
+and `rpy2-arrow Documentation <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_
+
+Invoking Python functions from R
+--------------------------------
+
+Exposing Python functions to R can be done through the ``reticulate``
+library. For example if we want to invoke :func:`pyarrow.compute.add` from
+R on an Array created in R we can do so importing ``pyarrow`` in R
+through ``reticulate``.
+
+A basic ``addthree.R`` script that invokes ``add`` to add ``3`` to
+an R array would look like:
+
+.. code-block:: R
+
+    # Load arrow and reticulate libraries
+    library(arrow)
+    library(reticulate)
+
+    # Create a new array in R
+    a <- Array$create(c(1, 2, 3))
+
+    # Make pyarrow.compute available to R
+    pc <- import("pyarrow.compute")
+
+    # Invoke pyarrow.compute.add with the array and 3
+    # This will add 3 to all elements of the array and return a new Array
+    result <- pc$add(a, 3)
+
+    # Print the result to confirm it's what we expect
+    print(result)
+
+Invoking the ``addthree.R`` script will print the outcome of adding
+``3`` to all the elements of the original ``Array$create(c(1, 2, 3))`` array:
+
+.. code-block:: bash
+
+    $ R --silent -f addthree.R 
+    Array
+    <double>
+    [
+      4,
+      5,
+      6
+    ]
+
+For additional information you can refer to
+`Reticulate Documentation <https://rstudio.github.io/reticulate/>`_
+and to the `R Arrow documentation <https://arrow.apache.org/docs/r/articles/python.html#using>`_
+
+R to Python communication using C Data Interface
+------------------------------------------------
+
+Both the solutions described in previous chapters use the Arrow C Data
+interface under the hood.
+
+In case we want to extend the previous ``addthree`` example to switch
+from using ``rpy2-arrow`` to using the plain C Data interface we can
+do so by introducing some modifications to our codebase.
+
+To enable importing the Arrow Array from the C Data interface we have to
+wrap our ``addthree`` function in a function that does the extra work
+necessary to import an Arrow Array in R from the C Data interface.
+
+That work will be done by the ``addthree_cdata`` function which invokes the
+``addthree`` function once the Array is imported.

Review comment:
       This entire snippet doesn't seem very informative to me ("introduce some modifications"... "do the extra work"...). Perhaps condense them into one or two meaningful sentences?
   
   Also, this does not explain _why_ I would want to avoid ``rpy2-arrow`` and instead invoke the C Data Interface directly.
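   
   For context on what "invoking the C Data Interface directly" involves: the only thing that crosses the language boundary is the address of a small C struct describing the array, so the values themselves are never copied. The block below is a minimal sketch of that idea using only Python's standard ``ctypes`` module. It is not how ``pyarrow`` or ``rpy2-arrow`` implement the exchange; ``describe_int64_array`` is a hypothetical helper introduced for illustration, the child, dictionary, and callback pointers are kept opaque, and the ``release`` callback that a real producer must set is deliberately left out.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """ctypes mirror of ``struct ArrowArray`` from the Arrow C Data Interface.

    Pointer members not needed by this sketch are kept as opaque ``c_void_p``.
    """

    _fields_ = [
        ("length", ctypes.c_int64),
        ("null_count", ctypes.c_int64),
        ("offset", ctypes.c_int64),
        ("n_buffers", ctypes.c_int64),
        ("n_children", ctypes.c_int64),
        ("buffers", ctypes.POINTER(ctypes.c_void_p)),
        ("children", ctypes.c_void_p),      # struct ArrowArray**, opaque here
        ("dictionary", ctypes.c_void_p),    # struct ArrowArray*, opaque here
        ("release", ctypes.c_void_p),       # release callback, opaque here
        ("private_data", ctypes.c_void_p),
    ]


def describe_int64_array(address):
    """Hypothetical consumer: read an int64 array given only the struct address."""
    arr = ArrowArray.from_address(address)
    # For a primitive array, buffers[0] is the validity bitmap (may be NULL)
    # and buffers[1] is the values buffer.
    data = ctypes.cast(arr.buffers[1], ctypes.POINTER(ctypes.c_int64))
    return [data[arr.offset + i] for i in range(arr.length)]


# "Producer" side: describe three int64 values without copying them.
values = (ctypes.c_int64 * 3)(1, 2, 3)
buffers = (ctypes.c_void_p * 2)(None, ctypes.addressof(values))
producer = ArrowArray(
    length=3,
    null_count=0,
    offset=0,
    n_buffers=2,
    n_children=0,
    buffers=ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p)),
)

# Only this integer address would need to be handed to the other runtime.
address = ctypes.addressof(producer)
print(describe_int64_array(address))
```

   In this sketch the consumer reads the same memory the producer allocated, which is the property the documentation text relies on when it says no marshaling cost is incurred.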

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions
+from Python using the 
+`rpy2 <https://rpy2.github.io/doc/latest/html/index.html>`_ library which
+enables a R runtime within the Python interpreter.
+
+``rpy2`` can be installed using ``pip`` like most python libraries
+
+.. code-block:: bash
+
+    $ pip install rpy2
+
+The most basic thing we can do with our ``addthree`` function is to
+invoke it from Python with a number and see how it will return the result.
+
+To do so we can create an ``addthree.py`` file which uses ``rpy2`` to
+import the ``addthree`` function from ``addthree.R`` file and invoke it:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+
+    # Load the addthree.R file
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    # Get a reference to the addthree function
+    addthree = robjects.r["addthree"]
+
+    # Invoke the function
+    r = addthree(3)
+
+    # Access the returned value
+    value = r[0]
+    print(value)
+
+Running the ``addthree.py`` file will show how our Python code is able
+to access the ``R`` function and print the expected result:
+
+.. code-block:: bash
+
+    $ python addthree.py 
+    6
+
+If instead of passing around basic data types we want to pass around
+Arrow Arrays, we can do so relying on the
+`rpy2-arrow <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_ 
+module which implements ``rpy2`` support for Arrow types.
+
+``rpy2-arrow`` can be installed through ``pip``:
+
+.. code-block:: bash
+
+    $ pip install rpy2-arrow
+
+``rpy2-arrow`` implements converters from PyArrow objects to R Arrow objects,
+this is done without occurring into any data copy cost as it relies on the

Review comment:
       ```suggestion
   this is done without incurring any data copy cost as it relies on the
   ```

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions
+from Python using the 
+`rpy2 <https://rpy2.github.io/doc/latest/html/index.html>`_ library which
+enables a R runtime within the Python interpreter.
+
+``rpy2`` can be installed using ``pip`` like most python libraries
+
+.. code-block:: bash
+
+    $ pip install rpy2
+
+The most basic thing we can do with our ``addthree`` function is to
+invoke it from Python with a number and see how it will return the result.
+
+To do so we can create an ``addthree.py`` file which uses ``rpy2`` to
+import the ``addthree`` function from ``addthree.R`` file and invoke it:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+
+    # Load the addthree.R file
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    # Get a reference to the addthree function
+    addthree = robjects.r["addthree"]
+
+    # Invoke the function
+    r = addthree(3)
+
+    # Access the returned value
+    value = r[0]
+    print(value)
+
+Running the ``addthree.py`` file will show how our Python code is able
+to access the ``R`` function and print the expected result:
+
+.. code-block:: bash
+
+    $ python addthree.py 
+    6
+
+If instead of passing around basic data types we want to pass around
+Arrow Arrays, we can do so relying on the
+`rpy2-arrow <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_ 
+module which implements ``rpy2`` support for Arrow types.
+
+``rpy2-arrow`` can be installed through ``pip``:
+
+.. code-block:: bash
+
+    $ pip install rpy2-arrow
+
+``rpy2-arrow`` implements converters from PyArrow objects to R Arrow objects,
+this is done without occurring into any data copy cost as it relies on the
+C Data interface.
+
+To pass to ``addthree`` a PyArrow array our ``addthree.py`` needs to be modified

Review comment:
       ```suggestion
   To pass to the ``addthree`` function a PyArrow array, our ``addthree.py`` file needs to be modified
   ```

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions
+from Python using the 
+`rpy2 <https://rpy2.github.io/doc/latest/html/index.html>`_ library which
+enables a R runtime within the Python interpreter.
+
+``rpy2`` can be installed using ``pip`` like most python libraries
+
+.. code-block:: bash
+
+    $ pip install rpy2
+
+The most basic thing we can do with our ``addthree`` function is to
+invoke it from Python with a number and see how it will return the result.
+
+To do so we can create an ``addthree.py`` file which uses ``rpy2`` to
+import the ``addthree`` function from ``addthree.R`` file and invoke it:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+
+    # Load the addthree.R file
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    # Get a reference to the addthree function
+    addthree = robjects.r["addthree"]
+
+    # Invoke the function
+    r = addthree(3)
+
+    # Access the returned value
+    value = r[0]
+    print(value)
+
+Running the ``addthree.py`` file will show how our Python code is able
+to access the ``R`` function and print the expected result:
+
+.. code-block:: bash
+
+    $ python addthree.py 
+    6
+
+If instead of passing around basic data types we want to pass around
+Arrow Arrays, we can do so relying on the
+`rpy2-arrow <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_ 
+module which implements ``rpy2`` support for Arrow types.
+
+``rpy2-arrow`` can be installed through ``pip``:
+
+.. code-block:: bash
+
+    $ pip install rpy2-arrow
+
+``rpy2-arrow`` implements converters from PyArrow objects to R Arrow objects,
+this is done without occurring into any data copy cost as it relies on the
+C Data interface.
+
+To pass to ``addthree`` a PyArrow array our ``addthree.py`` needs to be modified
+to enable ``rpy2-arrow`` converters and then pass the PyArrow array:
+
+.. code-block:: python
+
+    import rpy2.robjects as robjects
+    from rpy2_arrow.pyarrow_rarrow import (rarrow_to_py_array,
+                                           converter as arrowconverter)
+    from rpy2.robjects.conversion import localconverter
+
+    r_source = robjects.r["source"]
+    r_source("addthree.R")
+
+    addthree = robjects.r["addthree"]
+
+    import pyarrow
+
+    array = pyarrow.array((1, 2, 3))
+
+    # Enable rpy2-arrow converter so that R can receive the array.
+    with localconverter(arrowconverter):
+        r_result = addthree(array)
+
+    # The result of the R function will be an R Environment
+    # we can convert the Environment back to a pyarrow Array
+    # using the rarrow_to_py_array function
+    py_result = rarrow_to_py_array(r_result)
+    print("RESULT", type(py_result), py_result)
+
+Running the newly modified ``addthree.py`` should now properly execute
+the R function and print the resulting PyArrow Array:
+
+.. code-block:: bash
+
+    $ python addthree.py
+    RESULT <class 'pyarrow.lib.Int64Array'> [
+      4,
+      5,
+      6
+    ]
+
+For additional information you can refer to
+`rpy2 Documentation <https://rpy2.github.io/doc/latest/html/index.html>`_
+and `rpy2-arrow Documentation <https://rpy2.github.io/rpy2-arrow/version/main/html/index.html>`_
+
+Invoking Python functions from R
+--------------------------------
+
+Exposing Python functions to R can be done through the ``reticulate``
+library. For example if we want to invoke :func:`pyarrow.compute.add` from
+R on an Array created in R we can do so importing ``pyarrow`` in R
+through ``reticulate``.
+
+A basic ``addthree.R`` script that invokes ``add`` to add ``3`` to
+an R array would look like:
+
+.. code-block:: R
+
+    # Load arrow and reticulate libraries
+    library(arrow)
+    library(reticulate)
+
+    # Create a new array in R
+    a <- Array$create(c(1, 2, 3))
+
+    # Make pyarrow.compute available to R
+    pc <- import("pyarrow.compute")
+
+    # Invoke pyarrow.compute.add with the array and 3
+    # This will add 3 to all elements of the array and return a new Array
+    result <- pc$add(a, 3)
+
+    # Print the result to confirm it's what we expect
+    print(result)
+
+Invoking the ``addthree.R`` script will print the outcome of adding
+``3`` to all the elements of the original ``Array$create(c(1, 2, 3))`` array:
+
+.. code-block:: bash
+
+    $ R --silent -f addthree.R 
+    Array
+    <double>
+    [
+      4,
+      5,
+      6
+    ]
+
+For additional information you can refer to
+`Reticulate Documentation <https://rstudio.github.io/reticulate/>`_
+and to the `R Arrow documentation <https://arrow.apache.org/docs/r/articles/python.html#using>`_
+
+R to Python communication using C Data Interface
+------------------------------------------------
+
+Both the solutions described in previous chapters use the Arrow C Data

Review comment:
       ```suggestion
   Both solutions described above use the Arrow C Data
   ```

##########
File path: docs/source/python/integration/python_r.rst
##########
@@ -0,0 +1,312 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Integrating PyArrow with R
+==========================
+
+Arrow supports exchanging data within the same process through the
+:ref:`c-data-interface`.
+
+This can be used to exchange data between Python and R functions and
+methods so that the two languages can interact without any cost of
+marshaling and unmarshaling data.
+
+.. note::
+
+    The article takes for granted that you have a ``Python`` environment
+    with ``pyarrow`` correctly installed and an ``R`` environment with
+    ``arrow`` library correctly installed. 
+    See `Python Install Instructions <https://arrow.apache.org/docs/python/install.html>`_
+    and `R Install instructions <https://arrow.apache.org/docs/r/#installation>`_
+    for further details.
+
+Invoking R functions from Python
+--------------------------------
+
+Suppose we have a simple R function receiving an Arrow Array to
+add ``3`` to all its elements:
+
+.. code-block:: R
+
+    library(arrow)
+
+    addthree <- function(arr) {
+        return(arr + 3L)
+    }
+
+We could save such function in a ``addthree.R`` file so that we can
+make it available for reuse.
+
+Once the ``addthree.R`` is created we can invoke any of its functions
+from Python using the 
+`rpy2 <https://rpy2.github.io/doc/latest/html/index.html>`_ library which
+enables a R runtime within the Python interpreter.
+
+``rpy2`` can be installed using ``pip`` like most python libraries

Review comment:
       ```suggestion
   ``rpy2`` can be installed using ``pip`` like most Python libraries
   ```



