[spark] branch branch-3.4 updated: [SPARK-44184][PYTHON][DOCS] Remove a wrong doc about `ARROW_PRE_0_15_IPC_FORMAT`

dongjoon Sun, 25 Jun 2023 18:53:49 -0700

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new e3abc1bf7b6 [SPARK-44184][PYTHON][DOCS] Remove a wrong doc about `ARROW_PRE_0_15_IPC_FORMAT`
e3abc1bf7b6 is described below

commit e3abc1bf7b680eba202c733b1c2217ada43d2e77
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Sun Jun 25 18:53:25 2023 -0700

    [SPARK-44184][PYTHON][DOCS] Remove a wrong doc about `ARROW_PRE_0_15_IPC_FORMAT`
    
    ### What changes were proposed in this pull request?
    
    This PR removes incorrect documentation about `ARROW_PRE_0_15_IPC_FORMAT`.
    
    ### Why are the changes needed?
    
    Since Apache Spark 3.0.0, Spark doesn't allow the `ARROW_PRE_0_15_IPC_FORMAT` environment variable at all.
    
    
    https://github.com/apache/spark/blob/2407183cb8637b6ac2d1b76320cae9cbde3411da/python/pyspark/sql/pandas/utils.py#L69-L73
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. This removes outdated, incorrect documentation.
    
    ### How was this patch tested?
    
    Manual review.
    
    Closes #41730 from dongjoon-hyun/SPARK-44184.
    
    Authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
    (cherry picked from commit 00e7c08606d0b6de22604d2a7350ea0711355300)
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 python/docs/source/user_guide/sql/arrow_pandas.rst | 19 -------------------
 1 file changed, 19 deletions(-)
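
Note on the rationale above: the commit message says that Spark 3.0.0 and
later do not allow the `ARROW_PRE_0_15_IPC_FORMAT` environment variable. A
minimal sketch of what such a guard could look like follows. The helper name
`reject_legacy_arrow_ipc_env`, its `env` parameter, and the error text are
assumptions made for illustration; the `utils.py` lines linked in the commit
message are the authoritative source for what PySpark actually does.

```python
import os
from typing import Mapping, Optional


def reject_legacy_arrow_ipc_env(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise if the legacy Arrow IPC flag is set to "1".

    Hypothetical helper, not PySpark's API. `env` defaults to the process
    environment and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    # Assumption: only the value "1" enables the legacy format.
    if env.get("ARROW_PRE_0_15_IPC_FORMAT", "0") == "1":
        raise RuntimeError(
            "Arrow legacy IPC format is not supported; "
            "please unset ARROW_PRE_0_15_IPC_FORMAT"
        )
```

Under this sketch, a `conf/spark-env.sh` that still sets
`ARROW_PRE_0_15_IPC_FORMAT=1` would fail fast instead of silently switching
formats, which is why the removed section no longer applies.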

diff --git a/python/docs/source/user_guide/sql/arrow_pandas.rst b/python/docs/source/user_guide/sql/arrow_pandas.rst
index 0901be73a02..e6627d18f92 100644
--- a/python/docs/source/user_guide/sql/arrow_pandas.rst
+++ b/python/docs/source/user_guide/sql/arrow_pandas.rst
@@ -392,25 +392,6 @@ For usage with pyspark.sql, the minimum supported versions of Pandas is 1.0.5 an
 Higher versions may be used, however, compatibility and data correctness can not be guaranteed and should
 be verified by the user.
 
-Compatibility Setting for PyArrow >= 0.15.0 and Spark 2.3.x, 2.4.x
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Since Arrow 0.15.0, a change in the binary IPC format requires an environment variable to be
-compatible with previous versions of Arrow <= 0.14.1. This is only necessary to do for PySpark
-users with versions 2.3.x and 2.4.x that have manually upgraded PyArrow to 0.15.0. The following
-can be added to ``conf/spark-env.sh`` to use the legacy Arrow IPC format:
-
-.. code-block:: bash
-
-    ARROW_PRE_0_15_IPC_FORMAT=1
-
-
-This will instruct PyArrow >= 0.15.0 to use the legacy IPC format with the older Arrow Java that
-is in Spark 2.3.x and 2.4.x. Not setting this environment variable will lead to a similar error as
-described in `SPARK-29367 <https://issues.apache.org/jira/browse/SPARK-29367>`_ when running
-``pandas_udf``\s or :meth:`DataFrame.toPandas` with Arrow enabled. More information about the Arrow IPC change can
-be read on the Arrow 0.15.0 release `blog <https://arrow.apache.org/blog/2019/10/06/0.15.0-release/#columnar-streaming-protocol-change-since-0140>`_.
-
 Setting Arrow ``self_destruct`` for memory savings
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

