This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 00e7c08606d [SPARK-44184][PYTHON][DOCS] Remove a wrong doc about `ARROW_PRE_0_15_IPC_FORMAT`
00e7c08606d is described below

commit 00e7c08606d0b6de22604d2a7350ea0711355300
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Sun Jun 25 18:53:25 2023 -0700

    [SPARK-44184][PYTHON][DOCS] Remove a wrong doc about `ARROW_PRE_0_15_IPC_FORMAT`
    
    ### What changes were proposed in this pull request?
    
    This PR aims to remove incorrect documentation about `ARROW_PRE_0_15_IPC_FORMAT`.
    
    ### Why are the changes needed?
    
    Since Apache Spark 3.0.0, Spark doesn't allow the `ARROW_PRE_0_15_IPC_FORMAT` environment variable at all.
    
    
https://github.com/apache/spark/blob/2407183cb8637b6ac2d1b76320cae9cbde3411da/python/pyspark/sql/pandas/utils.py#L69-L73
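
    For illustration only, the kind of guard the linked lines refer to can be
    sketched as below. This is a minimal sketch, not a copy of the PySpark
    source: the helper name `check_no_legacy_arrow_ipc` and its `environ`
    parameter are hypothetical, and the error message wording is an assumption.

```python
import os


def check_no_legacy_arrow_ipc(environ=os.environ):
    """Hypothetical helper: refuse to continue if the legacy Arrow IPC switch is set."""
    # The mere presence of the variable is treated as an error, whatever its value.
    if "ARROW_PRE_0_15_IPC_FORMAT" in environ:
        raise RuntimeError(
            "Arrow legacy IPC format is not supported in PySpark, "
            "please unset ARROW_PRE_0_15_IPC_FORMAT"
        )


# Passing an explicit mapping keeps the check independent of the real environment.
check_no_legacy_arrow_ipc({})  # no variable set, so nothing is raised
```

    With a guard of this shape, setting `ARROW_PRE_0_15_IPC_FORMAT=1` as the
    removed section suggested would fail rather than enable the legacy format.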
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. This is a removal of outdated wrong documentation.
    
    ### How was this patch tested?
    
    Manual review.
    
    Closes #41730 from dongjoon-hyun/SPARK-44184.
    
    Authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 python/docs/source/user_guide/sql/arrow_pandas.rst | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/python/docs/source/user_guide/sql/arrow_pandas.rst b/python/docs/source/user_guide/sql/arrow_pandas.rst
index 0901be73a02..e6627d18f92 100644
--- a/python/docs/source/user_guide/sql/arrow_pandas.rst
+++ b/python/docs/source/user_guide/sql/arrow_pandas.rst
@@ -392,25 +392,6 @@ For usage with pyspark.sql, the minimum supported versions of Pandas is 1.0.5 an
 Higher versions may be used, however, compatibility and data correctness can not be guaranteed and should
 be verified by the user.
 
-Compatibility Setting for PyArrow >= 0.15.0 and Spark 2.3.x, 2.4.x
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Since Arrow 0.15.0, a change in the binary IPC format requires an environment variable to be
-compatible with previous versions of Arrow <= 0.14.1. This is only necessary to do for PySpark
-users with versions 2.3.x and 2.4.x that have manually upgraded PyArrow to 0.15.0. The following
-can be added to ``conf/spark-env.sh`` to use the legacy Arrow IPC format:
-
-.. code-block:: bash
-
-    ARROW_PRE_0_15_IPC_FORMAT=1
-
-
-This will instruct PyArrow >= 0.15.0 to use the legacy IPC format with the older Arrow Java that
-is in Spark 2.3.x and 2.4.x. Not setting this environment variable will lead to a similar error as
-described in `SPARK-29367 <https://issues.apache.org/jira/browse/SPARK-29367>`_ when running
-``pandas_udf``\s or :meth:`DataFrame.toPandas` with Arrow enabled. More information about the Arrow IPC change can
-be read on the Arrow 0.15.0 release `blog <https://arrow.apache.org/blog/2019/10/06/0.15.0-release/#columnar-streaming-protocol-change-since-0140>`_.
-
 Setting Arrow ``self_destruct`` for memory savings
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

