[ https://issues.apache.org/jira/browse/SPARK-49847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee updated SPARK-49847:
--------------------------------
    Description: 
This aims to ensure full compatibility between PySpark and Spark Connect by 
implementing and thoroughly testing all PySpark functionality so that it works 
seamlessly with Spark Connect.

The [initial work|https://github.com/apache/spark/pull/48085] includes the 
creation of the *{{test_connect_compatibility.py}}* test suite, which validates 
the signature compatibility for core components such as {*}DataFrame{*}, 
{*}Column{*}, and *SparkSession* APIs. This test suite also includes checks for 
missing APIs and properties that need to be supported by Spark Connect.
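The core idea of such a signature-compatibility check can be sketched with Python's {{inspect}} module. This is an illustrative sketch only: the stand-in classes below are hypothetical, whereas the real suite in the linked PR compares the actual {{pyspark.sql.DataFrame}} against its Spark Connect counterpart.

```python
import inspect

# Hypothetical stand-ins for the classic and Connect DataFrame classes;
# the real test suite inspects the actual PySpark classes instead.
class ClassicDataFrame:
    def select(self, *cols): ...
    def filter(self, condition): ...

class ConnectDataFrame:
    def select(self, *cols): ...
    # 'filter' is intentionally absent to illustrate a detected gap.

def compare_apis(classic, connect):
    """Return (missing, mismatched): methods absent from the Connect class,
    and shared methods whose signatures differ between the two classes."""
    classic_methods = dict(inspect.getmembers(classic, inspect.isfunction))
    connect_methods = dict(inspect.getmembers(connect, inspect.isfunction))
    missing = sorted(set(classic_methods) - set(connect_methods))
    mismatched = sorted(
        name
        for name in classic_methods.keys() & connect_methods.keys()
        if inspect.signature(classic_methods[name])
        != inspect.signature(connect_methods[name])
    )
    return missing, mismatched

missing, mismatched = compare_apis(ClassicDataFrame, ConnectDataFrame)
print(missing)      # APIs Spark Connect still lacks -> ['filter']
print(mismatched)   # shared APIs whose signatures diverge -> []
```

Comparing {{inspect.signature}} objects catches renamed parameters, changed defaults, and added or dropped arguments in one equality check, which is why this style of test scales across the DataFrame, Column, and SparkSession surfaces.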

Key goals for this project:
 * Ensure that all PySpark APIs are fully functional in Spark Connect.
 * Identify discrepancies in API signatures between PySpark and Spark Connect.
 * Detect missing APIs and properties, and add the necessary functionality to 
Spark Connect.
 * Create comprehensive tests to prevent regressions and ensure long-term 
compatibility.

Further work will involve extending the test coverage to all critical PySpark 
modules and ensuring compatibility with Spark Connect in future releases.

  was:
This aims to ensure full compatibility between PySpark and Spark Connect by 
thoroughly testing and validating that all functionalities in PySpark work 
seamlessly with Spark Connect.

The [initial work|https://github.com/apache/spark/pull/48085] includes the 
creation of the *{{test_connect_compatibility.py}}* test suite, which validates 
the signature compatibility for core components such as {*}DataFrame{*}, 
{*}Column{*}, and *SparkSession* APIs. This test suite also includes checks for 
missing APIs and properties that need to be supported by Spark Connect.

Key goals for this project:
 * Ensure that all PySpark APIs are fully functional in Spark Connect.
 * Identify discrepancies in API signatures between PySpark and Spark Connect.
 * Verify missing APIs and properties, and add necessary functionality to Spark 
Connect.
 * Create comprehensive tests to prevent regressions and ensure long-term 
compatibility.

Further work will involve extending the test coverage to all critical PySpark 
modules and ensuring compatibility with Spark Connect in future releases.


> PySpark compatibility with Spark Connect
> ----------------------------------------
>
>                 Key: SPARK-49847
>                 URL: https://issues.apache.org/jira/browse/SPARK-49847
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Connect, PySpark
>    Affects Versions: 4.0.0
>            Reporter: Haejoon Lee
>            Assignee: Haejoon Lee
>            Priority: Critical
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
