This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ded8cdf8d945 [SPARK-47367][PYTHON][CONNECT][TESTS][FOLLOW-UP] Recover the test case for the number of partitions
ded8cdf8d945 is described below
commit ded8cdf8d9459e0e5b73c01c8ee41ae54ccd7ac5
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Tue Mar 26 07:35:49 2024 -0700
[SPARK-47367][PYTHON][CONNECT][TESTS][FOLLOW-UP] Recover the test case for the number of partitions
### What changes were proposed in this pull request?
This PR is a follow-up of https://github.com/apache/spark/pull/45486 that
addresses the review comment at
https://github.com/apache/spark/pull/45486#discussion_r1538753052
by recovering the test coverage for the number of partitions in the Python Data Source.
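For context, a minimal sketch of the kind of partitioned Python data source this test exercises is below. The actual `InMemoryDataSource` fixture is defined in `test_python_datasource.py` itself; this reconstruction is only an assumption about its shape, built on the public `pyspark.sql.datasource` API.

```python
# Hypothetical reconstruction of an in-memory data source; the real
# InMemoryDataSource fixture in test_python_datasource.py may differ.
from pyspark.sql.datasource import DataSource, DataSourceReader, InputPartition
from pyspark.sql.functions import spark_partition_id


class InMemoryDataSourceReader(DataSourceReader):
    def __init__(self, options):
        # The "num_partitions" option controls how many input partitions are
        # planned; 3 is an assumed default matching the test's first assertion.
        self.num_partitions = int(options.get("num_partitions", 3))

    def partitions(self):
        # One InputPartition per planned split; Spark calls read() once per split.
        return [InputPartition(i) for i in range(self.num_partitions)]

    def read(self, partition):
        # Emit one (x, y) row per partition.
        yield (partition.value, str(partition.value))


class InMemoryDataSource(DataSource):
    @classmethod
    def name(cls):
        return "memory"

    def schema(self):
        return "x INT, y STRING"

    def reader(self, schema):
        return InMemoryDataSourceReader(self.options)


# Usage mirroring the recovered assertions:
# spark.dataSource.register(InMemoryDataSource)
# df = spark.read.format("memory").option("num_partitions", 2).load()
# assert df.select(spark_partition_id()).distinct().count() == 2
```

Counting distinct `spark_partition_id()` values works as a partition-count check because each task reports its own partition id, so the recovered assertions verify how many partitions Spark actually planned without inspecting internals.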
### Why are the changes needed?
To restore the test coverage.
### Does this PR introduce _any_ user-facing change?
No, test-only.
### How was this patch tested?
Fixed the unit test; CI in this PR should verify it.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #45720 from HyukjinKwon/SPARK-47367-folliwup.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
python/pyspark/sql/tests/test_python_datasource.py | 3 +++
1 file changed, 3 insertions(+)
diff --git a/python/pyspark/sql/tests/test_python_datasource.py b/python/pyspark/sql/tests/test_python_datasource.py
index f69e1dee1285..d028a210b007 100644
--- a/python/pyspark/sql/tests/test_python_datasource.py
+++ b/python/pyspark/sql/tests/test_python_datasource.py
@@ -28,6 +28,7 @@ from pyspark.sql.datasource import (
     WriterCommitMessage,
     CaseInsensitiveDict,
 )
+from pyspark.sql.functions import spark_partition_id
 from pyspark.sql.types import Row, StructType
 from pyspark.testing.sqlutils import (
     have_pyarrow,
@@ -236,10 +237,12 @@ class BasePythonDataSourceTestsMixin:
         self.spark.dataSource.register(InMemoryDataSource)
         df = self.spark.read.format("memory").load()
+        self.assertEqual(df.select(spark_partition_id()).distinct().count(), 3)
         assertDataFrameEqual(df, [Row(x=0, y="0"), Row(x=1, y="1"), Row(x=2, y="2")])
         df = self.spark.read.format("memory").option("num_partitions", 2).load()
         assertDataFrameEqual(df, [Row(x=0, y="0"), Row(x=1, y="1")])
+        self.assertEqual(df.select(spark_partition_id()).distinct().count(), 2)

     def _get_test_json_data_source(self):
         import json
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]