Re: [PR] [SPARK-50275][PYTHON][SS] Enable test_pandas_transform_with_state unit test [spark]

via GitHub Fri, 08 Nov 2024 18:09:16 -0800


HeartSaVioR commented on code in PR #48805:
URL: https://github.com/apache/spark/pull/48805#discussion_r1835210545



##########
dev/sparktestsupport/modules.py:
##########
@@ -526,6 +526,7 @@ def __hash__(self):
         "pyspark.sql.tests.pandas.test_pandas_grouped_map",
         "pyspark.sql.tests.pandas.test_pandas_grouped_map_with_state",
         "pyspark.sql.tests.pandas.test_pandas_map",
+        "pyspark.sql.tests.pandas.test_pandas_transform_with_state",

Review Comment:
   Gosh, I didn't realize that a PySpark test suite has to be registered here. 
Does this mean we haven't been running this test suite in previous PRs?



##########
python/pyspark/sql/tests/pandas/test_pandas_transform_with_state.py:
##########
@@ -109,6 +109,7 @@ def _test_transform_with_state_in_pandas_basic(
         input_path = tempfile.mkdtemp()
         self._prepare_test_resource1(input_path)
         if not single_batch:
+            time.sleep(10)

Review Comment:
   Is this to ensure that the created files are ordered by creation 
timestamp? Does it need to be as long as 10 seconds? 
   
   As I see it, the test will now spend tens of seconds just on this. We have 
put some effort into reducing test execution time, and I don't want to defer 
that effort to the future. Have we tried half of that (5 seconds), or even 1-2 
seconds?
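
   If the sleep is only there to separate the file timestamps, one possible 
alternative is to pin the modification times explicitly instead of waiting. 
The snippet below is a minimal sketch of that idea, not the code in this PR: 
`write_with_mtime`, the file names, and the sample rows are hypothetical, and 
it assumes the streaming file source orders input by modification time, which 
would need to be verified against the actual test.

```python
import os
import tempfile
import time


def write_with_mtime(directory, name, lines, mtime):
    """Write a text file and pin its access/modification time to `mtime` (epoch seconds).

    Hypothetical helper for illustration; it is not part of the PySpark test suite.
    """
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    # os.utime sets (atime, mtime) directly, so no wall-clock wait is needed.
    os.utime(path, (mtime, mtime))
    return path


input_path = tempfile.mkdtemp()
base = time.time() - 60  # start in the past so neither timestamp is in the future

# Two input files whose modification times are 10 seconds apart.
first = write_with_mtime(input_path, "batch1.txt", ["0, 123", "1, 146"], base)
second = write_with_mtime(input_path, "batch2.txt", ["0, 67", "1, 46"], base + 10)

# Listing by modification time reflects the intended order.
ordered = sorted(
    os.listdir(input_path),
    key=lambda name: os.path.getmtime(os.path.join(input_path, name)),
)
```

   This keeps the 10-second gap in the file metadata without adding 10 seconds 
to the test run, provided the ordering really depends on modification time 
rather than on when the files are discovered.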



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

