HyukjinKwon commented on a change in pull request #25593:
[SPARK-27992][SPARK-28881][PYTHON][2.4] Allow Python to join with connection
thread to propagate errors
URL: https://github.com/apache/spark/pull/25593#discussion_r318109047
##########
File path: python/pyspark/sql/tests.py
##########
@@ -4550,6 +4550,32 @@ def test_timestamp_dst(self):
         self.assertPandasEqual(pdf, df_from_pandas.toPandas())

+@unittest.skipIf(
+    not _have_pandas or not _have_pyarrow,
+    _pandas_requirement_message or _pyarrow_requirement_message)
+class MaxResultArrowTests(unittest.TestCase):
+    # These tests are separate as 'spark.driver.maxResultSize' configuration
+    # is a static configuration to Spark context.
+
+    @classmethod
+    def setUpClass(cls):
+        cls.spark = SparkSession(SparkContext(
+            'local[4]', cls.__name__,
+            conf=SparkConf().set("spark.driver.maxResultSize", "10k")))
Review comment:
Okay, the last test failure looks weird and flaky
(https://github.com/apache/spark/pull/25593#issuecomment-525286876). This test
itself passed, but it seems the previously set `spark.driver.maxResultSize=10k`
affects the other tests even though we stop the session and context explicitly.
This is fine for now in the master branch because this test lives in a separate
file and is launched in a separate process; however, it is potentially an issue.
In branch-2.4 specifically, I am working around it by using
`SparkSession(SparkContext(...))` for now, since it's an orthogonal issue.
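
For reference, a minimal sketch of how the workaround reads end to end. Only
`setUpClass` appears in the hunk above; the `tearDownClass` and the test method
below are my assumptions about the rest of the class, following the pattern
used in master:

```python
import unittest

from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession


class MaxResultArrowTests(unittest.TestCase):
    # 'spark.driver.maxResultSize' is a static configuration, so a dedicated
    # context is created here instead of reusing a shared test session.
    # (The pandas/pyarrow skipIf decorator from tests.py is omitted here.)

    @classmethod
    def setUpClass(cls):
        cls.spark = SparkSession(SparkContext(
            'local[4]', cls.__name__,
            conf=SparkConf().set("spark.driver.maxResultSize", "10k")))

    @classmethod
    def tearDownClass(cls):
        # Stopping the session also stops the underlying SparkContext; even
        # so, the static conf seems to leak into later tests that run in the
        # same process, which is the flakiness described above.
        cls.spark.stop()

    def test_exception_by_max_results(self):
        # Assumed test body: collecting a result larger than 10k to the
        # driver should fail with an error mentioning the size limit.
        with self.assertRaisesRegexp(Exception, "is bigger than"):
            self.spark.range(0, 10000, 1, 100).toPandas()
```

Passing the conf to a fresh `SparkContext` is the point of the workaround:
static configurations cannot be applied to a context that is already running,
so going through a shared session would leave `spark.driver.maxResultSize`
unset for this test.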