This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new f62b36c [SPARK-38128][PYTHON][TESTS] Show full stacktrace in tests by default in PySpark tests
f62b36c is described below
commit f62b36c6d3964c40336959b129b284edb8097f61
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Mon Feb 7 21:18:04 2022 +0900
[SPARK-38128][PYTHON][TESTS] Show full stacktrace in tests by default in PySpark tests
### What changes were proposed in this pull request?
This PR proposes to show the full stacktrace of the Python worker and the JVM in
PySpark by adjusting the defaults of `spark.sql.pyspark.jvmStacktrace.enabled` and
`spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled` only in tests.
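For reference, the same behavior can be opted into in any PySpark session by
setting these two configurations explicitly; a minimal sketch (the session
setup here is illustrative, not part of this change):
```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Show the JVM stacktrace alongside the Python-friendly message.
    .config("spark.sql.pyspark.jvmStacktrace.enabled", "true")
    # Keep the full Python worker traceback (worker.py, serializers.py, ...)
    # instead of the simplified one.
    .config("spark.sql.execution.pyspark.udf.simplifiedTraceback.enabled", "false")
    .getOrCreate()
)
```
With this patch, these become the defaults under Spark's own test harness, so
individual tests need no such setup.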
### Why are the changes needed?
[SPARK-33407](https://issues.apache.org/jira/browse/SPARK-33407) and
[SPARK-31849](https://issues.apache.org/jira/browse/SPARK-31849) hide the Java
stacktrace and the internal Python worker-side traceback by default, for simpler
error messages to end users. However, specifically for unit tests, that makes it
a bit harder to debug test failures. We should show the full stacktrace by
default in tests.
### Does this PR introduce _any_ user-facing change?
No, this change is test-only.
### How was this patch tested?
Manually tested. Test failures now show logs like the following:
**Before:**
```
======================================================================
ERROR [3.480s]: test (pyspark.sql.tests.test_functions.FunctionsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
...
pyspark.sql.utils.PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
File "/.../pyspark/sql/tests/test_functions.py", line 60, in <lambda>
self.spark.range(1).select(udf(lambda x: x / 0)("id")).show()
ZeroDivisionError: division by zero
----------------------------------------------------------------------
Ran 1 test in 12.468s
FAILED (errors=1)
```
**After:**
```
======================================================================
ERROR [3.259s]: test (pyspark.sql.tests.test_functions.FunctionsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
...
pyspark.sql.utils.PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
File "/.../pyspark/worker.py", line 678, in main
process()
File "/.../pyspark/worker.py", line 670, in process
serializer.dump_stream(out_iter, outfile)
File "/.../lib/pyspark/serializers.py", line 217, in dump_stream
self.serializer.dump_stream(self._batched(iterator), stream)
...
ZeroDivisionError: division by zero
JVM stacktrace:
...
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:558)
at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:86)
at org.apache.spark.sql.execution.python.PythonUDFRunner$$anon$2.read(PythonUDFRunner.scala:68)
at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:511)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
...
Driver stacktrace:
...
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
... 1 more
----------------------------------------------------------------------
Ran 1 test in 12.610s
FAILED (errors=1)
```
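The failing test in the output above reduces to evaluating a UDF that divides
by zero, matching the `test_functions.py` line in the traceback; a minimal
reproduction (assuming an active `spark` session):
```python
from pyspark.sql.functions import udf

# Evaluating this UDF raises ZeroDivisionError inside the Python worker,
# which surfaces as pyspark.sql.utils.PythonException on the driver.
spark.range(1).select(udf(lambda x: x / 0)("id")).show()
```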
Closes #35423 from HyukjinKwon/SPARK-38128.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
.../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 42979a6..59a896a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -2383,7 +2383,8 @@ object SQLConf {
"and shows a Python-friendly exception only.")
.version("3.0.0")
.booleanConf
- .createWithDefault(false)
+ // show full stacktrace in tests but hide in production by default.
+ .createWithDefault(Utils.isTesting)
val ARROW_SPARKR_EXECUTION_ENABLED =
buildConf("spark.sql.execution.arrow.sparkr.enabled")
@@ -2440,7 +2441,8 @@ object SQLConf {
"shows the exception messages from UDFs. Note that this works only
with CPython 3.7+.")
.version("3.1.0")
.booleanConf
- .createWithDefault(true)
+ // show full stacktrace in tests but hide in production by default.
+ .createWithDefault(!Utils.isTesting)
val PANDAS_GROUPED_MAP_ASSIGN_COLUMNS_BY_NAME =
buildConf("spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName")
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]