ueshin commented on code in PR #49535:
URL: https://github.com/apache/spark/pull/49535#discussion_r1927795362
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3475,6 +3475,15 @@ object SQLConf {
.checkValues(Set("legacy", "row", "dict"))
.createWithDefaultString("legacy")
+ val PYSPARK_HIDE_TRACEBACK =
+ buildConf("spark.sql.execution.pyspark.udf.hideTraceback.enabled")
+ .doc(
+ "When true, only show the message of the exception from Python UDFs, " +
+ "hiding the stack trace.")
Review Comment:
We may want to describe the relationship between this and
`simplifiedTraceback` in more detail, either here or in the doc of
`simplifiedTraceback`. It seems that if this is enabled, `simplifiedTraceback`
is ignored?
##########
python/pyspark/util.py:
##########
@@ -462,22 +462,40 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
return f # type: ignore[return-value]
-def handle_worker_exception(e: BaseException, outfile: IO) -> None:
+def handle_worker_exception(
+ e: BaseException, outfile: IO, hide_traceback: Optional[bool] = None
Review Comment:
Do we need to pass `hide_traceback`?
##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -122,6 +122,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
protected val authSocketTimeout = conf.get(PYTHON_AUTH_SOCKET_TIMEOUT)
private val reuseWorker = conf.get(PYTHON_WORKER_REUSE)
protected val faultHandlerEnabled: Boolean =
conf.get(PYTHON_WORKER_FAULTHANLDER_ENABLED)
+ protected val hideTraceback: Boolean = false
Review Comment:
A conf in `SQLConf` is a session-based conf that can also be set at runtime,
whereas any conf in the `core` module or `StaticSQLConf` is a cluster-wide conf
and can't be changed while the cluster is running.
##########
python/pyspark/util.py:
##########
@@ -462,22 +462,40 @@ def wrapped(*args: Any, **kwargs: Any) -> Any:
return f # type: ignore[return-value]
-def handle_worker_exception(e: BaseException, outfile: IO) -> None:
+def handle_worker_exception(
+ e: BaseException, outfile: IO, hide_traceback: Optional[bool] = None
+) -> None:
"""
Handles exception for Python worker which writes
SpecialLengths.PYTHON_EXCEPTION_THROWN (-2)
and exception traceback info to outfile. JVM could then read from the
outfile and perform
exception handling there.
+
+ Parameters
+ ----------
+ e : BaseException
+ Exception handled
+ outfile : IO
+ IO object to write the exception info
+ hide_traceback : bool, optional
+ Whether to hide the traceback in the output.
+ By default, hides the traceback if environment variable SPARK_HIDE_TRACEBACK is set.
Review Comment:
Thanks for adding the parameters here!
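For anyone following the thread, here is a rough sketch of the behaviour the
docstring describes. It is not the code from this PR. The function name
`handle_worker_exception_sketch` and the local `write_int` helper are
hypothetical stand-ins. The length-prefixed UTF-8 framing and the rule that any
non-empty `SPARK_HIDE_TRACEBACK` value counts as "set" are assumptions made for
illustration.

```python
import io
import os
import struct
import traceback
from typing import IO, Optional

# Marker value quoted in the docstring: SpecialLengths.PYTHON_EXCEPTION_THROWN (-2).
PYTHON_EXCEPTION_THROWN = -2


def write_int(value: int, stream: IO) -> None:
    # Hypothetical helper: writes a 4-byte big-endian signed integer.
    stream.write(struct.pack("!i", value))


def handle_worker_exception_sketch(
    e: BaseException, outfile: IO, hide_traceback: Optional[bool] = None
) -> None:
    if hide_traceback is None:
        # Assumption: any non-empty value of the variable enables hiding.
        hide_traceback = bool(os.environ.get("SPARK_HIDE_TRACEBACK"))
    if hide_traceback:
        # Only the final "ExceptionType: message" line(s), no stack frames.
        text = "".join(traceback.format_exception_only(type(e), e))
    else:
        # Full traceback, including the stack frames.
        text = "".join(traceback.format_exception(type(e), e, e.__traceback__))
    payload = text.encode("utf-8")
    # Assumed framing: marker, payload length, then the payload bytes.
    write_int(PYTHON_EXCEPTION_THROWN, outfile)
    write_int(len(payload), outfile)
    outfile.write(payload)


# Usage: capture the output for an exception with the traceback hidden.
buf = io.BytesIO()
try:
    raise ValueError("bad input")
except ValueError as exc:
    handle_worker_exception_sketch(exc, buf, hide_traceback=True)
```

With `hide_traceback=None`, the caller does not need to pass anything and the
environment variable decides, which is the default the docstring documents.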
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]