itholic commented on code in PR #39591:
URL: https://github.com/apache/spark/pull/39591#discussion_r1071695829
##########
python/pyspark/errors/exceptions.py:
##########
@@ -74,3 +79,187 @@ def getMessageParameters(self) -> Optional[Dict[str, str]]:
def __str__(self) -> str:
return f"[{self.getErrorClass()}] {self.message}"
+
+
+class CapturedException(PySparkException):
+ def __init__(
+ self,
+ desc: Optional[str] = None,
+ stackTrace: Optional[str] = None,
+ cause: Optional[Py4JJavaError] = None,
+ origin: Optional[Py4JJavaError] = None,
+ ):
+ # desc & stackTrace vs origin are mutually exclusive.
+ # cause is optional.
+ assert (origin is not None and desc is None and stackTrace is None) or
(
+ origin is None and desc is not None and stackTrace is not None
+ )
+
+ self.desc = desc if desc is not None else cast(Py4JJavaError,
origin).getMessage()
+ assert SparkContext._jvm is not None
+ self.stackTrace = (
+ stackTrace
+ if stackTrace is not None
+ else
(SparkContext._jvm.org.apache.spark.util.Utils.exceptionString(origin))
+ )
+ self.cause = convert_exception(cause) if cause is not None else None
+ if self.cause is None and origin is not None and origin.getCause() is
not None:
+ self.cause = convert_exception(origin.getCause())
+ self._origin = origin
+
+ def __str__(self) -> str:
+ assert SparkContext._jvm is not None
+
+ jvm = SparkContext._jvm
+ sql_conf = jvm.org.apache.spark.sql.internal.SQLConf.get()
+ debug_enabled = sql_conf.pysparkJVMStacktraceEnabled()
+ desc = self.desc
+ if debug_enabled:
+ desc = desc + "\n\nJVM stacktrace:\n%s" % self.stackTrace
+ return str(desc)
Review Comment:
If I understand the comment correctly, it actually already shows the same
string with our new error framework.
The string `desc` here is as shown below:
```python
>>> desc
[UNPIVOT_REQUIRES_VALUE_COLUMNS] At least one value column needs to be
specified for UNPIVOT, all columns specified as ids;
'Unpivot ArraySeq(id#0L), ArraySeq(), var, [val]
+- Range (0, 10, step=1, splits=Some(16))
```
So, the final error message will be like:
```python
>>> df.unpivot("id", [], "var", "val").collect()
Traceback (most recent call last):
...
pyspark.errors.exceptions.AnalysisException:
[UNPIVOT_REQUIRES_VALUE_COLUMNS] At least one value column needs to be
specified for UNPIVOT, all columns specified as ids;
'Unpivot ArraySeq(id#0L), ArraySeq(), var, [val]
+- Range (0, 10, step=1, splits=Some(16))
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]