This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 87e6605184f [SPARK-42080][PYTHON][DOCS] Add guideline for PySpark errors
87e6605184f is described below
commit 87e6605184fe2cb5a8a370b20fb85d1e17cb1f81
Author: itholic <[email protected]>
AuthorDate: Thu Jan 19 22:35:06 2023 +0900
[SPARK-42080][PYTHON][DOCS] Add guideline for PySpark errors
### What changes were proposed in this pull request?
This PR proposes to add a guideline for PySpark errors.

### Why are the changes needed?
To help developers understand the PySpark error framework.
### Does this PR introduce _any_ user-facing change?
No. It's documentation.
### How was this patch tested?
Manually built the docs by running `make html` in `python/docs`.
Closes #39639 from itholic/SPARK-42080.
Authored-by: itholic <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/docs/source/development/contributing.rst | 100 ++++++++++++++++++++++++
1 file changed, 100 insertions(+)
diff --git a/python/docs/source/development/contributing.rst b/python/docs/source/development/contributing.rst
index b12a982a565..17c90abea68 100644
--- a/python/docs/source/development/contributing.rst
+++ b/python/docs/source/development/contributing.rst
@@ -233,3 +233,103 @@ For instance, the first block is for the statements for
preparation, the second
and third block is for another argument. As a example, please refer
`DataFrame.rsub
<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rsub.html#pandas.DataFrame.rsub>`_
in pandas.
These blocks should be consistently separated in PySpark doctests, and more
doctests should be added if the coverage of the doctests or the number of
examples to show is not enough.
+
+
+Contributing Error and Exception
+--------------------------------
+
+.. currentmodule:: pyspark.errors
+
+To throw a standardized user-facing error or exception, developers should specify the error class and message parameters rather than an arbitrary error message.
+
+
+Usage
+~~~~~
+
+1. Check if an appropriate error class already exists in `error_classes.py`.
+   If so, use the error class and skip to step 3.
+2. Add a new class to `error_classes.py`; keep in mind the invariants below.
+3. Check if the exception type already extends `PySparkException`.
+   If so, skip to step 5.
+4. Mix `PySparkException` into the exception.
+5. Throw the exception with the error class and message parameters.
+
+
+**Before**
+
+Throw with arbitrary error message:
+
+.. code-block:: python
+
+    raise ValueError("Problem A because B")
+
+
+**After**
+
+`error_classes.py`
+
+.. code-block:: python
+
+    "PROBLEM_BECAUSE": {
+      "message": ["Problem <problem> because <cause>"]
+    }
+
+`exceptions.py`
+
+.. code-block:: python
+
+    class PySparkTestError(PySparkException):
+        def __init__(self, error_class: str, message_parameters: Dict[str, str]):
+            super().__init__(error_class=error_class, message_parameters=message_parameters)
+
+        def getMessageParameters(self) -> Optional[Dict[str, str]]:
+            return super().getMessageParameters()
+
+Throw with error class and message parameters:
+
+.. code-block:: python
+
+    raise PySparkTestError("PROBLEM_BECAUSE", {"problem": "A", "cause": "B"})
+
+
+Access fields
+~~~~~~~~~~~~~
+
+To access error fields, catch exceptions that extend :class:`PySparkException` and access the error class with :func:`PySparkException.getErrorClass`.
+
+.. code-block:: python
+
+    try:
+        ...
+    except PySparkException as pe:
+        if pe.getErrorClass() == "PROBLEM_BECAUSE":
+            ...
+
+
+Fields
+~~~~~~
+
+**Error class**
+
+Error classes are a succinct, human-readable representation of the error category.
+
+Uncategorized errors can be assigned to a legacy error class with the prefix `_LEGACY_ERROR_TEMP_` and an unused sequential number, for instance `_LEGACY_ERROR_TEMP_0053`.
+
+Invariants:
+
+* Unique
+
+* Consistent across releases
+
+* Sorted alphabetically
+
+**Message**
+
+Error messages provide a descriptive, human-readable representation of the error.
+The message format accepts string parameters as named placeholders in angle brackets, for instance `<problem>` and `<cause>` in the `PROBLEM_BECAUSE` example above.
+
+The quality of the error message should match the `Apache Spark Error Message Guidelines <https://spark.apache.org/error-message-guidelines.html>`_.
+
+Invariants:
+
+* Unique
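
The workflow described in the added guideline (register an error class, raise with message parameters, branch on the error class when catching) can be sketched without a Spark installation. The block below is only an illustration under stated assumptions: `ERROR_CLASSES`, `SketchException`, `is_sorted_registry`, and `describe_failure` are hypothetical stand-ins and not part of `pyspark.errors`, and the `[ERROR_CLASS] message` rendering is a choice made for this sketch. Only the `PROBLEM_BECAUSE` entry and the `getErrorClass` / `getMessageParameters` method names come from the diff above.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the registry kept in error_classes.py.
# Keys are meant to be unique and sorted alphabetically (see the invariants).
ERROR_CLASSES = {
    "PROBLEM_BECAUSE": {
        "message": ["Problem <problem> because <cause>"],
    },
}


def is_sorted_registry() -> bool:
    """Check the 'sorted alphabetically' invariant on the sketch registry."""
    keys = list(ERROR_CLASSES)
    return keys == sorted(keys)


class SketchException(Exception):
    """Hypothetical stand-in for PySparkException; not the real class."""

    def __init__(self, error_class: str, message_parameters: Dict[str, str]):
        # Build the message by filling each <name> placeholder in the template.
        message = "\n".join(ERROR_CLASSES[error_class]["message"])
        for name, value in message_parameters.items():
            message = message.replace(f"<{name}>", value)
        # The "[CLASS] message" layout is an assumption of this sketch.
        super().__init__(f"[{error_class}] {message}")
        self._error_class = error_class
        self._message_parameters = message_parameters

    def getErrorClass(self) -> Optional[str]:
        return self._error_class

    def getMessageParameters(self) -> Optional[Dict[str, str]]:
        return self._message_parameters


def describe_failure() -> str:
    """Raise with an error class, then branch on it in the handler."""
    try:
        raise SketchException("PROBLEM_BECAUSE", {"problem": "A", "cause": "B"})
    except SketchException as e:
        if e.getErrorClass() == "PROBLEM_BECAUSE":
            return str(e)
        raise
```

Branching on the error class rather than on the message text means the handler keeps working if the wording of the message is later revised, which is presumably why the guideline asks for error classes to stay consistent across releases.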