Nicholas Chammas created SPARK-46810:
----------------------------------------
Summary: Clarify error class terminology
Key: SPARK-46810
URL: https://issues.apache.org/jira/browse/SPARK-46810
Project: Spark
Issue Type: Improvement
Components: Documentation, SQL
Affects Versions: 4.0.0
Reporter: Nicholas Chammas
We use inconsistent terminology when talking about error classes. I'd like to
get some clarity on that before contributing any potential improvements to this
part of the documentation.
Consider
[INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html].
It has several key pieces of hierarchical information that have inconsistent
names throughout our documentation and codebase:
* 42
** K01
*** INCOMPLETE_TYPE_DEFINITION
**** ARRAY
**** MAP
**** STRUCT
What are the names of these different levels of information?
Some examples of inconsistent terminology:
* [Over here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation]
we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we
call INCOMPLETE_TYPE_DEFINITION itself an "error class". So what exactly is a
class: 42 or INCOMPLETE_TYPE_DEFINITION?
* [Over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122]
we call K01 the "subclass". But [over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467]
we call ARRAY, MAP, and STRUCT the subclasses. And on the main page for
INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes".
So what exactly is a subclass?
I propose the following terminology, which we should use consistently
throughout our code and documentation:
* Error class: 42
* Error subclass: K01
* Error state: 42K01
* Error condition: INCOMPLETE_TYPE_DEFINITION
* Error sub-conditions: ARRAY, MAP, STRUCT
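To make the proposed split concrete, here is a minimal Python sketch of how the five terms relate to one another. It is an illustration only, not Spark code: the `ErrorIdentity` class, its field names, and the dotted `CONDITION.SUB_CONDITION` form are hypothetical names chosen for this example. The one fact it relies on is that a SQLSTATE is five characters, a two-character class followed by a three-character subclass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorIdentity:
    """Hypothetical container for the proposed terminology (not a Spark API)."""

    state: str                           # error state (SQLSTATE), e.g. "42K01"
    condition: str                       # error condition
    sub_condition: Optional[str] = None  # error sub-condition, if any

    def __post_init__(self) -> None:
        # A SQLSTATE is five characters: a two-character class followed by
        # a three-character subclass.
        if len(self.state) != 5:
            raise ValueError(f"expected a 5-character error state, got {self.state!r}")

    @property
    def error_class(self) -> str:
        # Error class: the first two characters of the error state.
        return self.state[:2]

    @property
    def error_subclass(self) -> str:
        # Error subclass: the remaining three characters of the error state.
        return self.state[2:]

    @property
    def qualified_condition(self) -> str:
        # Dotted form is an assumption made for this sketch.
        if self.sub_condition is None:
            return self.condition
        return f"{self.condition}.{self.sub_condition}"


err = ErrorIdentity("42K01", "INCOMPLETE_TYPE_DEFINITION", "ARRAY")
print(err.error_class, err.error_subclass, err.state, err.qualified_condition)
```

Note that the class and subclass are derived from the error state rather than stored separately, which mirrors the point that they are rarely interesting on their own.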
Side note: With this terminology, I believe talking about error classes and
subclasses in front of users is not helpful. I don't think anybody cares what
42 or K01 means by itself. Accordingly, we should limit how much we talk about
these concepts in user-facing material.