[ 
https://issues.apache.org/jira/browse/SPARK-46810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Chammas updated SPARK-46810:
-------------------------------------
    Description: 
We use inconsistent terminology when talking about error classes. I'd like to 
get some clarity on that before contributing any potential improvements to this 
part of the documentation.

Consider 
[INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html].
 It has several key pieces of hierarchical information that have inconsistent 
names throughout our documentation and codebase:
 * 42
 ** K01
 *** INCOMPLETE_TYPE_DEFINITION
 **** ARRAY
 **** MAP
 **** STRUCT

What are the names of these different levels of information?

Some examples of inconsistent terminology:
 * [Over 
here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation]
 we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we 
call INCOMPLETE_TYPE_DEFINITION itself an "error class". So which one is the 
class: 42 or INCOMPLETE_TYPE_DEFINITION?
 * [Over 
here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122]
 we call K01 the "subclass". But [over 
here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467]
 we call ARRAY, MAP, and STRUCT the subclasses. And on the main page for 
INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes". 
So what exactly is a subclass?

I propose the following terminology, which we should use consistently 
throughout our code and documentation:
 * Error class: 42
 * Error subclass: K01
 * Error state: 42K01
 * Error condition: INCOMPLETE_TYPE_DEFINITION
 * Error sub-conditions: ARRAY, MAP, STRUCT
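
To make the proposed levels concrete, here is a minimal Python sketch of how they could relate to one another. The `ErrorIdentity` class and its property names are hypothetical and exist only for illustration; they are not part of Spark. The sketch relies on one fact: a SQLSTATE is five characters, a two-character class followed by a three-character subclass. The dotted `CONDITION.SUB_CONDITION` form is assumed here purely as a display convention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorIdentity:
    """Hypothetical illustration of the proposed terminology; not a Spark API."""

    state: str                           # error state (SQLSTATE), e.g. "42K01"
    condition: str                       # error condition
    sub_condition: Optional[str] = None  # error sub-condition, if any

    def __post_init__(self) -> None:
        # A SQLSTATE is five characters: class (2) + subclass (3).
        if len(self.state) != 5:
            raise ValueError(f"expected a 5-character error state, got {self.state!r}")

    @property
    def error_class(self) -> str:
        # Error class: the first two characters of the error state.
        return self.state[:2]

    @property
    def error_subclass(self) -> str:
        # Error subclass: the last three characters of the error state.
        return self.state[2:]

    @property
    def qualified_condition(self) -> str:
        # Assumed display form: condition, plus ".SUB_CONDITION" when present.
        if self.sub_condition is None:
            return self.condition
        return f"{self.condition}.{self.sub_condition}"


err = ErrorIdentity("42K01", "INCOMPLETE_TYPE_DEFINITION", "ARRAY")
print(err.error_class, err.error_subclass, err.state, err.qualified_condition)
# prints: 42 K01 42K01 INCOMPLETE_TYPE_DEFINITION.ARRAY
```

Deriving the class and subclass from the state, rather than storing them separately, mirrors the point of the proposal: users mostly need the state and the condition, and the two halves of the state rarely matter on their own.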

Side note: With this terminology, I believe talking about error classes and 
subclasses in front of users is not helpful. I don't think anybody cares about 
what "42" by itself means, or what "K01" by itself means. Accordingly, we 
should limit how much we talk about these concepts in the user-facing 
documentation.

  was:
We use inconsistent terminology when talking about error classes. I'd like to 
get some clarity on that before contributing any potential improvements to this 
part of the documentation.

Consider 
[INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html].
 It has several key pieces of hierarchical information that have inconsistent 
names throughout our documentation and codebase:
 * 42
 ** K01
 *** INCOMPLETE_TYPE_DEFINITION
 **** ARRAY
 **** MAP
 **** STRUCT

What are the names of these different levels of information?

Some examples of inconsistent terminology:
 * [Over 
here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation]
 we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we 
call that an "error class". So what exactly is a class, the 42 or the 
INCOMPLETE_TYPE_DEFINITION?
 * [Over 
here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122]
 we call K01 the "subclass". But [over 
here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467]
 we call the ARRAY, MAP, and STRUCT the subclasses. And on the main page for 
INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes". 
So what exactly is a subclass?

I propose the following terminology, which we should use consistently 
throughout our code and documentation:
 * Error class: 42
 * Error subclass: K01
 * Error state: 42K01
 * Error condition: INCOMPLETE_TYPE_DEFINITION
 * Error sub-conditions: ARRAY, MAP, STRUCT

Side note: With this terminology, I believe talking about error classes and 
subclasses in front of users is not helpful. I don't think anybody cares about 
what 42 by itself means, or what K01 by itself means. Accordingly, we should 
limit how much we talk about these concepts.


> Clarify error class terminology
> -------------------------------
>
>                 Key: SPARK-46810
>                 URL: https://issues.apache.org/jira/browse/SPARK-46810
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, SQL
>    Affects Versions: 4.0.0
>            Reporter: Nicholas Chammas
>            Priority: Minor


