MaxGekk commented on code in PR #56611:
URL: https://github.com/apache/spark/pull/56611#discussion_r3449034560
##########
common/utils/src/main/resources/error/error-conditions.json:
##########
@@ -8167,12 +8167,12 @@
},
"SET_OPERATION_ON_MAP_TYPE" : {
"message" : [
- "Cannot have MAP type columns in DataFrame which calls set
operations (INTERSECT, EXCEPT, etc.), but the type of column <colName> is
<dataType>."
+ "Cannot have MAP type columns in a set operation (INTERSECT, EXCEPT,
etc.), but the type of column <colName> is <dataType>."
Review Comment:
Nice fix -- the old "DataFrame which calls set operations" wording was
misleading, since these errors are raised for SQL set operations too.
One small thing the new golden file brings up: `SELECT DISTINCT m FROM
map_view` and the bare `UNION` query (de-duplication rather than
INTERSECT/EXCEPT) also raise `SET_OPERATION_ON_MAP_TYPE`/`..._VARIANT_TYPE` in
the new `.sql.out`. So a user who wrote `SELECT DISTINCT m` would now see
"Cannot have MAP type columns in a set operation (INTERSECT, EXCEPT, etc.)"
for a query that contains no set operation.
Optional: acknowledge the de-duplication path in the wording, e.g. "...in a
set operation or DISTINCT/de-duplication (INTERSECT, EXCEPT, DISTINCT, ...)".
This framing is pre-existing and the error-class name itself is
`SET_OPERATION_ON_*`, so it's reasonable to keep the current wording -- the PR
is already a strict improvement. Non-blocking.
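
To make the `SELECT DISTINCT m` case concrete, here is a minimal sketch of the
text a user might end up reading. It only mimics `<placeholder>` substitution
on the new template from the diff; the `render` helper and the parameter values
(column `m`, type `MAP<INT, INT>`) are assumptions for illustration, not
Spark's actual formatting code or the real contents of `map_view`.

```python
import re

# Message template copied from the "+" side of the diff hunk.
TEMPLATE = (
    "Cannot have MAP type columns in a set operation (INTERSECT, EXCEPT, etc.), "
    "but the type of column <colName> is <dataType>."
)


def render(template, params):
    """Hypothetical helper: replace each <name> placeholder with params[name]."""
    return re.sub(r"<(\w+)>", lambda m: params[m.group(1)], template)


# Illustrative (made-up) parameters for `SELECT DISTINCT m FROM map_view`.
message = render(TEMPLATE, {"colName": "`m`", "dataType": '"MAP<INT, INT>"'})
print(message)
```

The rendered text names INTERSECT and EXCEPT but never DISTINCT, which is the
mismatch described above.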
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]