[
https://issues.apache.org/jira/browse/SPARK-33917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090335#comment-18090335
]
Piotr Piecha commented on SPARK-33917:
--------------------------------------
This ticket can be closed. The error message mentioned in the description is
no longer misleading; see the details below.
I retested it using the following test, in which the WHEN NOT MATCHED clause
references a column (s.outstanding) that is not present in the source table:
{code:scala}
test("merge into table with analysis failure when not matched and transactional checks") {
  createAndInitTable(
    "pk INT NOT NULL, salary INT, dep STRING, outstanding STRING",
    """{ "pk": 1, "salary": 100, "dep": "hr", "outstanding": "yes" }
      |{ "pk": 2, "salary": 200, "dep": "software", "outstanding": "no" }
      |{ "pk": 3, "salary": 300, "dep": "hr", "outstanding": "yes" }
      |""".stripMargin)

  sql(s"CREATE TABLE $sourceNameAsString (pk INT NOT NULL, salary INT, dep STRING)")
  sql(s"INSERT INTO $sourceNameAsString VALUES (1, 150, 'support'), (4, 400, 'finance')")

  //val exception = intercept[AnalysisException] {
  val (txn, txnTables) = executeTransaction {
    sql(
      s"""MERGE INTO $tableNameAsString t
         |USING $sourceNameAsString s
         |ON t.pk = s.pk
         |WHEN MATCHED THEN
         | UPDATE SET salary = 1
         |WHEN NOT MATCHED THEN
         | INSERT (pk, salary, dep, outstanding) VALUES (s.pk, s.salary, 'pending', s.outstanding)
         |""".stripMargin)
  }
  // assert(exception.getMessage.contains("invalid_column"))
  // assert(catalog.lastTransaction.currentState == Aborted)
  // assert(catalog.lastTransaction.isClosed)
}
{code}
The error message is:
{noformat}
org.apache.spark.sql.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A
column, variable, or function parameter with name `s`.`outstanding` cannot be
resolved. Did you mean one of the following? [`dep`, `pk`, `salary`].
{noformat}
The error message now lists only the columns of the source table, so it no
longer suggests columns from the target table.
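For illustration only, here is a minimal, self-contained sketch of the
behaviour observed above: references inside the INSERT action are looked up
among the source columns only, and a failed lookup suggests only those
columns. This is a simplified model and not Spark's analyzer code; the names
MergeInsertResolution, Column, insertCandidates and resolveInsertRef are
hypothetical, and the alphabetical ordering of the suggestions is an
assumption made to keep the sketch deterministic.
{code:scala}
// Hypothetical, simplified model of the resolution rule; not Spark's implementation.
object MergeInsertResolution {
  final case class Column(qualifier: String, name: String)

  // Columns that an INSERT action may reference: the source table's columns only.
  // The target columns are accepted as a parameter but deliberately ignored.
  def insertCandidates(source: Seq[Column], target: Seq[Column]): Seq[Column] =
    source

  // Right(column) if the reference resolves; otherwise Left(suggested names),
  // where the suggestions are drawn from the same candidate set.
  def resolveInsertRef(
      qualifier: String,
      name: String,
      source: Seq[Column],
      target: Seq[Column]): Either[Seq[String], Column] = {
    val candidates = insertCandidates(source, target)
    candidates
      .find(c => c.qualifier == qualifier && c.name == name)
      .toRight(candidates.map(_.name).sorted) // sorting is an assumption of this sketch
  }

  def main(args: Array[String]): Unit = {
    // Same shape as the tables in the test: only the target has `outstanding`.
    val source = Seq(Column("s", "pk"), Column("s", "salary"), Column("s", "dep"))
    val target = Seq(Column("t", "pk"), Column("t", "salary"), Column("t", "dep"),
      Column("t", "outstanding"))
    println(resolveInsertRef("s", "outstanding", source, target))
    println(resolveInsertRef("s", "pk", source, target))
  }
}
{code}
With the tables from the test, looking up s.outstanding in this model fails
and suggests dep, pk and salary. t.outstanding is never offered, because the
target columns are not part of the candidate set.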
> Fix misleading error message for unresolved attributes inside INSERT action
> of MERGE INTO
> -----------------------------------------------------------------------------------------
>
> Key: SPARK-33917
> URL: https://issues.apache.org/jira/browse/SPARK-33917
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0, 3.1.0, 3.2.0
> Reporter: Anton Okolnychyi
> Priority: Minor
>
> Per spec, INSERT assignments are resolved only against the source table in
> MERGE operations. However, the error message does not take this into account
> and prints all columns from both the source and target tables. This leads to
> confusing error messages.
> {{cannot resolve '`id`' given input columns: [t.c, s.c1, s.c2, t.id]}}