[ 
https://issues.apache.org/jira/browse/SPARK-33917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18090335#comment-18090335
 ] 

Piotr Piecha commented on SPARK-33917:
--------------------------------------

This ticket can be closed. The error message mentioned in the description is 
no longer misleading. See the details below.

I retested it with the following test, in which the WHEN NOT MATCHED clause 
references a column (s.outstanding) that does not exist in the source table:
{code:scala}
  test("merge into table with analysis failure when not matched and transactional checks") {
    createAndInitTable(
      "pk INT NOT NULL, salary INT, dep STRING, outstanding STRING",
      """{ "pk": 1, "salary": 100, "dep": "hr", "outstanding": "yes" }
        |{ "pk": 2, "salary": 200, "dep": "software", "outstanding": "no" }
        |{ "pk": 3, "salary": 300, "dep": "hr", "outstanding": "yes" }
        |""".stripMargin)

    sql(s"CREATE TABLE $sourceNameAsString (pk INT NOT NULL, salary INT, dep STRING)")
    sql(s"INSERT INTO $sourceNameAsString VALUES (1, 150, 'support'), (4, 400, 'finance')")

    //val exception = intercept[AnalysisException] {
    val (txn, txnTables) = executeTransaction {
      sql(
        s"""MERGE INTO $tableNameAsString t
           |USING $sourceNameAsString s
           |ON t.pk = s.pk
           |WHEN MATCHED THEN
           | UPDATE SET salary = 1
           |WHEN NOT MATCHED THEN
           | INSERT (pk, salary, dep, outstanding) VALUES (s.pk, s.salary, 'pending', s.outstanding)
           |""".stripMargin)
    }

    // assert(exception.getMessage.contains("invalid_column"))
    // assert(catalog.lastTransaction.currentState == Aborted)
    // assert(catalog.lastTransaction.isClosed)
  } 
{code}
The error message is:
{noformat}
org.apache.spark.sql.AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column, variable, or function parameter with name `s`.`outstanding` cannot be resolved. Did you mean one of the following? [`dep`, `pk`, `salary`].
{noformat}
The suggestions in the error message include only columns from the source 
table; the target-only column (outstanding) is not offered.
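
As a rough, standalone sketch (not part of the Spark test suite), the 
observation above could be checked mechanically by parsing the suggestion list 
out of the message text and comparing it with the source table's columns. The 
object name SuggestionCheck and the helper suggestedColumns are hypothetical, 
and the parsing assumes the unqualified, back-quoted format shown in the 
message above.
{code:scala}
object SuggestionCheck {
  // Hypothetical helper: extracts the back-quoted names from the
  // "Did you mean one of the following? [...]" part of the message.
  // Assumes unqualified names such as `dep`, as in the message above.
  def suggestedColumns(message: String): Set[String] = {
    val listPattern = """Did you mean one of the following\? \[(.*?)\]""".r
    listPattern.findFirstMatchIn(message) match {
      case Some(m) =>
        m.group(1)
          .split(",")
          .map(_.trim.stripPrefix("`").stripSuffix("`"))
          .filter(_.nonEmpty)
          .toSet
      case None => Set.empty[String]
    }
  }

  def main(args: Array[String]): Unit = {
    // Message text copied from the retest output above.
    val message =
      "[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column, variable, or function " +
        "parameter with name `s`.`outstanding` cannot be resolved. " +
        "Did you mean one of the following? [`dep`, `pk`, `salary`]."

    // Columns of the source table created in the test.
    val sourceColumns = Set("pk", "salary", "dep")

    val suggested = suggestedColumns(message)
    assert(suggested.nonEmpty, "expected a suggestion list in the message")
    assert(suggested.subsetOf(sourceColumns), "suggestions should come from the source table only")
    assert(!suggested.contains("outstanding"), "target-only column should not be suggested")
  }
}
{code}
Inside the actual test, the same comparison would presumably be applied to 
exception.getMessage once the commented-out intercept block is restored.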

> Fix misleading error message for unresolved attributes inside INSERT action 
> of MERGE INTO
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-33917
>                 URL: https://issues.apache.org/jira/browse/SPARK-33917
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0, 3.2.0
>            Reporter: Anton Okolnychyi
>            Priority: Minor
>
> Per spec, INSERT assignments are resolved only against the source table in 
> MERGE operations. However, the error message does not take this into account 
> and prints all columns from both the source and target tables. This leads to 
> confusing error messages.
> {{cannot resolve '`id`' given input columns: [t.c, s.c1, s.c2, t.id]}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
