[https://issues.apache.org/jira/browse/SPARK-39354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545016#comment-17545016]

Yang Jie commented on SPARK-39354:
----------------------------------

This issue was introduced by SPARK-38118:

 

[https://github.com/apache/spark/blob/5a3ba9b0b301a3b0c43f8d0d88e2b6bdce57d0e6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L4353-L4372]

 

 
{code:java}
      // HAVING clause will be resolved as a Filter. When having func(column with wrong data type),
      // the column could be wrapped by a TempResolvedColumn, e.g. mean(tempresolvedcolumn(t.c)).
      // Because TempResolvedColumn can still preserve column data type, here is a chance to check
      // if the data type matches with the required data type of the function. We can throw an error
      // when data types mismatches.
      case operator: Filter =>
        operator.expressions.foreach(_.foreachUp {
          case e: Expression if e.childrenResolved && e.checkInputDataTypes().isFailure =>
            e.checkInputDataTypes() match {
              case TypeCheckResult.TypeCheckFailure(message) =>
                e.setTagValue(DATA_TYPE_MISMATCH_ERROR, true)
                e.failAnalysis(
                  s"cannot resolve '${e.sql}' due to data type mismatch: $message" +
                    extraHintForAnsiTypeCoercionExpression(plan))
            }
          case _ =>
        })
      case _ => {code}
 

`case operator: Filter =>` is too broad: it matches every Filter in the plan, not only the Filters produced by resolving a HAVING clause, so some restriction should be added to it.
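One possible restriction (a sketch of the idea only, not the actual Spark patch; the types below are simplified stand-ins for the real Catalyst `Expression` tree) is to run the data-type check only on Filters whose expressions actually contain a `TempResolvedColumn`, i.e. Filters that came from resolving a HAVING clause. Plain WHERE filters would then be skipped, letting other analysis errors such as an unresolved table surface first:
{code:scala}
// Minimal stand-ins for Catalyst expression types (hypothetical; the real
// code uses org.apache.spark.sql.catalyst.expressions.Expression).
sealed trait Expr { def children: Seq[Expr] }
case class TempResolvedColumn(name: String) extends Expr {
  val children: Seq[Expr] = Nil
}
case class Column(name: String) extends Expr {
  val children: Seq[Expr] = Nil
}
case class FuncCall(name: String, args: Seq[Expr]) extends Expr {
  val children: Seq[Expr] = args
}

// Recursively look for a TempResolvedColumn anywhere in the expression tree.
def containsTempResolvedColumn(e: Expr): Boolean = e match {
  case _: TempResolvedColumn => true
  case other                 => other.children.exists(containsTempResolvedColumn)
}

// Proposed guard: only fire the data-type check when some expression of the
// Filter wraps a TempResolvedColumn, which is how HAVING resolution marks
// columns. Ordinary WHERE filters contain none and are left alone.
def shouldCheckDataTypes(filterExprs: Seq[Expr]): Boolean =
  filterExprs.exists(containsTempResolvedColumn)
{code}
With this guard, `shouldCheckDataTypes(Seq(FuncCall("mean", Seq(TempResolvedColumn("t.c")))))` returns true (the HAVING case the comment describes), while `shouldCheckDataTypes(Seq(FuncCall("date_sub", Seq(Column("'2020-12-27'"), Column("90")))))` returns false, so the WHERE filter from the report below would no longer mask the missing-table error.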

 

 

> The analysis exception is incorrect
> -----------------------------------
>
>                 Key: SPARK-39354
>                 URL: https://issues.apache.org/jira/browse/SPARK-39354
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Yuming Wang
>            Priority: Minor
>
> {noformat}
> scala> spark.sql("create table t1(user_id int, auct_end_dt date) using parquet;")
> res0: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("select * from t1 join t2 on t1.user_id = t2.user_id where t1.auct_end_dt >= Date_sub('2020-12-27', 90)").show
> org.apache.spark.sql.AnalysisException: cannot resolve 'date_sub('2020-12-27', 90)' due to data type mismatch: argument 1 requires date type, however, ''2020-12-27'' is of string type.; line 1 pos 76
>   at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>   at org.apache.spark.sql.catalyst.analysis.RemoveTempResolvedColumn$.$anonfun$apply$82(Analyzer.scala:4334)
>   at org.apache.spark.sql.catalyst.analysis.RemoveTempResolvedColumn$.$anonfun$apply$82$adapted(Analyzer.scala:4327)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:365)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:364)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:364)
> {noformat}
> The analysis exception should be:
> {noformat}
> org.apache.spark.sql.AnalysisException: Table or view not found: t2
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
