[https://issues.apache.org/jira/browse/SPARK-39354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545016#comment-17545016]
Yang Jie commented on SPARK-39354:
----------------------------------
This issue was introduced by SPARK-38118:
[https://github.com/apache/spark/blob/5a3ba9b0b301a3b0c43f8d0d88e2b6bdce57d0e6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L4353-L4372]
{code:java}
// HAVING clause will be resolved as a Filter. When having func(column with wrong data type),
// the column could be wrapped by a TempResolvedColumn, e.g. mean(tempresolvedcolumn(t.c)).
// Because TempResolvedColumn can still preserve column data type, here is a chance to check
// if the data type matches with the required data type of the function. We can throw an error
// when data types mismatches.
case operator: Filter =>
  operator.expressions.foreach(_.foreachUp {
    case e: Expression if e.childrenResolved && e.checkInputDataTypes().isFailure =>
      e.checkInputDataTypes() match {
        case TypeCheckResult.TypeCheckFailure(message) =>
          e.setTagValue(DATA_TYPE_MISMATCH_ERROR, true)
          e.failAnalysis(
            s"cannot resolve '${e.sql}' due to data type mismatch: $message" +
              extraHintForAnsiTypeCoercionExpression(plan))
      }
    case _ =>
  })
case _ =>
{code}
The `case operator: Filter =>` match is too broad: it fires for any Filter with a failing type check, even when the real problem is elsewhere in the plan (here, the unresolved table t2). Some restriction should be added so this early check only applies to the HAVING case it was written for.
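One possible restriction, sketched below purely for illustration (this is not the actual patch for SPARK-39354): only run the early type check when the failing expression actually references a TempResolvedColumn, which is the HAVING scenario the code comment describes. Filters whose type check fails for other reasons, such as a join against the missing table t2, would then fall through to the normal "table or view not found" error.
{code:java}
// Sketch only: gate the early data-type check on the presence of a
// TempResolvedColumn, so that unrelated resolution failures (e.g. the
// unresolved relation t2) are reported by the usual resolution path instead.
case operator: Filter =>
  operator.expressions.foreach(_.foreachUp {
    case e: Expression
        if e.find(_.isInstanceOf[TempResolvedColumn]).isDefined &&
          e.childrenResolved && e.checkInputDataTypes().isFailure =>
      // ... existing TypeCheckFailure handling ...
    case _ =>
  })
case _ =>
{code}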
> The analysis exception is incorrect
> -----------------------------------
>
> Key: SPARK-39354
> URL: https://issues.apache.org/jira/browse/SPARK-39354
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Yuming Wang
> Priority: Minor
>
> {noformat}
> scala> spark.sql("create table t1(user_id int, auct_end_dt date) using parquet;")
> res0: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("select * from t1 join t2 on t1.user_id = t2.user_id where t1.auct_end_dt >= Date_sub('2020-12-27', 90)").show
> org.apache.spark.sql.AnalysisException: cannot resolve 'date_sub('2020-12-27', 90)' due to data type mismatch: argument 1 requires date type, however, ''2020-12-27'' is of string type.; line 1 pos 76
>   at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>   at org.apache.spark.sql.catalyst.analysis.RemoveTempResolvedColumn$.$anonfun$apply$82(Analyzer.scala:4334)
>   at org.apache.spark.sql.catalyst.analysis.RemoveTempResolvedColumn$.$anonfun$apply$82$adapted(Analyzer.scala:4327)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:365)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:364)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:364)
> {noformat}
> The analysis exception should be:
> {noformat}
> org.apache.spark.sql.AnalysisException: Table or view not found: t2
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.7#820007)