Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22563#discussion_r220830491
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ---
    @@ -155,37 +155,35 @@ case class InSubquery(values: Seq[Expression], query: ListQuery)
     
     
       override def checkInputDataTypes(): TypeCheckResult = {
    -    val mismatchOpt = !DataType.equalsStructurally(query.dataType, value.dataType,
    -      ignoreNullability = true)
    -    if (mismatchOpt) {
    -      if (values.length != query.childOutputs.length) {
    -        TypeCheckResult.TypeCheckFailure(
    -          s"""
    -             |The number of columns in the left hand side of an IN subquery does not match the
    -             |number of columns in the output of subquery.
    -             |#columns in left hand side: ${values.length}.
    -             |#columns in right hand side: ${query.childOutputs.length}.
    -             |Left side columns:
    -             |[${values.map(_.sql).mkString(", ")}].
    -             |Right side columns:
    -             |[${query.childOutputs.map(_.sql).mkString(", ")}].""".stripMargin)
    -      } else {
    -        val mismatchedColumns = values.zip(query.childOutputs).flatMap {
    -          case (l, r) if l.dataType != r.dataType =>
    -            Seq(s"(${l.sql}:${l.dataType.catalogString}, ${r.sql}:${r.dataType.catalogString})")
    -          case _ => None
    -        }
    -        TypeCheckResult.TypeCheckFailure(
    -          s"""
    -             |The data type of one or more elements in the left hand side of an IN subquery
    -             |is not compatible with the data type of the output of the subquery
    -             |Mismatched columns:
    -             |[${mismatchedColumns.mkString(", ")}]
    -             |Left side:
    -             |[${values.map(_.dataType.catalogString).mkString(", ")}].
    -             |Right side:
    -             |[${query.childOutputs.map(_.dataType.catalogString).mkString(", ")}].""".stripMargin)
    +    if (values.length != query.childOutputs.length) {
    --- End diff --
    
    That's not how it's designed to work. The reason we have `CheckAnalysis` is 
to do as much analysis work as we can and report the most precise error at the 
end. Performance is not a big concern in the analyzer; giving users the best 
error message matters more. Although the error message is not a problem in this 
case, it's better to follow the designed approach.
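
    To illustrate the "most precise error" pattern the diff follows, here is a minimal standalone sketch (illustrative names, not Spark's actual `TypeCheckResult` API): check the column counts first, and only when the arity matches, descend into per-column type comparison so the narrowest applicable message is reported.

    ```scala
    // Standalone sketch of the two-stage check in InSubquery.checkInputDataTypes:
    // 1) arity mismatch gets its own message; 2) otherwise, report exactly which
    // columns have incompatible types. Names are hypothetical, not Spark's.
    object InCheckSketch {
      sealed trait CheckResult
      case object Success extends CheckResult
      case class Failure(msg: String) extends CheckResult

      // Each "expression" is modeled as a (sqlText, dataTypeName) pair.
      def checkInTypes(
          left: Seq[(String, String)],
          right: Seq[(String, String)]): CheckResult = {
        if (left.length != right.length) {
          // Most precise message for the arity case: report both counts.
          Failure(s"column count mismatch: ${left.length} vs ${right.length}")
        } else {
          // Same arity: pinpoint only the columns whose types differ.
          val mismatched = left.zip(right).collect {
            case ((lSql, lType), (rSql, rType)) if lType != rType =>
              s"($lSql:$lType, $rSql:$rType)"
          }
          if (mismatched.isEmpty) Success
          else Failure(s"mismatched columns: [${mismatched.mkString(", ")}]")
        }
      }

      def main(args: Array[String]): Unit = {
        println(checkInTypes(Seq(("a", "int")), Seq(("b", "int"), ("c", "string"))))
        println(checkInTypes(Seq(("a", "int")), Seq(("b", "string"))))
        println(checkInTypes(Seq(("a", "int")), Seq(("b", "int"))))
      }
    }
    ```

    Doing the cheap arity check before the per-column comparison is what lets the analyzer tell the user *which* of the two distinct problems they have, rather than one generic "types don't match" message.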


---
