cloud-fan commented on a change in pull request #27580: 
[SPARK-27619][SQL]MapType should be prohibited in hash expressions
URL: https://github.com/apache/spark/pull/27580#discussion_r379404394
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala
 ##########
 @@ -249,10 +247,19 @@ abstract class HashExpression[E] extends Expression {
 
   override def nullable: Boolean = false
 
+  protected def hasMapType(dt: DataType): Boolean = {
+    dt.existsRecursively(_.isInstanceOf[MapType])
+  }
+
   override def checkInputDataTypes(): TypeCheckResult = {
     if (children.length < 1) {
       TypeCheckResult.TypeCheckFailure(
         s"input to function $prettyName requires at least one argument")
+    } else if (children.forall(child => hasMapType(child.dataType)) &&
+      !SQLConf.get.getConf(SQLConf.LEGACY_USE_HASH_ON_MAPTYPE)) {
+      TypeCheckResult.TypeCheckFailure(
+        s"input to function $prettyName cannot contain elements of MapType. To restore previous " +
+          s"behavior set spark.sql.legacy.useHashOnMapType to true.")
 
 Review comment:
   The error message should also warn users about the consequence of restoring the previous behavior: logically equal maps may have different hash codes.
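
   To illustrate the concern, here is a minimal sketch in plain Scala (no Spark dependency). It assumes an order-sensitive hash in which each entry is folded into a running value that seeds the next step. `chainedHash` and `MapHashOrderDemo` are hypothetical names for this example, and the function is a simplified stand-in, not the code in hash.scala.

```scala
import scala.util.hashing.MurmurHash3

object MapHashOrderDemo {
  // Hypothetical stand-in for an order-sensitive map hash: each key and value
  // is mixed into the running hash, which then seeds the next entry.
  def chainedHash(entries: Iterable[(String, Int)], seed: Int = 42): Int =
    entries.foldLeft(seed) { case (h, (k, v)) =>
      MurmurHash3.mix(MurmurHash3.mix(h, k.hashCode), v)
    }

  def main(args: Array[String]): Unit = {
    // Same entries, different physical order.
    val a = Seq("x" -> 1, "y" -> 2)
    val b = a.reverse

    // As maps, the two are logically equal.
    println(s"equal as maps: ${a.toMap == b.toMap}")

    // The chained hash depends on iteration order, so the two values
    // are expected to differ even though the maps compare equal.
    println(s"same hash:     ${chainedHash(a) == chainedHash(b)}")
  }
}
```

   Under these assumptions, a grouping or join keyed on such a hash could place two equal maps in different buckets, which is the consequence the message should call out.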
