[GitHub] [spark] maropu commented on a change in pull request #32552: [SPARK-34819][SQL] MapType supports comparable semantics

GitBox Mon, 17 May 2021 11:16:56 -0700


maropu commented on a change in pull request #32552:
URL: https://github.com/apache/spark/pull/32552#discussion_r633204609




##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapData.scala
##########
@@ -32,6 +32,23 @@ class ArrayBasedMapData(val keyArray: ArrayData, val valueArray: ArrayData) exte
 
   override def copy(): MapData = new ArrayBasedMapData(keyArray.copy(), valueArray.copy())
 
+  override def equals(o: Any): Boolean = {
+    if (!o.isInstanceOf[ArrayBasedMapData]) {
+      return false
+    }
+
+    val other = o.asInstanceOf[ArrayBasedMapData]
+    if (other eq null) {
+      return false
+    }
+
+    this.keyArray == other.keyArray && this.valueArray == other.valueArray
+  }
+
+  override def hashCode: Int = {
+    keyArray.hashCode() * 37 + valueArray.hashCode()
+  }
+

Review comment:
       This part was copied from https://github.com/apache/spark/pull/13847
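
For readers without the linked PR at hand, here is a minimal sketch of the same equals/hashCode pattern on a map stored as two parallel arrays. `SimpleArrayMap` is a hypothetical stand-in, not a Spark class, and `Arrays.equals` / `Arrays.hashCode` are assumed substitutes for whatever `ArrayData` does internally.

```java
import java.util.Arrays;

// Hypothetical stand-in for ArrayBasedMapData: keys and values live in two parallel arrays.
final class SimpleArrayMap {
    final Object[] keys;
    final Object[] values;

    SimpleArrayMap(Object[] keys, Object[] values) {
        this.keys = keys;
        this.values = values;
    }

    @Override
    public boolean equals(Object o) {
        // instanceof is false for null, so this check also rejects null.
        if (!(o instanceof SimpleArrayMap)) {
            return false;
        }
        SimpleArrayMap other = (SimpleArrayMap) o;
        // Position-by-position comparison of both arrays: entry order matters here.
        return Arrays.equals(keys, other.keys) && Arrays.equals(values, other.values);
    }

    @Override
    public int hashCode() {
        // Same combining scheme as in the diff: keyHash * 37 + valueHash.
        return Arrays.hashCode(keys) * 37 + Arrays.hashCode(values);
    }
}
```

In this sketch, two maps holding the same entries in a different order compare as unequal, because both arrays are compared positionally.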

##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeMapData.java
##########
@@ -112,6 +114,22 @@ public UnsafeArrayData valueArray() {
     return values;
   }
 
+  @Override
+  public int hashCode() {
+    return Murmur3_x86_32.hashUnsafeBytes(baseObject, baseOffset, sizeInBytes, 42);
+  }
+
+  @Override
+  public boolean equals(Object other) {
+    if (other instanceof UnsafeMapData) {
+      UnsafeMapData o = (UnsafeMapData) other;
+      return (sizeInBytes == o.sizeInBytes) &&
+        ByteArrayMethods.arrayEquals(baseObject, baseOffset, o.baseObject, o.baseOffset,
+          sizeInBytes);
+    }
+    return false;
+  }

Review comment:
       This part is the same as the `UnsafeArrayData` implementation.
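
The idea in the diff is byte-wise equality and hashing over the object's backing memory region. The sketch below shows that pattern on a plain `byte[]`. `ByteRegion` is a hypothetical class, the manual loop stands in for `ByteArrayMethods.arrayEquals`, and the seeded polynomial hash is only a placeholder for Murmur3, so the hash values will not match Spark's.

```java
// Hypothetical stand-in for an UnsafeMapData-like view: a byte region inside a backing array.
final class ByteRegion {
    final byte[] base;
    final int offset;
    final int sizeInBytes;

    ByteRegion(byte[] base, int offset, int sizeInBytes) {
        this.base = base;
        this.offset = offset;
        this.sizeInBytes = sizeInBytes;
    }

    @Override
    public boolean equals(Object other) {
        if (other instanceof ByteRegion) {
            ByteRegion o = (ByteRegion) other;
            // Cheap size check first, then compare the two regions byte by byte.
            if (sizeInBytes != o.sizeInBytes) {
                return false;
            }
            for (int i = 0; i < sizeInBytes; i++) {
                if (base[offset + i] != o.base[o.offset + i]) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    @Override
    public int hashCode() {
        // Placeholder for Murmur3: a seeded polynomial hash over the same bytes (seed 42).
        int h = 42;
        for (int i = 0; i < sizeInBytes; i++) {
            h = 31 * h + base[offset + i];
        }
        return h;
    }
}
```

Because only the bytes inside the region are read, two regions with identical content are equal and hash alike even when they sit at different offsets in different backing arrays.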

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
##########
@@ -466,20 +453,8 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
             // aggregates have different column expressions.
             val distinctExpressions =
               functionsWithDistinct.head.aggregateFunction.children.filterNot(_.foldable)
-            val normalizedNamedDistinctExpressions = distinctExpressions.map { e =>
-              // Ideally this should be done in `NormalizeFloatingNumbers`, but we do it here
-              // because `distinctExpressions` is not extracted during logical phase.
-              NormalizeFloatingNumbers.normalize(e) match {
-                case ne: NamedExpression => ne
-                case other =>
-                  // Keep the name of the original expression.
-                  val name = e match {
-                    case ne: NamedExpression => ne.name
-                    case _ => e.toString
-                  }
-                  Alias(other, name)()
-              }
-            }
+            val normalizedNamedDistinctExpressions =
+              AggUtils.normalizeDistinctGroupingExprs(distinctExpressions)

Review comment:
       `normalizeDistinctGroupingExprs` is called only once, but I moved this logic into `AggUtils` so that similar logic is kept in one place.
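
The removed block normalizes each distinct expression and, if the result is no longer a named expression, wraps it in an alias that keeps the original name. A minimal sketch of that control flow follows. All types here (`Expr`, `Named`, `Column`, `Normalized`, `Alias`, `DistinctExprs`) are hypothetical miniatures, not Spark classes, and the toy `normalize` only marks floating-point columns instead of performing real normalization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature expression model; none of these types are Spark classes.
interface Expr {}

interface Named extends Expr {
    String name();
}

final class Column implements Named {
    private final String name;
    final boolean floating;

    Column(String name, boolean floating) {
        this.name = name;
        this.floating = floating;
    }

    public String name() { return name; }

    @Override
    public String toString() { return name; }
}

// Marks an expression as normalized; it has no name of its own.
final class Normalized implements Expr {
    final Expr child;

    Normalized(Expr child) { this.child = child; }

    @Override
    public String toString() { return "normalized(" + child + ")"; }
}

final class Alias implements Named {
    final Expr child;
    private final String name;

    Alias(Expr child, String name) {
        this.child = child;
        this.name = name;
    }

    public String name() { return name; }
}

final class DistinctExprs {
    // Toy normalization (an assumption): only floating-point columns are rewritten.
    static Expr normalize(Expr e) {
        if (e instanceof Column && ((Column) e).floating) {
            return new Normalized(e);
        }
        return e;
    }

    // Mirrors the control flow of the removed block: normalize, then re-attach a name if needed.
    static List<Named> normalizeAndKeepNames(List<Expr> exprs) {
        List<Named> out = new ArrayList<>();
        for (Expr e : exprs) {
            Expr n = normalize(e);
            if (n instanceof Named) {
                out.add((Named) n);
            } else {
                // Keep the name of the original expression, falling back to its string form.
                String name = (e instanceof Named) ? ((Named) e).name() : e.toString();
                out.add(new Alias(n, name));
            }
        }
        return out;
    }
}
```

Keeping the original name means output attributes still resolve under the same name after normalization, which is why the alias is added rather than returning the rewritten expression directly.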



