Re: [PR] [SPARK-47353][SQL] Enable collation support for the Mode expression [spark]

via GitHub Sat, 15 Jun 2024 09:54:55 -0700


GideonPotok commented on code in PR #46597:
URL: https://github.com/apache/spark/pull/46597#discussion_r1641301489



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Mode.scala:
##########
@@ -74,16 +90,25 @@ case class Mode(
     if (buffer.isEmpty) {
       return null
     }
-
+    val collationAwareBuffer = child.dataType match {
+      case c: StringType if
+        !CollationFactory.fetchCollation(c.collationId).supportsBinaryEquality 
=>
+        val collationId = c.collationId
+        val modeMap = buffer.toSeq.groupMapReduce {

Review Comment:
   @dbatomic see comment here 
https://github.com/apache/spark/pull/46917#discussion_r1640081873
   
   And let us know if on board to continue with this approach. 
   
   It's always an option to not have mode support collations. But if supporting 
collations, this is the best way to go. We have tried a bunch of approaches and 
this way is simple, tested, decoupled and at the correct layer of abstraction, 
and easy to modify. 
   
   I can also add support for complex types and for PandasMode in this pr once 
I know you are on board with this approach. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-47353][SQL] Enable collation support for the Mode expression [spark]

Reply via email to