huaxingao commented on a change in pull request #34914:
URL: https://github.com/apache/spark/pull/34914#discussion_r777785755



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/expressions/expressions.scala
##########
@@ -106,26 +106,31 @@ private[sql] final case class BucketTransform(
     columns: Seq[NamedReference],
     sortedColumns: Seq[NamedReference] = Seq.empty[NamedReference]) extends 
RewritableTransform {
 
-  override val name: String = "bucket"
+  override val name: String = if (sortedColumns.nonEmpty) "sortedBucket" else "bucket"
 
   override def references: Array[NamedReference] = {
     arguments.collect { case named: NamedReference => named }
   }
 
-  override def arguments: Array[Expression] = numBuckets +: columns.toArray
-
-  override def describe: String =
+  override def arguments: Array[Expression] = {
     if (sortedColumns.nonEmpty) {
-      s"bucket(${arguments.map(_.describe).mkString(", ")}," +
-        s" ${sortedColumns.map(_.describe).mkString(", ")})"
+      (columns.toArray :+ numBuckets) ++ sortedColumns
     } else {
-      s"bucket(${arguments.map(_.describe).mkString(", ")})"
+      numBuckets +: columns.toArray

Review comment:
       If there are `sortedColumns`, we need `numBuckets` in between `columns` and `sortedColumns`, because we need a way to figure out which elements in the array belong to `columns` and which belong to `sortedColumns`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


