AngersZhuuuu commented on a change in pull request #27861:
URL: https://github.com/apache/spark/pull/27861#discussion_r413534302



##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##########
@@ -1691,7 +1691,19 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
   override def visitWindowDef(ctx: WindowDefContext): WindowSpecDefinition = withOrigin(ctx) {
     // CLUSTER BY ... | PARTITION BY ... ORDER BY ...
     val partition = ctx.partition.asScala.map(expression)
-    val order = ctx.sortItem.asScala.map(visitSortItem)
+    val order = if (ctx.sortItem.asScala.nonEmpty) {
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else if (ctx.windowFrame != null &&
+      ctx.windowFrame().frameType.getType == SqlBaseParser.RANGE) {
+      // for RANGE window frame, we won't add default order spec
+      ctx.sortItem.asScala.map(visitSortItem)
+    } else {
+      // Same default behaviors like hive, when order spec is null
+      // set partition spec expression as order spec
+      ctx.partition.asScala.map { expr =>
+        SortOrder(expression(expr), Ascending, Ascending.defaultNullOrdering, Set.empty)

Review comment:
       > I think we should not fix it, because on the Spark side at least, the 
results will be non-deterministic. I doubt it is good to add this support only 
for compatibility with other DBMSes when the output is expected to be 
useless.
   > 
   > Maybe disallowing it is a better idea than finding another problem 
later, caused by the different and non-deterministic data.
   > 
   > Do you know of similar cases in other distributed DBMSs, such as Presto?
   
   But my fix adds a default order spec, so the result will be deterministic.
   With the original behavior, this kind of SQL cannot run at all: it would give 
a non-deterministic result, so it is rejected by 
https://github.com/apache/spark/blob/a28ed86a387b286745b30cd4d90b3d558205a5a7/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L2773-L2776
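
   To make the three branches in the diff easier to discuss, here is a minimal, 
self-contained sketch of the selection logic. It does not use Spark's classes: 
`WindowDef`, `SortKey`, `FrameType` and `resolveOrder` are hypothetical 
stand-ins for the parser context, `SortOrder` and the frame type, and the sketch 
only models which order spec is chosen, not how the window is evaluated.

```scala
object DefaultWindowOrderSketch {
  // Hypothetical stand-ins for the frame types seen by the parser.
  sealed trait FrameType
  case object RowsFrame extends FrameType
  case object RangeFrame extends FrameType

  // Hypothetical stand-in for SortOrder: a column name plus a direction.
  final case class SortKey(column: String, ascending: Boolean = true)

  // Hypothetical stand-in for the parsed window definition.
  final case class WindowDef(
      partitionBy: Seq[String],
      orderBy: Seq[SortKey],
      frame: Option[FrameType])

  // Mirrors the branch structure of the diff:
  //   1. an explicit ORDER BY is used as-is;
  //   2. a RANGE frame without ORDER BY gets no default order spec;
  //   3. otherwise the partition columns become an ascending order spec.
  def resolveOrder(w: WindowDef): Seq[SortKey] =
    if (w.orderBy.nonEmpty) w.orderBy
    else if (w.frame.contains(RangeFrame)) Seq.empty
    else w.partitionBy.map(c => SortKey(c))
}
```

   For example, under these assumptions 
`resolveOrder(WindowDef(Seq("a"), Seq.empty, None))` yields `Seq(SortKey("a"))`, 
while the same definition with `Some(RangeFrame)` yields an empty order spec.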




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
