GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21330#discussion_r212800158
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -1883,7 +1883,19 @@ class Analyzer(
           // Second, we group extractedWindowExprBuffer based on their Partition and Order Specs.
           val groupedWindowExpressions = extractedWindowExprBuffer.groupBy { expr =>
             val distinctWindowSpec = expr.collect {
    -          case window: WindowExpression => window.windowSpec
    +          case window: WindowExpression =>
    +            val winExpr = window.windowFunction
    +            val distinctOpt = winExpr.find (expr => expr.isInstanceOf[AggregateExpression]
    +                && expr.asInstanceOf[AggregateExpression].isDistinct)
    +            if (distinctOpt.nonEmpty && window.windowSpec.orderSpec.nonEmpty) {
    +              failAnalysis(s"ORDER BY cannot be used with DISTINCT: $window")
    --- End diff --
    
    Just out of curiosity, does Hive have the same limitation? If so, the current approach (roughly, ordering the rows and checking the previous row for distinct windows) makes sense to me.
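
    For reference, the "ordered rows, check the previous row" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the PR's code or Spark's implementation: `DistinctWindowSketch` and `countDistinctSorted` are hypothetical names, and the sketch assumes a single window partition that fits in memory.

```scala
// Hypothetical sketch: count the distinct values in one window partition by
// sorting the rows and comparing each row with the one before it.
object DistinctWindowSketch {
  def countDistinctSorted[A](rows: Seq[A])(implicit ord: Ordering[A]): Int = {
    val sorted = rows.sorted
    if (sorted.isEmpty) 0
    else {
      // After sorting, equal values are adjacent, so every position where the
      // current row differs from the previous row starts a new distinct value.
      val changes = sorted.zip(sorted.tail).count { case (prev, cur) =>
        !ord.equiv(prev, cur)
      }
      1 + changes
    }
  }
}
```

    Because the rows have to be sorted on the aggregated value for the previous-row comparison to work, a separate user-supplied ORDER BY in the window spec would presumably conflict with it, which may be why the check in the diff rejects that combination.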

