Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19193
  
    For 1), yes, let's forbid it.
    
    For 2), my feeling is that we don't need `having` in the `Dataset` API, 
because it's easy to change the order of the operators: users can call `filter` 
first and then `agg`. In SQL, by contrast, you would need a subquery, so 
`having` is convenient there.
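    To make the comparison concrete, here is a minimal sketch (the DataFrame `df` and columns `a`/`b` are illustrative assumptions, not from the PR). In SQL, filtering on an aggregate needs `HAVING` or a subquery; in the Dataset API the operators can simply be ordered explicitly:

    ```scala
    import org.apache.spark.sql.functions._
    import spark.implicits._  // assumes an existing SparkSession `spark`

    // SQL equivalent:
    //   SELECT a, sum(b) AS s FROM t GROUP BY a HAVING sum(b) > 5
    val result = df
      .groupBy($"a")
      .agg(sum($"b").as("s"))
      .filter($"s" > 5)  // a filter placed after agg plays the role of HAVING
    ```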
    
    For `df.groupBy('a).agg(max('b), rank().over(window)).where(sum('b) === 
5)`, I think it's valid for this to fail, as Spark is not smart enough to 
rewrite the query and make it work. If we can find a way to rewrite and fix 
such queries, we can support them.
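    For reference, one possible manual rewrite a user could do today (a hedged sketch; the column aliases, the window spec, and `df` itself are illustrative assumptions): aggregate and filter first, then apply the window function to the filtered result rather than inside `agg`:

    ```scala
    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions._
    import spark.implicits._  // assumes an existing SparkSession `spark`

    val window = Window.orderBy($"maxB")  // illustrative window spec

    val rewritten = df
      .groupBy($"a")
      .agg(max($"b").as("maxB"), sum($"b").as("sumB"))
      .where($"sumB" === 5)                       // the HAVING-style filter
      .withColumn("rank", rank().over(window))    // window applied afterwards
    ```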

