GitHub user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19056#discussion_r135360650
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala ---
    @@ -65,11 +66,12 @@ object PropagateEmptyRelation extends Rule[LogicalPlan] with PredicateHelper {
           case _: RepartitionByExpression => empty(p)
           // An aggregate with non-empty group expression will return one output row per group when the
           // input to the aggregate is not empty. If the input to the aggregate is empty then all groups
    -      // will be empty and thus the output will be empty.
    +      // will be empty and thus the output will be empty. If we're working on batch data, we can
    +      // then treat the aggregate as redundant.
           //
           // If the grouping expressions are empty, however, then the aggregate will always produce a
           // single output row and thus we cannot propagate the EmptyRelation.
    -      case Aggregate(ge, _, _) if ge.nonEmpty => empty(p)
    +      case Aggregate(ge, _, _) if ge.nonEmpty && !p.isStreaming => empty(p)
    --- End diff --
    
    Can you add to the comment above why we avoid this rewrite when the plan is streaming?
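
For readers following the code comment in the diff, here is a minimal sketch of the batch-side distinction it describes: a grouped aggregate over empty input yields no rows, while an aggregate with no grouping expressions still yields one row. This is a toy model over plain Scala collections, not Spark code; `EmptyAggregateSketch`, `groupedSum`, and `globalCount` are hypothetical names introduced only for illustration, and the sketch says nothing about the streaming case the comment asks about.

```scala
// Toy model, NOT Spark code: rows are (key, value) pairs held in a plain Seq.
// All names here are hypothetical and exist only for this illustration.
object EmptyAggregateSketch {
  // Grouped aggregate: one output row per distinct key in the input.
  def groupedSum(rows: Seq[(String, Int)]): Map[String, Int] =
    rows.groupBy(_._1).map { case (key, group) => key -> group.map(_._2).sum }

  // Global aggregate (no grouping expressions): always exactly one output row.
  def globalCount(rows: Seq[(String, Int)]): Seq[Long] =
    Seq(rows.size.toLong)

  def main(args: Array[String]): Unit = {
    val empty = Seq.empty[(String, Int)]
    // No input rows means no groups, so the grouped result is empty.
    println(groupedSum(empty).isEmpty)
    // The global aggregate still produces a single row holding the count 0.
    println(globalCount(empty))
  }
}
```

In this model, replacing `groupedSum` over an empty input with an empty result is safe, whereas doing the same for `globalCount` would drop the single row it must produce.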

