[GitHub] spark issue #15289: [SPARK-17712][SQL] Fix invalid pushdown of data-independ...

ioana-delaney Fri, 30 Sep 2016 15:53:44 -0700

Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/15289
  
    @JoshRosen In your example, we don't want to first count one million rows 
coming from the base table and then to return zero rows based on the false 
predicate in the outer query block. Instead, by pushing down the predicate to 
the base table, you do a pre-filtering and return zero rows early in the plan. 
Then you apply the aggregate followed by the the original predicate that will 
do the final filtering.  Anyway, just some thought for further optimizations of 
predicates pushed down through aggregation. Also, a more realistic query would 
imply false predicates through predicate transitivity e.g. a != b and a =1 and 
b =1 => 1 != 1 So there might be some real customer queries that can take 
advantage of these optimizations.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] spark issue #15289: [SPARK-17712][SQL] Fix invalid pushdown of data-independ...

Reply via email to