[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

ioana-delaney Fri, 17 Mar 2017 18:53:43 -0700

Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/17286
  
    @wzhfy One way to solve the Cartesian product problem as part of the
join enumeration algorithm is to apply a strategy similar to the one under
discussion for star-plans. You keep track of "connected" tables and "unconnected"
tables. During join enumeration, mixed combinations are pruned until the plan
for the "connected" set of tables has been built. Then we add tables from the
"unconnected" set, maybe only as left-deep trees (i.e. the size of the inner
is one). Also, knowing that a set of tables is connected through join
conditions allows further plan pruning based on the presence of join
predicates. Integration with star-join, and probably other heuristics, would
require introducing some filtering/pruning strategies on top of the search
engine. Just some thoughts...
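
To make the idea concrete, here is a minimal Python sketch of the strategy described above. It is not Spark's implementation. The names `split_tables`, `has_predicate`, and `enumerate_joins` are hypothetical, and so is the representation of plans as nested tuples. It assumes that a table counts as "connected" when it appears in at least one join predicate, and that the connected tables form a single component. There is no cost model: the sketch keeps the first plan found for each table set, where a real optimizer would compare costs.

```python
from itertools import combinations


def split_tables(tables, join_edges):
    """Partition tables into those that appear in a join predicate and the rest."""
    joined = {t for edge in join_edges for t in edge}
    connected = [t for t in tables if t in joined]
    unconnected = [t for t in tables if t not in joined]
    return connected, unconnected


def has_predicate(left, right, join_edges):
    """True if at least one join predicate links the two table sets."""
    return any(
        (a in left and b in right) or (a in right and b in left)
        for a, b in join_edges
    )


def enumerate_joins(tables, join_edges):
    """Return (plan, pruned): a plan as nested tuples and the number of
    candidate splits skipped because no join predicate linked the two sides."""
    connected, unconnected = split_tables(tables, join_edges)

    # Phase 1: enumerate over the connected set only. Unconnected tables
    # never enter this search, so mixed combinations are not generated.
    plans = {frozenset([t]): t for t in connected}
    pruned = 0
    for size in range(2, len(connected) + 1):
        for subset in combinations(connected, size):
            whole = frozenset(subset)
            for k in range(1, size):
                for left_tables in combinations(subset, k):
                    left = frozenset(left_tables)
                    right = whole - left
                    if left not in plans or right not in plans:
                        continue  # one side has no Cartesian-free plan
                    if not has_predicate(left, right, join_edges):
                        pruned += 1  # joining the sides would be a Cartesian product
                        continue
                    # Keep the first plan found (assumption: no cost model).
                    plans.setdefault(whole, ("join", plans[left], plans[right]))

    plan = None
    if connected:
        plan = plans.get(frozenset(connected))
        if plan is None:
            raise ValueError("connected tables form more than one component")

    # Phase 2: add unconnected tables one at a time, left-deep
    # (the inner side of each cross join is a single table).
    for t in unconnected:
        plan = t if plan is None else ("cross", plan, t)
    return plan, pruned


if __name__ == "__main__":
    plan, pruned = enumerate_joins(["a", "b", "c", "x"], [("a", "b"), ("b", "c")])
    print(plan, pruned)
```

In this sketch the only cross joins in the result are the ones that add an unconnected table at the top of the tree. Every join below that point is backed by a predicate.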




