[
https://issues.apache.org/jira/browse/TAJO-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315496#comment-14315496
]
ASF GitHub Bot commented on TAJO-1277:
--------------------------------------
Github user jihoonson commented on a diff in the pull request:
https://github.com/apache/tajo/pull/379#discussion_r24472049
--- Diff: tajo-plan/src/main/java/org/apache/tajo/plan/joinorder/GreedyHeuristicJoinOrderAlgorithm.java ---
@@ -57,17 +54,69 @@ public FoundJoinOrder findBestOrder(LogicalPlan plan, LogicalPlan.QueryBlock blo
     JoinEdge bestPair;
     while (remainRelations.size() > 1) {
+      Set<LogicalNode> checkingRelations = new LinkedHashSet<LogicalNode>();
+
+      for (LogicalNode relation : remainRelations) {
+        Collection <String> relationStrings = PlannerUtil.getRelationLineageWithinQueryBlock(plan, relation);
+        List<JoinEdge> joinEdges = new ArrayList<JoinEdge>();
+        String relationCollection = TUtil.collectionToString(relationStrings, ",");
+        List<JoinEdge> joinEdgesForGiven = joinGraph.getIncomingEdges(relationCollection);
+        if (joinEdgesForGiven != null) {
+          joinEdges.addAll(joinEdgesForGiven);
+        }
+        for (String relationString: relationStrings) {
--- End diff --
Would you mind explaining these lines?
I found that the same join edge is added twice to the ```joinEdgesForGiven``` list.
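
One way such a duplicate could arise is sketched below. This is only a guess at the mechanism, not the PR's actual code: the quoted diff is cut off at the start of the inner loop, so the loop body here is an assumption, and `DuplicateEdgeSketch`, `INCOMING`, `incomingEdges` and `collect` are hypothetical stand-ins rather than Tajo APIs. The idea is that when a relation's lineage holds a single name, the comma-joined string equals that name, so looking up both returns the same edges.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DuplicateEdgeSketch {
  // Hypothetical stand-in for the join graph: relation key -> incoming edge labels.
  static final Map<String, List<String>> INCOMING = new HashMap<>();
  static {
    INCOMING.put("n", Arrays.asList("c->n"));
  }

  static List<String> incomingEdges(String key) {
    return INCOMING.get(key);
  }

  // Same shape as the diff: look up by the joined lineage string first,
  // then by each lineage name (the second lookup is an assumed loop body).
  static void collect(Collection<String> lineage, Collection<String> out) {
    String joined = String.join(",", lineage);
    List<String> forJoined = incomingEdges(joined);
    if (forJoined != null) {
      out.addAll(forJoined);
    }
    for (String name : lineage) {
      List<String> forName = incomingEdges(name);
      if (forName != null) {
        out.addAll(forName);
      }
    }
  }

  public static void main(String[] args) {
    List<String> lineage = Arrays.asList("n"); // single-name lineage: joined string is also "n"
    List<String> asList = new ArrayList<>();
    Set<String> asSet = new LinkedHashSet<>();
    collect(lineage, asList);
    collect(lineage, asSet);
    System.out.println("list: " + asList); // the edge appears twice
    System.out.println("set:  " + asSet);  // the edge appears once
  }
}
```

If this is what happens, collecting into a `LinkedHashSet` (or skipping the per-name lookup when the lineage has one entry) would keep each edge once while preserving insertion order. That relies on the edge type having a suitable `equals`/`hashCode`, which is not checked here.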
> GreedyHeuristicJoinOrderAlgorithm sometimes wrongly assumes associativity of
> joins
> ----------------------------------------------------------------------------------
>
> Key: TAJO-1277
> URL: https://issues.apache.org/jira/browse/TAJO-1277
> Project: Tajo
> Issue Type: Bug
> Reporter: Keuntae Park
> Assignee: Keuntae Park
>
> It looks like GreedyHeuristicJoinOrderAlgorithm always assumes that all joins
> are associative.
> The following query returns an incorrect result:
> {code}
> select * FROM
> customer c
> right outer join nation n on c.c_custkey = n.n_nationkey
> join region r on c.c_custkey = r.r_regionkey;
> {code}
> because GreedyHeuristicJoinOrderAlgorithm changes the join order to
> {code}
> select * FROM
> customer c
> join region r on c.c_custkey = r.r_regionkey
> right outer join nation n on c.c_custkey = n.n_nationkey;
> {code}
> I think getBestPair() should be fixed to avoid wrong join ordering.
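
To make the non-associativity in the issue description concrete, here is a minimal sketch in plain Java (not Tajo code) that evaluates both join orders over tiny in-memory tables. The key values and every name in it (`JoinOrderSketch`, `table`, `innerJoin`, `rightOuterJoin`, and so on) are made up for illustration; only the join shapes and join conditions come from the two queries quoted above.

```java
import java.util.ArrayList;
import java.util.List;

class JoinOrderSketch {
  // A row is {c_custkey, n_nationkey, r_regionkey};
  // null means "column not joined in yet" or outer-join padding.
  static final int C = 0, N = 1, R = 2;

  static List<Integer[]> table(int col, int... keys) {
    List<Integer[]> rows = new ArrayList<>();
    for (int k : keys) {
      Integer[] row = new Integer[3];
      row[col] = k;
      rows.add(row);
    }
    return rows;
  }

  static Integer[] merge(Integer[] a, Integer[] b) {
    Integer[] out = new Integer[3];
    for (int i = 0; i < 3; i++) {
      out[i] = (a[i] != null) ? a[i] : b[i];
    }
    return out;
  }

  // SQL-style equality: a NULL key never matches anything.
  static boolean matches(Integer[] l, int lcol, Integer[] r, int rcol) {
    return l[lcol] != null && l[lcol].equals(r[rcol]);
  }

  static List<Integer[]> innerJoin(List<Integer[]> left, List<Integer[]> right, int lcol, int rcol) {
    List<Integer[]> out = new ArrayList<>();
    for (Integer[] l : left) {
      for (Integer[] r : right) {
        if (matches(l, lcol, r, rcol)) out.add(merge(l, r));
      }
    }
    return out;
  }

  // Keeps every right-side row; unmatched ones stay null-padded on the left.
  static List<Integer[]> rightOuterJoin(List<Integer[]> left, List<Integer[]> right, int lcol, int rcol) {
    List<Integer[]> out = new ArrayList<>();
    for (Integer[] r : right) {
      boolean matched = false;
      for (Integer[] l : left) {
        if (matches(l, lcol, r, rcol)) {
          out.add(merge(l, r));
          matched = true;
        }
      }
      if (!matched) out.add(r.clone());
    }
    return out;
  }

  // (customer RIGHT OUTER JOIN nation) JOIN region, as written in the first query.
  static List<Integer[]> originalOrder() {
    List<Integer[]> c = table(C, 1, 2), n = table(N, 1, 3), r = table(R, 1);
    return innerJoin(rightOuterJoin(c, n, C, N), r, C, R);
  }

  // (customer JOIN region) RIGHT OUTER JOIN nation, as in the reordered query.
  static List<Integer[]> reordered() {
    List<Integer[]> c = table(C, 1, 2), n = table(N, 1, 3), r = table(R, 1);
    return rightOuterJoin(innerJoin(c, r, C, R), n, C, N);
  }

  public static void main(String[] args) {
    System.out.println("original order rows:  " + originalOrder().size());
    System.out.println("reordered plan rows:  " + reordered().size());
  }
}
```

With these sample keys the original order yields one row and the reordered plan yields two. The extra row is the nation with no matching customer: in the original query the later inner join on `c.c_custkey` discards that null-padded row, whereas in the reordered plan the outer join runs last and keeps it.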