Github user sirpkt commented on a diff in the pull request:
https://github.com/apache/tajo/pull/379#discussion_r24553012
--- Diff:
tajo-plan/src/main/java/org/apache/tajo/plan/joinorder/GreedyHeuristicJoinOrderAlgorithm.java
---
@@ -57,17 +54,69 @@ public FoundJoinOrder findBestOrder(LogicalPlan plan, LogicalPlan.QueryBlock blo
 JoinEdge bestPair;
 while (remainRelations.size() > 1) {
+ Set<LogicalNode> checkingRelations = new LinkedHashSet<LogicalNode>();
+
+ for (LogicalNode relation : remainRelations) {
+ Collection <String> relationStrings = PlannerUtil.getRelationLineageWithinQueryBlock(plan, relation);
+ List<JoinEdge> joinEdges = new ArrayList<JoinEdge>();
+ String relationCollection = TUtil.collectionToString(relationStrings, ",");
+ List<JoinEdge> joinEdgesForGiven = joinGraph.getIncomingEdges(relationCollection);
+ if (joinEdgesForGiven != null) {
+ joinEdges.addAll(joinEdgesForGiven);
+ }
+ for (String relationString: relationStrings) {
--- End diff --
Oh, it's my mistake.
When relationStrings has only one entry, this code may add that entry
twice.
When a LogicalNode contains two relations, for example A and B, the
code above first finds the joinEdges whose right relation is "A, B", the
string produced by TUtil.collectionToString().
Next, it finds the joinEdges whose right relation is "A" or "B", via
'for (String relationString: relationStrings)'.
So if a LogicalNode contains just one relation, the combined string and
the single relation name are identical, and the same entry may be added
a second time.
The duplicates do not affect the result, but I'll patch the code so that
they are no longer added.
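
To make the duplication concrete, here is a minimal sketch that does not use
the Tajo classes. The class DuplicateEdgeSketch, its two methods, and the use
of plain strings for edges and a Map for the join graph are all hypothetical
stand-ins for joinGraph.getIncomingEdges() and JoinEdge. The deduplicating
variant is only one possible way to avoid the repeats, not necessarily the
patch that was applied.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the loop in the diff: edges are plain strings and
// the join graph is a map from a relation key to its incoming edges.
class DuplicateEdgeSketch {

  // Mirrors the two lookups in the diff: first the combined key ("A,B"),
  // then each relation name on its own.
  static List<String> collectWithDuplicates(Map<String, List<String>> incomingEdges,
                                            Collection<String> relationNames) {
    List<String> edges = new ArrayList<>();
    String combinedKey = String.join(",", relationNames);
    edges.addAll(incomingEdges.getOrDefault(combinedKey, Collections.<String>emptyList()));
    for (String name : relationNames) {
      // With a single relation, name equals combinedKey, so the same
      // edges are appended a second time here.
      edges.addAll(incomingEdges.getOrDefault(name, Collections.<String>emptyList()));
    }
    return edges;
  }

  // One possible fix (an assumption): pass the result through a
  // LinkedHashSet, which drops repeats and keeps first-insertion order.
  static List<String> collectDeduplicated(Map<String, List<String>> incomingEdges,
                                          Collection<String> relationNames) {
    return new ArrayList<>(
        new LinkedHashSet<>(collectWithDuplicates(incomingEdges, relationNames)));
  }
}
```

With a single relation "A" that has one incoming edge, collectWithDuplicates
returns that edge twice and collectDeduplicated returns it once. An
alternative with the same effect would be to skip the per-name loop when
relationStrings has exactly one entry.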