Github user amansinha100 commented on a diff in the pull request:
https://github.com/apache/drill/pull/794#discussion_r109197541
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
---
@@ -105,6 +103,29 @@
public static final PositiveLongValidator
PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD = new
PositiveLongValidator(PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD_KEY,
Long.MAX_VALUE, 10000);
+ /*
+ Enables rules that re-write query joins in the most optimal way.
+ Though its turned on be default and its value in query optimization is undeniable, user may want turn off such
+ optimization to leave join order indicated in sql query unchanged.
+
+ For example:
+ Currently only nested loop join allows non-equi join conditions usage.
+ During planning stage nested loop join will be chosen when non-equi join is detected
+ and {@link #NLJOIN_FOR_SCALAR} set to false. Though query performance may not be the most optimal in such case,
+ user may use such workaround to execute queries with non-equi joins.
+
+ Nested loop join allows only INNER and LEFT join usage and implies that right input is smaller that left input.
+ During LEFT join when join optimization is enabled and detected that right input is larger that left,
+ join will be optimized: left and right inputs will be flipped and LEFT join type will be changed to RIGHT one.
+ If query contains non-equi joins, after such optimization it will fail, since nested loop does not allow
+ RIGHT join. In this case if user accepts probability of non optimal performance, he may turn off join optimization.
+ Turning off join optimization, makes sense only if user are not sure that right output is less or equal to left,
+ otherwise join optimization can be left turned on.
+
+ Note: once hash and merge joins will allow non-equi join conditions,
+ the need to turn off join optimization may go away.
+ */
+ public static final BooleanValidator JOIN_OPTIMIZATION = new BooleanValidator("planner.enable_join_optimization", true);
--- End diff --
Ah, you added this option to enable/disable the *logical* join rules.
Since NestedLoopJoin is a physical join implementation, I read the comments
as saying that the option was meant to control swapping of the left and right
inputs of the (physical) NL join, which is why I mentioned the hashjoin_swap
option.
It seems to me that if there is a LEFT OUTER JOIN and the condition is a
non-equality, then we should not allow changing it to a RIGHT OUTER JOIN by
flipping the left and right sides, since that would make the query fail. What
do you think?
I suppose we could keep your boolean option for this PR and address the
left outer join issue separately.
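
To make the suggested guard concrete, here is a minimal sketch of the check in
plain Java. All names here (`JoinSwapGuard`, `JoinSpec`, `canSwap`,
`maybeSwap`) are hypothetical and chosen for illustration; this is not Drill's
or Calcite's planner API, and the row counts stand in for whatever cost
estimate the planner actually uses. It assumes, as the comment in the diff
states, that non-equi joins can only run as nested loop joins and that nested
loop join supports only INNER and LEFT.

```java
final class JoinSwapGuard {
    enum JoinType { INNER, LEFT, RIGHT, FULL }

    /** Hypothetical stand-in for a planner join node (not a Drill class). */
    static final class JoinSpec {
        final JoinType type;
        final boolean equiCondition; // true if the join condition is an equality
        final double leftRows;       // estimated row count of the left input
        final double rightRows;      // estimated row count of the right input

        JoinSpec(JoinType type, boolean equiCondition, double leftRows, double rightRows) {
            this.type = type;
            this.equiCondition = equiCondition;
            this.leftRows = leftRows;
            this.rightRows = rightRows;
        }
    }

    /** Join type that results from exchanging the left and right inputs. */
    static JoinType flip(JoinType t) {
        switch (t) {
            case LEFT:  return JoinType.RIGHT;
            case RIGHT: return JoinType.LEFT;
            default:    return t; // INNER and FULL are symmetric
        }
    }

    /** Assumption taken from the diff comment: NL join handles only INNER and LEFT. */
    static boolean nestedLoopSupports(JoinType t) {
        return t == JoinType.INNER || t == JoinType.LEFT;
    }

    /**
     * A swap is allowed for equi-joins (other join implementations can run them),
     * or when the flipped join type is still one that nested loop join supports.
     */
    static boolean canSwap(JoinSpec j) {
        return j.equiCondition || nestedLoopSupports(flip(j.type));
    }

    /** Swaps inputs only when the right side is larger and the swap is allowed. */
    static JoinSpec maybeSwap(JoinSpec j) {
        if (j.rightRows <= j.leftRows || !canSwap(j)) {
            return j;
        }
        return new JoinSpec(flip(j.type), j.equiCondition, j.rightRows, j.leftRows);
    }

    public static void main(String[] args) {
        JoinSpec nonEquiLeft = new JoinSpec(JoinType.LEFT, false, 10, 1000);
        JoinSpec equiLeft = new JoinSpec(JoinType.LEFT, true, 10, 1000);
        System.out.println(maybeSwap(nonEquiLeft).type); // prints LEFT (swap refused)
        System.out.println(maybeSwap(equiLeft).type);    // prints RIGHT (swap applied)
    }
}
```

With a guard of this kind, a LEFT join with a non-equality condition keeps its
original input order even when the right side is larger, so it would run
(possibly slowly) instead of failing.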