Github user amansinha100 commented on a diff in the pull request:
https://github.com/apache/drill/pull/794#discussion_r107960020
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
---
@@ -105,6 +103,29 @@
public static final PositiveLongValidator PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD = new PositiveLongValidator(PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD_KEY,
    Long.MAX_VALUE, 10000);
+ /*
+ Enables rules that rewrite query joins in the most optimal way.
+ Though it is turned on by default and its value in query optimization is undeniable, the user may want to turn off such
+ optimization to leave the join order indicated in the SQL query unchanged.
+
+ For example:
+ Currently only nested loop join allows non-equi join conditions.
+ During the planning stage, nested loop join will be chosen when a non-equi join is detected
+ and {@link #NLJOIN_FOR_SCALAR} is set to false. Though query performance may not be optimal in such a case,
+ the user may use this workaround to execute queries with non-equi joins.
+
+ Nested loop join allows only INNER and LEFT joins and implies that the right input is smaller than the left input.
+ During a LEFT join, when join optimization is enabled and the right input is detected to be larger than the left,
+ the join will be optimized: the left and right inputs will be flipped and the LEFT join type will be changed to RIGHT.
+ If the query contains non-equi joins, it will fail after such optimization, since nested loop join does not allow
+ RIGHT joins. In this case, if the user accepts the possibility of non-optimal performance, they may turn off join optimization.
+ Turning off join optimization makes sense only if the user is not sure that the right input is smaller than or equal to the left;
+ otherwise join optimization can be left turned on.
+
+ Note: once hash and merge joins allow non-equi join conditions,
+ the need to turn off join optimization may go away.
+ */
+ public static final BooleanValidator JOIN_OPTIMIZATION = new BooleanValidator("planner.enable_join_optimization", true);
--- End diff --
The name 'join_optimization' is misleading, since that term is quite broad
and covers both the ordering of multiple joins and the choice of left vs. right
inputs of a single join. Here we are talking about the latter only, and for a
specific join type. For HashJoins, we have a planner option
'enable_hashjoin_swap'. It sounds to me like your new option targets the same
thing for nested loop joins. It would be better to call it 'enable_nljoin_swap'.
Does that convey the intent you want, or is there something more?
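To make the narrower scope concrete, here is a minimal sketch of the input swap described in the diff comment. It is only an illustration under stated assumptions: the class, enum, and method names are hypothetical and are not Drill planner classes, and the row counts stand in for whatever estimates the planner actually uses.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; none of these names exist in Drill. */
final class NlJoinSwapSketch {

  enum JoinType { INNER, LEFT, RIGHT }

  // Per the diff comment: nested loop join handles only INNER and LEFT joins.
  static final Set<JoinType> NLJ_SUPPORTED = EnumSet.of(JoinType.INNER, JoinType.LEFT);

  /**
   * Returns the join type after the optional input swap.
   * The swap happens only when it is enabled and the right input is
   * estimated to be larger than the left (assumed row-count estimates).
   */
  static JoinType typeAfterSwap(JoinType type, long leftRows, long rightRows, boolean swapEnabled) {
    boolean swap = swapEnabled && rightRows > leftRows;
    if (!swap) {
      return type;
    }
    switch (type) {
      case LEFT:
        return JoinType.RIGHT; // flipping inputs turns LEFT into RIGHT
      case RIGHT:
        return JoinType.LEFT;
      default:
        return type; // INNER is symmetric
    }
  }

  /** True if a nested loop join could still run the join after the optional swap. */
  static boolean nestedLoopCanExecute(JoinType type, long leftRows, long rightRows, boolean swapEnabled) {
    return NLJ_SUPPORTED.contains(typeAfterSwap(type, leftRows, rightRows, swapEnabled));
  }
}
```

With the swap enabled, a LEFT non-equi join whose right input is larger becomes a RIGHT join that nested loop join cannot execute; with it disabled, the LEFT join is kept. Nothing here touches the ordering of multiple joins, which is why a name like 'enable_nljoin_swap' seems to describe the behavior more precisely.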