Github user arina-ielchiieva commented on a diff in the pull request:
https://github.com/apache/drill/pull/794#discussion_r108035986
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
---
@@ -105,6 +103,29 @@
 public static final PositiveLongValidator PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD = new PositiveLongValidator(PARQUET_ROWGROUP_FILTER_PUSHDOWN_PLANNING_THRESHOLD_KEY,
 Long.MAX_VALUE, 10000);
+ /*
+ Enables rules that re-write query joins in the most optimal way.
+ Though it is turned on by default and its value in query optimization is undeniable, the user may want to turn off such
+ optimization to leave the join order indicated in the SQL query unchanged.
+
+ For example:
+ Currently only nested loop join allows non-equi join conditions.
+ During the planning stage, nested loop join will be chosen when a non-equi join is detected
+ and {@link #NLJOIN_FOR_SCALAR} is set to false. Though query performance may not be optimal in such a case,
+ the user may use this workaround to execute queries with non-equi joins.
+
+ Nested loop join allows only INNER and LEFT joins and implies that the right input is smaller than the left input.
+ During a LEFT join, when join optimization is enabled and the right input is detected to be larger than the left,
+ the join will be optimized: left and right inputs will be flipped and the LEFT join type will be changed to RIGHT.
+ If the query contains non-equi joins, it will fail after such optimization, since nested loop join does not allow
+ RIGHT joins. In this case, if the user accepts the possibility of non-optimal performance, they may turn off join optimization.
+ Turning off join optimization makes sense only if the user is not sure that the right input is less than or equal to the left;
+ otherwise join optimization can be left turned on.
+
+ Note: once hash and merge joins allow non-equi join conditions,
+ the need to turn off join optimization may go away.
+ */
+ public static final BooleanValidator JOIN_OPTIMIZATION = new BooleanValidator("planner.enable_join_optimization", true);
--- End diff --
JOIN_OPTIMIZATION enables two rules, `DRILL_JOIN_TO_MULTIJOIN_RULE` and
`DRILL_LOPT_OPTIMIZE_JOIN_RULE`, which apply to all join types. That is why the
name is quite broad: I believe these two rules are responsible not only for
join swap but for the other join optimization techniques as well. In our use
case, the user may want to disable them when they don't want join swap to be
performed, but there may be other reasons too. Though, as I have noted, once we
implement non-equality conditions for hash and merge joins, we may remove this
configuration parameter.
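
To make the failure mode from the Javadoc concrete, here is a minimal toy
model of the join swap. It is not Drill planner code: the class
`JoinSwapSketch`, its methods, and the row counts are hypothetical and exist
only for illustration. The only facts taken from the diff are that nested loop
join accepts INNER and LEFT joins, and that the swap turns a LEFT join into a
RIGHT join when the right input is larger.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model only; none of these names exist in Drill.
class JoinSwapSketch {

    enum JoinType { INNER, LEFT, RIGHT }

    // Join types nested loop join accepts, per the Javadoc in the diff.
    static final Set<JoinType> NLJ_SUPPORTED = EnumSet.of(JoinType.INNER, JoinType.LEFT);

    // Stand-in for the swap: with optimization enabled and a larger right
    // input, the inputs are flipped, so a LEFT join becomes a RIGHT join.
    // An INNER join stays INNER after the flip.
    static JoinType planJoinType(JoinType requested, long leftRows, long rightRows,
                                 boolean joinOptimization) {
        if (joinOptimization && rightRows > leftRows && requested == JoinType.LEFT) {
            return JoinType.RIGHT;
        }
        return requested;
    }

    // A non-equi join can only run as a nested loop join, so it is
    // executable only if the planned join type is one NLJ supports.
    static boolean canExecuteNonEqui(JoinType requested, long leftRows, long rightRows,
                                     boolean joinOptimization) {
        return NLJ_SUPPORTED.contains(
            planJoinType(requested, leftRows, rightRows, joinOptimization));
    }

    public static void main(String[] args) {
        // LEFT join, right input larger, optimization on: swapped to RIGHT, fails.
        System.out.println(canExecuteNonEqui(JoinType.LEFT, 100, 10_000, true));  // false
        // Same query with optimization off: stays LEFT, runs.
        System.out.println(canExecuteNonEqui(JoinType.LEFT, 100, 10_000, false)); // true
    }
}
```

Assuming the new option is exposed like other planner options, the second case
would correspond to running
ALTER SESSION SET `planner.enable_join_optimization` = false before the query.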