gianm commented on a change in pull request #9773:
URL: https://github.com/apache/druid/pull/9773#discussion_r421051659
##########
File path: sql/src/test/java/org/apache/druid/sql/calcite/CalciteQueryTest.java
##########
@@ -8346,6 +8346,166 @@ public void testLeftJoinTwoLookupsUsingJoinOperator()
throws Exception
);
}
+ @Test
+ public void testInnerJoinTableLookupLookupWithFilterWithOuterLimit() throws
Exception
+ {
+ testQuery(
+ "SELECT dim1\n"
+ + "FROM foo\n"
+ + "INNER JOIN lookup.lookyloo l ON foo.dim2 = l.k\n"
+ + "INNER JOIN lookup.lookyloo l2 ON foo.dim2 = l2.k\n"
+ + "WHERE l.v = 'xa'\n"
+ + "LIMIT 100\n",
+ ImmutableList.of(
+ newScanQueryBuilder()
+ .dataSource(
+ join(
+ join(
+ new TableDataSource(CalciteTests.DATASOURCE1),
+ new LookupDataSource("lookyloo"),
+ "j0.",
+
equalsCondition(DruidExpression.fromColumn("dim2"),
DruidExpression.fromColumn("j0.k")),
+ JoinType.INNER
+ ),
+ new LookupDataSource("lookyloo"),
+ "_j0.",
+ equalsCondition(DruidExpression.fromColumn("dim2"),
DruidExpression.fromColumn("_j0.k")),
+ JoinType.INNER
+ )
+ )
+ .intervals(querySegmentSpec(Filtration.eternity()))
+ .limit(100)
+ .filters(selector("j0.v", "xa", null))
+ .columns("dim1")
+ .context(QUERY_CONTEXT_DEFAULT)
+ .build()
+ ),
+ ImmutableList.of(
+ new Object[]{""},
+ new Object[]{"1"}
+ )
+ );
+ }
+
+ @Test
+ public void testInnerJoinTableLookupLookupWithFilterWithoutLimit() throws
Exception
+ {
+ testQuery(
Review comment:
Could you also add a super-long version of this (just in case)? Such as:
```java
"SELECT dim1\n"
+ "FROM foo\n"
+ "INNER JOIN lookup.lookyloo l ON foo.dim2 = l.k\n"
+ "INNER JOIN lookup.lookyloo l2 ON foo.dim2 = l2.k\n"
+ "INNER JOIN lookup.lookyloo l3 ON foo.dim2 = l3.k\n"
+ "INNER JOIN lookup.lookyloo l4 ON foo.dim2 = l4.k\n"
+ "INNER JOIN lookup.lookyloo l5 ON foo.dim2 = l5.k\n"
+ "INNER JOIN lookup.lookyloo l6 ON foo.dim2 = l6.k\n"
+ "INNER JOIN lookup.lookyloo l7 ON foo.dim2 = l7.k\n"
+ "INNER JOIN lookup.lookyloo l8 ON foo.dim2 = l8.k\n"
+ "INNER JOIN lookup.lookyloo l9 ON foo.dim2 = l9.k\n"
+ "INNER JOIN lookup.lookyloo l10 ON foo.dim2 = l10.k\n"
+ "INNER JOIN lookup.lookyloo l11 ON foo.dim2 = l11.k\n"
+ "INNER JOIN lookup.lookyloo l12 ON foo.dim2 = l12.k\n"
+ "INNER JOIN lookup.lookyloo l13 ON foo.dim2 = l13.k\n"
+ "INNER JOIN lookup.lookyloo l14 ON foo.dim2 = l14.k\n"
+ "INNER JOIN lookup.lookyloo l15 ON foo.dim2 = l15.k\n"
+ "INNER JOIN lookup.lookyloo l16 ON foo.dim2 = l16.k\n"
+ "INNER JOIN lookup.lookyloo l17 ON foo.dim2 = l17.k\n"
+ "INNER JOIN lookup.lookyloo l18 ON foo.dim2 = l18.k\n"
+ "INNER JOIN lookup.lookyloo l19 ON foo.dim2 = l19.k\n"
+ "WHERE l.v = 'xa'\n",
```
The reason I'm suggesting this is we want to make sure the approach we're
using doesn't change behavior as the stack gets deeper.
##########
File path: sql/src/main/java/org/apache/druid/sql/calcite/rel/DruidQueryRel.java
##########
@@ -183,6 +183,10 @@ public RelWriter explainTerms(final RelWriter pw)
@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, final
RelMetadataQuery mq)
{
- return planner.getCostFactory().makeCost(partialQuery.estimateCost(), 0,
0);
+ double cost = partialQuery.estimateCost();
+ double multiplier = DruidRels.isScanOrMapping(this, true)
Review comment:
Could you add a comment about why this is being done here?
##########
File path:
sql/src/main/java/org/apache/druid/sql/calcite/rel/DruidJoinQueryRel.java
##########
@@ -293,23 +293,17 @@ protected RelDataType deriveRowType()
@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, final
RelMetadataQuery mq)
{
- double cost;
-
- if (computeLeftRequiresSubquery(getSomeDruidChild(left))) {
- cost = CostEstimates.COST_JOIN_SUBQUERY;
- } else {
- cost = partialQuery.estimateCost();
- }
-
- if (computeRightRequiresSubquery(getSomeDruidChild(right))) {
- cost += CostEstimates.COST_JOIN_SUBQUERY;
- }
-
+ double cost = partialQuery.estimateCost();
if (joinRel.getCondition().isA(SqlKind.LITERAL) &&
!joinRel.getCondition().isAlwaysFalse()) {
cost += CostEstimates.COST_JOIN_CROSS;
}
-
- return planner.getCostFactory().makeCost(cost, 0, 0);
+ // This is to cancel out the MULTIPLIER_FILTER (value=0.1) from
partialQuery.estimateCost() to discourage
+ // filter push down if pushing down the filter makes this
DruidJoinQueryRel not a scan or mapping.
+ // This will leave the filter at the topmost DruidJoinQueryRel (due to the
order of applying/popping rules).
Review comment:
This "due to the order of applying/popping rules" sounds a bit sketchy.
Does it mean that a good plan and bad plan will have equal cost, and we'll pick
the good plan due to order-based tiebreaking? If so, I'm not sure it would be
safe to rely on that. Ideally the good plans should be cheaper than the bad
plans, which ensures we pick them.
##########
File path:
sql/src/main/java/org/apache/druid/sql/calcite/rel/DruidJoinQueryRel.java
##########
@@ -293,23 +293,17 @@ protected RelDataType deriveRowType()
@Override
public RelOptCost computeSelfCost(final RelOptPlanner planner, final
RelMetadataQuery mq)
{
- double cost;
-
- if (computeLeftRequiresSubquery(getSomeDruidChild(left))) {
- cost = CostEstimates.COST_JOIN_SUBQUERY;
- } else {
- cost = partialQuery.estimateCost();
- }
-
- if (computeRightRequiresSubquery(getSomeDruidChild(right))) {
- cost += CostEstimates.COST_JOIN_SUBQUERY;
- }
-
+ double cost = partialQuery.estimateCost();
if (joinRel.getCondition().isA(SqlKind.LITERAL) &&
!joinRel.getCondition().isAlwaysFalse()) {
cost += CostEstimates.COST_JOIN_CROSS;
}
-
- return planner.getCostFactory().makeCost(cost, 0, 0);
+ // This is to cancel out the MULTIPLIER_FILTER (value=0.1) from
partialQuery.estimateCost() to discourage
+ // filter push down if pushing down the filter makes this
DruidJoinQueryRel not a scan or mapping.
+ // This will leave the filter at the topmost DruidJoinQueryRel (due to the
order of applying/popping rules).
+ double multiplier = DruidRels.isScanOrMapping(this, true)
+ ? 1
+ : 1 / CostEstimates.MULTIPLIER_FILTER;
Review comment:
It seems like this happens even if there is no filter. Is that right?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]