Well, it's not exactly an infinite loop; let me explain.

First of all, this is the loop (I'm using Drill v1.9, which uses Calcite v1.4): https://github.com/apache/calcite/blob/branch-1.4/core/src/main/java/org/apache/calcite/plan/hep/HepPlanner.java#L389
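For intuition, here is a self-contained toy sketch (my own code, not Calcite's) of the kind of fixpoint loop HepPlanner runs: rules are re-applied until a full pass changes nothing, so a rule whose output can still be transformed again makes the match counter grow on every pass instead of terminating.

```java
import java.util.function.UnaryOperator;

// Toy fixpoint driver mimicking a rule-based planner loop (hypothetical,
// not Calcite code): it re-applies a rule until a pass changes nothing.
public class FixpointDemo {

    // Applies `rule` until it returns its input unchanged, or gives up
    // after maxPasses. Returns the number of rule firings.
    static int planUntilFixpoint(String plan, UnaryOperator<String> rule, int maxPasses) {
        int nMatches = 0;
        for (int pass = 0; pass < maxPasses; pass++) {
            String next = rule.apply(plan);
            if (next.equals(plan)) {
                return nMatches; // converged: the rule no longer fires
            }
            plan = next;
            nMatches++;
        }
        return nMatches; // never converged within maxPasses
    }

    public static void main(String[] args) {
        // A "good" rule rewrites Sort(Scan) to Sort(ScanWithLimit) once;
        // its output no longer matches, so the loop stops after one firing.
        int good = planUntilFixpoint("Sort(Scan)",
            p -> p.replace("Sort(Scan)", "Sort(ScanWithLimit)"), 1000);

        // A "bad" rule whose output can always be transformed again fires
        // on every pass and never converges.
        int bad = planUntilFixpoint("Sort(Scan)",
            p -> "Wrap(" + p + ")", 1000);

        System.out.println(good + " " + bad); // prints "1 1000"
    }
}
```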
The nMatches variable keeps increasing without breaking the loop. In theory it should eventually terminate, but I can't accept that as a solution because it would take minutes just to plan a query! There must be a more efficient way to break the loop.

I *assume* that since Calcite v1.4 doesn't support unparsing OFFSET and FETCH clauses, Drill applies this pagination using a *DrillLimitRel* node. But this implies pulling the whole data set from the JDBC source and then filtering it within Drill, which is a huge waste for large datasets if you ask me. Please correct me if I'm wrong.

*My query:*

SELECT CT_ID FROM gelbana.SLS.CTS LIMIT 3

This is planned as *LogicalSort* -> *LogicalProject* -> *JdbcTableScan*. The *LogicalSort* is then converted into a Drill-specific node, *DrillLimitRel*. *LogicalSort* is the node that holds the fetch, offset and sorting information.

I'm trying to push down the fetch value (only if it's a literal and there is no offset specified) to a custom scan node. Then I can pass the fetch value to the JDBC statement <https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setMaxRows-int-> and achieve the limit I need.

I'm trying to do so by wrapping the *JdbcTableScan*, or another custom JDBC scan node (i.e. *GelbanaJdbcJoin*), within a new kind of *JdbcRel* implementation. This implementation exposes the same methods a *JdbcRel* would, and most of them simply call the equivalent method on the wrapped *JdbcRel* node. The code is below.

What happens is that the previously mentioned loop keeps going on and on without breaking. I'd appreciate it if someone could tell me where I messed up.
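For reference, the JDBC side of the pushdown I have in mind looks like this. It's a hypothetical helper (the names are mine, not Drill's or Calcite's) showing the condition under which I'd call Statement.setMaxRows: only when the fetch is a known value and no offset was specified, since setMaxRows cannot express an offset.

```java
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper (my names, not a Drill/Calcite API): applies a
// pushed-down fetch to a JDBC statement, so the driver stops after
// `fetch` rows instead of streaming the whole table into Drill.
public class LimitPushdownHelper {

    // Pushdown is only safe when fetch is known and there is no offset.
    static boolean canPushDown(Integer fetch, Integer offset) {
        return fetch != null && offset == null;
    }

    static void applyLimit(Statement stmt, Integer fetch, Integer offset) throws SQLException {
        if (canPushDown(fetch, offset)) {
            stmt.setMaxRows(fetch); // driver truncates the result set at `fetch` rows
        }
    }
}
```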
*My rule:*

public class GelbanaLimitRule extends RelOptRule {

    public GelbanaLimitRule() {
        super(operand(LogicalSort.class,
                operand(LogicalProject.class,
                    operand(JdbcRel.class, any()))),
            "GelbanaPushdownLimit");
    }

    @Override
    public boolean matches(RelOptRuleCall call) {
        LogicalSort limit = (LogicalSort) call.rels[0];
        RelNode input = call.rels[2];
        boolean jdbcInputCheck = input.getClass() == JdbcTableScan.class
            || input.getClass() == GelbanaJdbcJoin.class;
        return jdbcInputCheck
            && limit.fetch != null
            && limit.fetch.getClass() == RexLiteral.class
            && limit.offset == null;
    }

    @Override
    public void onMatch(RelOptRuleCall call) {
        LogicalSort limit = (LogicalSort) call.rels[0];
        LogicalProject project = (LogicalProject) call.rels[1];
        JdbcRel input = (JdbcRel) call.rels[2];
        BigDecimal limitValue = (BigDecimal) ((RexLiteral) limit.fetch).getValue();
        GelbanaScanWithLimit newInput = new GelbanaScanWithLimit(input, limitValue.intValue());
        LogicalProject newProject = project.copy(project.getTraitSet(), newInput,
            project.getProjects(), project.getRowType());
        Sort newLimit = limit.copy(limit.getTraitSet(), newProject, limit.getCollation());
        call.transformTo(newLimit);
    }
}

This is a portion of the code of the *GelbanaScanWithLimit* node:

public class GelbanaScanWithLimit implements JdbcRel {

    private JdbcRel input;
    private Integer limit;

    public GelbanaScanWithLimit(JdbcRel input, Integer limit) {
        this.input = input;
        this.limit = limit;
    }

    public boolean hasLimit() {
        return this.limit != null;
    }

    public int getLimit() {
        assert hasLimit();
        return this.limit;
    }

    public RelOptCost computeSelfCost(RelOptPlanner planner) {
        // Zero cost: it's the best case for me to push down JdbcTableScans
        // and joins (with limits, of course) to my datasource.
        return planner.getCostFactory().makeZeroCost();
    }

    // More methods exist to expose the wrapped JdbcRel node. The methods
    // above override the equivalent ones of the wrapped JdbcRel node.
}

Thanks!

*---------------------*
*Muhammad Gelbana*
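P.S. One suspicion I have, purely an assumption on my part: HepPlanner recognizes that it has converged by noticing that a rule's output is equivalent to a node it has already seen, and (as far as I understand) that equivalence check is based on each RelNode's digest. If my hand-written wrapper node doesn't produce a stable digest reflecting its state (the wrapped input and the limit value), every firing might look like progress to the planner. A self-contained toy of that idea, not Calcite code:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntFunction;

// Toy illustration (not Calcite code; an assumption about the mechanism):
// a planner that dedupes nodes by digest only converges if equivalent
// nodes produce equal digest strings.
public class DigestDemo {

    // Counts how many "new" nodes get registered before a duplicate digest
    // is seen (or gives up after maxSteps).
    static int registerUntilDuplicate(IntFunction<String> digestOf, int maxSteps) {
        Set<String> seen = new HashSet<>();
        for (int step = 0; step < maxSteps; step++) {
            if (!seen.add(digestOf.apply(step))) {
                return step; // duplicate digest: equivalence detected, loop stops
            }
        }
        return maxSteps; // every node looked new; the loop never converges
    }

    public static void main(String[] args) {
        // Stable digest: the same logical node always yields the same string,
        // so the second registration is recognized as a duplicate.
        int stable = registerUntilDuplicate(step -> "ScanWithLimit(limit=3)", 1000);

        // Unstable digest (e.g. falling back to object identity): equivalent
        // nodes never collide, so registration goes on forever.
        int unstable = registerUntilDuplicate(step -> "ScanWithLimit@" + step, 1000);

        System.out.println(stable + " " + unstable); // prints "1 1000"
    }
}
```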
