Re: Wrapping an input JdbcRel forces the planner into an infinite loop

Julian Hyde Fri, 30 Jun 2017 09:36:37 -0700

The 4-argument Sort.copy retains the old offset and fetch fields. So you are 
producing another LogicalSort that has a not-null fetch.
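
To illustrate the point, here is a minimal, self-contained sketch. It is a toy model, not Calcite code: ToySort, copyKeepingFetch, copyWithFetch and countFirings are hypothetical names, and the toy rule deliberately matches on a non-null fetch only (the input-class guard from the quoted rule is left out). It only shows why a rewrite that hands back a node with the old fetch never stops being a candidate, while a rewrite that clears fetch fires once.

```java
public class FetchCopySketch {

    /** Hypothetical stand-in for a Sort node; only the limit-related state is modeled. */
    static final class ToySort {
        final Integer fetch;      // null means "no LIMIT on this node"
        final Integer scanLimit;  // limit already pushed down to the scan, or null

        ToySort(Integer fetch, Integer scanLimit) {
            this.fetch = fetch;
            this.scanLimit = scanLimit;
        }

        /** Models a copy that silently keeps the old fetch. */
        ToySort copyKeepingFetch(Integer newScanLimit) {
            return new ToySort(fetch, newScanLimit);
        }

        /** Models a copy that takes fetch explicitly, so the caller can clear it. */
        ToySort copyWithFetch(Integer newFetch, Integer newScanLimit) {
            return new ToySort(newFetch, newScanLimit);
        }
    }

    /** The toy rule matches any node that still carries a fetch. */
    static boolean matches(ToySort sort) {
        return sort.fetch != null;
    }

    /** Fires the toy rule until it stops matching, giving up after maxFirings. */
    static int countFirings(ToySort start, boolean clearFetch, int maxFirings) {
        ToySort node = start;
        int firings = 0;
        while (matches(node) && firings < maxFirings) {
            node = clearFetch
                    ? node.copyWithFetch(null, node.fetch)   // limit now lives on the scan only
                    : node.copyKeepingFetch(node.fetch);     // fetch survives the rewrite
            firings++;
        }
        return firings;
    }

    public static void main(String[] args) {
        ToySort limit3 = new ToySort(3, null);
        System.out.println(countFirings(limit3, false, 1000)); // prints 1000 (hits the cap)
        System.out.println(countFirings(limit3, true, 1000));  // prints 1
    }
}
```

In Calcite, the analogous change would presumably be to call the Sort.copy overload that takes offset and fetch explicitly, or to transform to the new project directly when the sort has no collation left. I have not checked either option against Calcite v1.4, so treat that as an assumption.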



> On Jun 30, 2017, at 9:33 AM, Atri Sharma <[email protected]> wrote:
> 
> Did you try attaching a debugger to see where the code is hanging?
> 
> My guess is that the code flow is hanging in applyRules in HepPlanner.
> The iterator is not advancing over the plan and hence is stuck in an
> infinite loop.
> 
> This is a known bug in HepPlanner. I will create a JIRA case for this.
> 
> Please add your case there.
> 
> Regards,
> 
> Atri
> 
> On Fri, Jun 30, 2017 at 9:06 PM, Muhammad Gelbana <[email protected]> wrote:
>> Well, it's not exactly an infinite loop; let me explain.
>> 
>> First of all, this is the loop (I'm using Drill v1.9, which uses Calcite
>> v1.4)
>> https://github.com/apache/calcite/blob/branch-1.4/core/src/main/java/org/apache/calcite/plan/hep/HepPlanner.java#L389
>> 
>> The nMatches variable keeps increasing without breaking the loop.
>> Theoretically it should eventually break the loop, but I can't accept this
>> as a solution because it would take minutes just to plan a query! There
>> must be a more efficient way to break the loop.
>> 
>> I *assume* that, since Calcite v1.4 doesn't support unparsing OFFSET and
>> FETCH clauses, Drill tries to apply this pagination using a *DrillLimitRel*
>> node. But this implies pulling the whole set of data from the JDBC source
>> and then filtering it within Drill, which is a huge waste for large
>> datasets, if you ask me. Please correct me if I'm wrong.
>> 
>> *My query:* SELECT CT_ID FROM gelbana.SLS.CTS LIMIT 3
>> 
>> Which is planned as a
>> *LogicalSort* -> *LogicalProject* -> *JdbcTableScan*
>> 
>> *LogicalSort* is then converted into a Drill-specific node,
>> *DrillLimitRel*.
>> 
>> *LogicalSort* is the node that holds the fetch, offset and sorting
>> information. I'm trying to push down the fetch value (only if it's a literal
>> and there is no offset specified) to a custom scan node. Then I can pass
>> the fetch value to the JDBC statement
>> <https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setMaxRows-int->
>> and achieve the limit I need.
>> 
>> I'm trying to do so by wrapping the *JdbcTableScan*, or another custom JDBC
>> scan node (i.e. *GelbanaJdbcJoin*), within a new kind of JdbcRel
>> implementation. This implementation exposes the same methods a *JdbcRel*
>> would, and the implementation of most of these methods just calls the
>> equivalent method of the wrapped JdbcRel node. The code is below.
>> 
>> What happens is that the previously mentioned loop keeps going on and on
>> without breaking. I'd appreciate it if someone could tell me where I went
>> wrong.
>> 
>> *My rule*
>> public class GelbanaLimitRule extends RelOptRule {
>> 
>>    public GelbanaLimitRule() {
>>        super(operand(LogicalSort.class, operand(LogicalProject.class,
>>                operand(JdbcRel.class, any()))), "GelbanaPushdownLimit");
>>    }
>> 
>>    @Override
>>    public boolean matches(RelOptRuleCall call) {
>>        LogicalSort limit = (LogicalSort) call.rels[0];
>>        RelNode input = call.rels[2];
>> 
>>        boolean jdbcInputCheck = input.getClass() == JdbcTableScan.class
>>                || input.getClass() == GelbanaJdbcJoin.class;
>>        return jdbcInputCheck && limit.fetch != null
>>                && limit.fetch.getClass() == RexLiteral.class
>>                && limit.offset == null;
>>    }
>> 
>>    @Override
>>    public void onMatch(RelOptRuleCall call) {
>>        LogicalSort limit = (LogicalSort) call.rels[0];
>>        LogicalProject project = (LogicalProject) call.rels[1];
>>        JdbcRel input = (JdbcRel) call.rels[2];
>> 
>>        BigDecimal limitValue =
>>                (BigDecimal) ((RexLiteral) limit.fetch).getValue();
>> 
>>        GelbanaScanWithLimit newInput =
>>                new GelbanaScanWithLimit(input, limitValue.intValue());
>>        LogicalProject newProject = project.copy(project.getTraitSet(),
>>                newInput, project.getProjects(), project.getRowType());
>>        Sort newLimit = limit.copy(limit.getTraitSet(), newProject,
>>                limit.getCollation());
>> 
>>        call.transformTo(newLimit);
>>    }
>> }
>> 
>> This is a portion of the code of the *GelbanaScanWithLimit* node.
>> 
>> public class GelbanaScanWithLimit implements JdbcRel {
>>    private JdbcRel input;
>>    private Integer limit;
>> 
>>    public GelbanaScanWithLimit(JdbcRel input, Integer limit) {
>>        this.input = input;
>>        this.limit = limit;
>>    }
>> 
>>    public boolean hasLimit() {
>>        return this.limit != null;
>>    }
>> 
>>    public int getLimit() {
>>        assert hasLimit();
>>        return this.limit;
>>    }
>> 
>>    public RelOptCost computeSelfCost(RelOptPlanner planner) {
>>        // Zero cost is the best case for me: it pushes JdbcTableScans and
>>        // joins down to my datasource, with limits of course.
>>        return planner.getCostFactory().makeZeroCost();
>>    }
>> 
>>    // More methods exist to expose the wrapped JdbcRel node. The methods
>>    // above override the equivalent ones of the wrapped JdbcRel node.
>> }
>> 
>> Thanks !
>> 
>> *---------------------*
>> *Muhammad Gelbana*
> 
> 
> 
> -- 
> Regards,
> 
> Atri
> l'apprenant
