Hello Julian,

Thank you for the reply.
I agree that there has to be a common mechanism for combining equivalent
nodes. In that case it may be up to Drill to define what counts as
equivalent, because the default equivalence check does not distinguish two
nodes that are identical except for their configuration objects.

I should have been clearer in my earlier mail about the specific case I am
having an issue with.

explain plan for
SELECT t1.driverlicense, t2.driverlicense,t2.`id.ssn` as cnt FROM
table(dfs.`/tmp/index_hint_test_primary`(type => 'maprdb', index =>
'i_lic')) as t1, table(dfs.`/tmp/index_hint_test_primary`(type => 'maprdb',
index => 'i_ssn')) as t2
where t1.driverlicense = t2.driverlicense and t1.driverlicense = 100000003
and t2.id.ssn = 100000003;

In the above example t1 and t2 are exactly the same apart from the index
hint. One needs to pick i_lic and the other needs to pick i_ssn.

What I observed is that Drill creates a tree along the following lines:


                      ----- table with index 'i_lic'
Filter---join ---
                      ----- table with index 'i_ssn'

Once the HEP planner kicks in, it tries to build a graph of HepRel nodes.
Before adding each node, it checks whether an equivalent node is already
present in the graph.

In this specific case table t1 is already in the graph when table t2 is
processed. The planner therefore treats the two as equivalent and turns the
tree into a graph in which both join inputs share one node. This causes a
problem as soon as the next optimization rule fires (in this case it could
not find the relevant columns), and planning dies with a
NullPointerException.
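
To make the merge concrete, here is a minimal, self-contained sketch of the
behaviour described above. It does not use Calcite or Drill classes: ToyScan
and ToyGraph are hypothetical stand-ins, and the digest format is an
assumption chosen only to mimic a digest that leaves out the index hint.

```java
import java.util.HashMap;
import java.util.Map;

public class DigestDedupSketch {

    /** Hypothetical stand-in for a scan node; not Calcite's RelNode. */
    static final class ToyScan {
        final int id;
        final String table;
        final String indexHint; // configuration the digest below ignores

        ToyScan(int id, String table, String indexHint) {
            this.id = id;
            this.table = table;
            this.indexHint = indexHint;
        }

        /** Assumed digest: built from the table name only, without id or hint. */
        String digest() {
            return "Scan(table=" + table + ")";
        }
    }

    /** Hypothetical stand-in for a planner graph that registers nodes by digest. */
    static final class ToyGraph {
        private final Map<String, ToyScan> byDigest = new HashMap<>();

        /** Returns the node already registered under the same digest, if any. */
        ToyScan add(ToyScan rel) {
            ToyScan existing = byDigest.putIfAbsent(rel.digest(), rel);
            return existing != null ? existing : rel;
        }

        int size() {
            return byDigest.size();
        }
    }

    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        ToyScan t1 = graph.add(new ToyScan(1, "/tmp/index_hint_test_primary", "i_lic"));
        ToyScan t2 = graph.add(new ToyScan(2, "/tmp/index_hint_test_primary", "i_ssn"));

        System.out.println(t1 == t2);     // true: the second scan was merged into the first
        System.out.println(t2.indexHint); // i_lic: the i_ssn hint is lost
    }
}
```

In this toy model the join ends up with the same node on both sides, which
is the shape I believe the later rule trips over.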

As to equivalence: even though these tables are the same, they are still
different in the context of the whole query. I think we need to distinguish
these cases.
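
One possible way to distinguish them, sketched under the same toy
assumptions (ToyScan and countDistinct are hypothetical names, not Calcite or
Drill APIs), is to let the configuration take part in the digest. Scans that
differ only in the index hint then stay separate, while truly identical
scans are still combined.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfigAwareDigestSketch {

    /** Hypothetical stand-in for a scan node; not Drill's DrillScanRel. */
    static final class ToyScan {
        final String table;
        final String indexHint;

        ToyScan(String table, String indexHint) {
            this.table = table;
            this.indexHint = indexHint;
        }

        /** Assumed digest that includes the configuration (the index hint). */
        String digest() {
            return "Scan(table=" + table + ", index=" + indexHint + ")";
        }
    }

    /** Registers scans by digest and returns how many distinct nodes remain. */
    static int countDistinct(ToyScan... scans) {
        Map<String, ToyScan> byDigest = new LinkedHashMap<>();
        for (ToyScan scan : scans) {
            byDigest.putIfAbsent(scan.digest(), scan);
        }
        return byDigest.size();
    }

    public static void main(String[] args) {
        ToyScan lic = new ToyScan("/tmp/index_hint_test_primary", "i_lic");
        ToyScan ssn = new ToyScan("/tmp/index_hint_test_primary", "i_ssn");
        ToyScan licAgain = new ToyScan("/tmp/index_hint_test_primary", "i_lic");

        System.out.println(countDistinct(lic, ssn));      // 2: different hints stay apart
        System.out.println(countDistinct(lic, licAgain)); // 1: identical content still merges
    }
}
```

This keeps the property that the digest is not unique per object, while
making the hint part of what "identical content" means.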

Please correct my understanding if I am assuming anything wrong here.

Thanks,
-Hanu



On Tue, Aug 8, 2017 at 3:43 PM, Julian Hyde <jh...@apache.org> wrote:

> It is important that the digest is not unique. We want to identify
> RelNodes that are different objects but have identical content. Such
> RelNodes will have identical digests. Then we can combine these into one
> when we register them in a Volcano planner.
>
>
> > On Aug 8, 2017, at 10:55 AM, Hanumath Rao Maduri <hanu....@gmail.com>
> wrote:
> >
> > Hello All,
> >
> > I was looking at a bug (in Apache Drill) where in a join query with table
> > options with different configurations but having same table name is causing
> > a NULL pointer exception. In the case of the table function it preprocesses
> > and converts the TranslatableTable to DrillScanRel (here it is a subclass
> > of AbstractRelNode).
> >
> > Please do not worry about Drill specific nodes here. However during the
> > HepPlanning phase drill calls setRoot
> >
> >    public void setRoot(RelNode rel) {
> >        this.root = this.addRelToGraph(rel);
> >        this.dumpGraph();
> >    }
> >
> > This prepares the graph. But during debugging this code what I observed is
> > the
> >
> > AbstractRelNode.java
> >
> >  public String recomputeDigest() {
> >    String tempDigest = computeDigest();
> >    assert tempDigest != null : "post: return != null";
> >    String prefix = "rel#" + id + ":";
> >
> >    // Substring uses the same underlying array of chars, so saves a bit
> >    // of memory.
> >    this.desc = prefix + tempDigest;
> >    this.digest = this.desc.substring(prefix.length());
> >    return this.digest;
> >  }
> >
> >
> > original value for digest is assigned as follows
> >    this.digest = getRelTypeName() + "#" + id;
> >
> > This is unique because the id value is unique, whereas in recomputeDigest
> > the digest will not be unique, as this.desc.substring(prefix.length()) is
> > skipping the id part
> >
> >
> > I am not sure whether this is an issue. I could fix the particular
> > drill case by overriding the recomputeDigest in DrillScanRel.
> >
> >
> > Please advise me if this is an issue at the calcite code.
> >
> >
> > Thanks,
> > -Hanu
>
>
