Re: Use of noDag parameter in HepPlanner

2019-02-17 Thread Stamatis Zampetakis
Thanks for the additional info Vitalii!

It also made me realize that I had a typo in my previous email.
I meant to write that I tried setting *noDag=false* everywhere (since I
wanted to enable DAGs), but I had failures in various places.
Setting *noDag=true* globally will not work and most likely is not
desirable either.



Re: Use of noDag parameter in HepPlanner

2019-02-17 Thread Vitalii Diravka
Stamatis,

Just FYI, in case it is useful to you:
Drill uses *noDAG: true* as the default value for HepPlanner [1].
After changing it to false, a lot of Drill unit tests failed [2].

[1]
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java#L416
[2] https://travis-ci.org/vdiravka/drill/jobs/494499462

Kind regards
Vitalii




Re: Use of noDag parameter in HepPlanner

2019-02-15 Thread Stamatis Zampetakis
FYI, what I concluded by going through the code and the various test cases
is the following.

By allowing DAGs, the planner can detect common subexpressions in queries
and reuse an existing result without re-applying a rule when that is not
necessary. This should lead to fewer object creations and rule
applications, which may in turn improve performance. In the
existing use cases noDag=false should appear more often, since it is the
default value for two out of three constructors in HepPlanner.
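
To make the sharing concrete, here is a toy sketch in plain Java. It is not
Calcite code: the Expr and Graph classes, the digest scheme and the vertex
counting are all made up for illustration, and the real planner does much
more. It registers Union(Filter(Scan), Filter(Scan)) once as a tree and once
as a DAG and counts the vertices that get created.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of tree vs. DAG registration. Hypothetical, not Calcite code. */
class DagSketch {
    /** Minimal expression node: an operator name plus input expressions. */
    static final class Expr {
        final String op;
        final List<Expr> inputs;

        Expr(String op, Expr... inputs) {
            this.op = op;
            this.inputs = Arrays.asList(inputs);
        }

        /** Canonical string form; structurally equal expressions share it. */
        String digest() {
            StringBuilder sb = new StringBuilder(op).append('(');
            for (int i = 0; i < inputs.size(); i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(inputs.get(i).digest());
            }
            return sb.append(')').toString();
        }
    }

    /** Hypothetical graph that counts the vertices it creates. */
    static final class Graph {
        final boolean noDag;
        final Map<String, Integer> byDigest = new HashMap<>();
        int vertexCount = 0;

        Graph(boolean noDag) {
            this.noDag = noDag;
        }

        /** Registers the expression bottom-up and returns its vertex id. */
        int register(Expr e) {
            for (Expr input : e.inputs) {
                register(input);
            }
            String d = e.digest();
            if (!noDag) {
                Integer existing = byDigest.get(d);
                if (existing != null) {
                    return existing; // DAG mode: reuse the shared vertex
                }
            }
            int id = vertexCount++; // tree mode, or first time this digest is seen
            byDigest.put(d, id);
            return id;
        }
    }

    static int countVertices(Expr root, boolean noDag) {
        Graph g = new Graph(noDag);
        g.register(root);
        return g.vertexCount;
    }

    /** Union(Filter(Scan), Filter(Scan)): both inputs are structurally identical. */
    static Expr sampleQuery() {
        return new Expr("Union",
            new Expr("Filter", new Expr("Scan")),
            new Expr("Filter", new Expr("Scan")));
    }

    public static void main(String[] args) {
        System.out.println("noDag=true  vertices: " + countVertices(sampleQuery(), true));
        System.out.println("noDag=false vertices: " + countVertices(sampleQuery(), false));
    }
}
```

In this toy model the tree form creates five vertices and the DAG form three,
because the duplicated Filter(Scan) branch is registered only once. Any rule
applied to the shared vertex would then also run once instead of twice.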

In principle, using or not using DAGs should give the same
expression in the end, so I would say that using DAGs is always the better
option. I tried setting noDag to always be true, but various tests fail with
StackOverflowError, so it seems there are rules that tend to fire an infinite
number of times as a result of this change. I would tend to think that this
is a bug, but I didn't look further.
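
Purely as an illustration of how a rule can keep firing without end (this is
not a diagnosis of the failing tests), here is a toy sketch in plain Java.
Nothing in it is Calcite API: the string-based expressions, the COMMUTE rule
and the matchLimit parameter are assumptions made for the example. The rule
swaps the inputs of a join, so its output always matches its own pattern and
only the limit stops the driver.

```java
import java.util.function.UnaryOperator;

/** Toy fixpoint driver. Hypothetical, not Calcite code. */
class RuleLoopSketch {
    /** Hypothetical commute rule: rewrites "Join(x,y)" to "Join(y,x)". */
    static final UnaryOperator<String> COMMUTE = e -> {
        if (!e.startsWith("Join(") || !e.endsWith(")")) {
            return e; // no match: leave the expression unchanged
        }
        String[] parts = e.substring(5, e.length() - 1).split(",", 2);
        if (parts.length != 2) {
            return e;
        }
        return "Join(" + parts[1] + "," + parts[0] + ")";
    };

    /**
     * Applies the rule until the expression stops changing or matchLimit
     * firings have happened, and returns the number of firings.
     */
    static int applyToFixpoint(String expr, UnaryOperator<String> rule, int matchLimit) {
        int firings = 0;
        String current = expr;
        while (firings < matchLimit) {
            String next = rule.apply(current);
            if (next.equals(current)) {
                break; // fixpoint reached
            }
            current = next;
            firings++;
        }
        return firings;
    }

    public static void main(String[] args) {
        System.out.println(applyToFixpoint("Join(A,B)", COMMUTE, 100));
        System.out.println(applyToFixpoint("Scan(A)", COMMUTE, 100));
    }
}
```

Here Join(A,B) uses up the whole limit, while Scan(A) never matches and stops
immediately. A driver written recursively and without such a bound would
exhaust the stack instead of spinning.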



Re: Use of noDag parameter in HepPlanner

2019-02-12 Thread Julian Hyde
I don’t recall.

Could you review the tests and see whether tests tend to use noDag=true or 
false most of the time? Are there any tests that use the less popular value, 
and if so, is there a particular reason that those tests use that option?

Julian


> On Feb 12, 2019, at 6:47 AM, Stamatis Zampetakis wrote:
> 
> Hi all,
> 
> I don't understand what the correct way is to set the noDag [1] parameter
> in HepPlanner. I understand what it does (the internal query graph becomes a
> tree or a DAG), but I don't see why I should use one or the other, or
> when.
> 
> Is it performance related?
> Are there implications on the rules that can be used with the planner?
> Does it limit the class of queries that need to be transformed?
> 
> Thanks in advance,
> Stamatis
> 
> [1]
> https://github.com/apache/calcite/blob/883666929478aabe07ee5b9e572c43a6f1a703e2/core/src/main/java/org/apache/calcite/plan/hep/HepPlanner.java#L131