Re: ElasticSearch Adapter. converting APPROX_COUNT_DISTINCT into Elastic cardinality

Andrei Sereda Tue, 22 Jan 2019 13:23:05 -0800

> Does it make sense to remove this rule from the planner during the
> optimization of queries to Elastic?


Thanks for the hint, Stamatis. I removed
AggregateExpandDistinctAggregatesRule from the Elastic planner and things are
working.
For more info see PR-1008 <https://github.com/apache/calcite/pull/1008>.
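
To illustrate why dropping the rule helps, here is a toy model in Java. It is a sketch only: Agg, expandDistinct and canUseCardinality are hypothetical names, none of them are Calcite classes, and this is not the change made in PR-1008. It mimics the rewrite of a distinct count into two stacked aggregates and shows that the distinct flag is no longer visible afterwards. In Calcite itself the corresponding hook is presumably RelOptPlanner.removeRule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the rewrite; none of these types are Calcite classes. */
final class ExpandDistinctSketch {

  /** Simplified aggregate node with at most one COUNT call. */
  static final class Agg {
    final List<Integer> groupKeys;
    final Integer countArg;  // counted column, or null if the node only groups
    final boolean distinct;  // stands in for AggregateCall.isDistinct()
    final Agg input;         // nested aggregate, or null when reading the scan

    Agg(List<Integer> groupKeys, Integer countArg, boolean distinct, Agg input) {
      this.groupKeys = groupKeys;
      this.countArg = countArg;
      this.distinct = distinct;
      this.input = input;
    }
  }

  /** Rewrites COUNT(DISTINCT x) GROUP BY k into COUNT(x) GROUP BY k over GROUP BY k, x. */
  static Agg expandDistinct(Agg agg) {
    if (!agg.distinct || agg.countArg == null) {
      return agg;
    }
    List<Integer> innerKeys = new ArrayList<>(agg.groupKeys);
    innerKeys.add(agg.countArg);
    Agg inner = new Agg(innerKeys, null, false, agg.input);

    // The inner node outputs its keys in order, so the outer keys become 0..n-1
    // and the counted column is the last inner key.
    List<Integer> outerKeys = new ArrayList<>();
    for (int i = 0; i < agg.groupKeys.size(); i++) {
      outerKeys.add(i);
    }
    return new Agg(outerKeys, innerKeys.size() - 1, false, inner);
  }

  /** An adapter can pick a cardinality aggregation only while the flag is visible. */
  static boolean canUseCardinality(Agg agg) {
    return agg.distinct && agg.countArg != null;
  }

  public static void main(String[] args) {
    // select col1, APPROX_COUNT_DISTINCT(col2) ... group by col1
    Agg original = new Agg(Arrays.asList(0), 1, true, null);
    Agg expanded = expandDistinct(original);
    System.out.println(canUseCardinality(original));  // true
    System.out.println(canUseCardinality(expanded));  // false
    System.out.println(expanded.input.groupKeys);     // [0, 1]
  }
}
```

The expanded shape matches the plan quoted below: group=[{0}] with COUNT($1) on top of group=[{0, 1}].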

On Mon, Jan 21, 2019 at 3:24 AM Stamatis Zampetakis <[email protected]>
wrote:

> Hi Andrei,
>
> From what you say it seems that if AggregateExpandDistinctAggregatesRule
> was not applied you wouldn't have a problem translating this to Elastic.
> Does it make sense to remove this rule from the planner during the
> optimization of queries to Elastic?
>
> Best,
> Stamatis
>
> On Sat, 19 Jan 2019 at 1:35 AM, Andrei Sereda <[email protected]>
> wrote:
>
> > Hello,
> >
> > I’m trying to push down the SQL APPROX_COUNT_DISTINCT() function into
> > Elastic as a cardinality aggregation
> > <https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html>.
> > Example of SQL
> >
> > select col1, APPROX_COUNT_DISTINCT(col2) from elastic group by col1
> >
> > The above gets converted into the following plan (edited for
> > readability):
> >
> > ElasticsearchToEnumerableConverter
> >   ElasticsearchAggregate(group=[{0}], EXPR$1=[COUNT($1)])
> >     ElasticsearchAggregate(group=[{0, 1}])
> >       ElasticsearchProject(EXPR$0=[CAST(ITEM($0, 'col1'))], EXPR$1=[CAST(ITEM($0, 'col2'))])
> >         ElasticsearchTableScan(table=[[elastic, zips]])
> >
> > I presume AggregateExpandDistinctAggregatesRule creates the two
> > aggregations? If so, what is the correct / recommended way to identify
> > them as originating from APPROX_COUNT_DISTINCT in ElasticsearchAggregate
> > <https://github.com/apache/calcite/blob/master/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/ElasticsearchAggregate.java>?
> > Note there is no distinct in the first aggregation.
> >
> > Also note that when multiple columns are used for approx count (select
> > approx(c1), approx(c2)) there is just a single ElasticsearchAggregate, so
> > it is not an issue (since use of cardinality can be inferred from the
> > AggregateCall.isDistinct / isApproximate flags).
> >
> > The Druid adapter has some logic around APPROX_COUNT_DISTINCT(), but it
> > looks too complicated.
> >
> > Any hints would be appreciated.
> >
> > Regards,
> > Andrei.
> >
>
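
As a rough illustration of the point about the isDistinct / isApproximate flags, the following Java sketch maps a single COUNT call to an Elasticsearch aggregation body. The Call class and toElasticAggregation are hypothetical stand-ins rather than the adapter's real code. The "cardinality" and "value_count" aggregation names come from Elasticsearch; rejecting an exact COUNT(DISTINCT) is an assumption made for this sketch, since cardinality is an approximate aggregation.

```java
/** Sketch of choosing an Elasticsearch aggregation from call flags; not adapter code. */
final class CardinalitySketch {

  /** Hypothetical stand-in for the flags carried by Calcite's AggregateCall. */
  static final class Call {
    final String function;
    final String field;
    final boolean distinct;
    final boolean approximate;

    Call(String function, String field, boolean distinct, boolean approximate) {
      this.function = function;
      this.field = field;
      this.distinct = distinct;
      this.approximate = approximate;
    }
  }

  /** Returns the JSON body of the aggregation. The field name is not JSON-escaped. */
  static String toElasticAggregation(Call call) {
    if (!"COUNT".equals(call.function)) {
      throw new IllegalArgumentException("only COUNT is handled in this sketch");
    }
    if (call.distinct && !call.approximate) {
      // Assumption of this sketch: cardinality is approximate, so an exact
      // COUNT(DISTINCT x) is not translated to it.
      throw new UnsupportedOperationException("exact COUNT(DISTINCT) not pushed down");
    }
    // APPROX_COUNT_DISTINCT(x) arrives as COUNT with both flags set.
    String type = call.distinct ? "cardinality" : "value_count";
    return "{\"" + type + "\": {\"field\": \"" + call.field + "\"}}";
  }

  public static void main(String[] args) {
    System.out.println(toElasticAggregation(new Call("COUNT", "col2", true, true)));
    System.out.println(toElasticAggregation(new Call("COUNT", "col2", false, false)));
  }
}
```

This only works while the call still carries its flags, which is the case when the distinct aggregate has not been expanded into two aggregations.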
