Hi Andrei,

From what you say, it seems that if AggregateExpandDistinctAggregatesRule
were not applied, you would have no problem translating this to Elastic.
Would it make sense to remove this rule from the planner when optimizing
queries against Elastic?
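
To illustrate why the plan quoted below shows two aggregates: as far as I
understand it, the rule rewrites a distinct count grouped by col1 into an
inner aggregate that groups by (col1, col2) and an outer aggregate that
counts rows per col1. The following is only a toy sketch in plain Java, not
Calcite code; the class and method names are made up for illustration, and
it assumes the columns contain no nulls.

```java
import java.util.*;

// Hypothetical, self-contained model of the rewrite; not part of Calcite.
class DistinctExpansionSketch {

    // Direct form: SELECT col1, COUNT(DISTINCT col2) ... GROUP BY col1.
    // Each row is a two-element list: [col1, col2].
    static Map<String, Long> countDistinct(List<List<String>> rows) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (List<String> row : rows) {
            seen.computeIfAbsent(row.get(0), k -> new HashSet<>()).add(row.get(1));
        }
        Map<String, Long> result = new TreeMap<>();
        seen.forEach((key, values) -> result.put(key, (long) values.size()));
        return result;
    }

    // Expanded form, mirroring the two aggregates in the plan:
    //   inner: Aggregate(group=[{0, 1}])             -> unique (col1, col2) pairs
    //   outer: Aggregate(group=[{0}], COUNT($1))     -> count pairs per col1
    static Map<String, Long> twoLevel(List<List<String>> rows) {
        Set<List<String>> inner = new LinkedHashSet<>(rows);
        Map<String, Long> outer = new TreeMap<>();
        for (List<String> pair : inner) {
            outer.merge(pair.get(0), 1L, Long::sum);
        }
        return outer;
    }
}
```

Both forms give the same counts, but once the query is in the expanded
form the outer COUNT no longer carries a distinct flag, which would explain
why it is hard to recognise as a cardinality aggregation.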

Best,
Stamatis

On Sat, 19 Jan 2019 at 1:35 AM, Andrei Sereda <[email protected]>
wrote:

> Hello,
>
> I’m trying to push down the SQL APPROX_COUNT_DISTINCT() function into
> Elastic as a cardinality aggregation
> <https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html>.
> Example SQL:
>
> select col1, APPROX_COUNT_DISTINCT(col2) from elastic group by col1
>
> The above gets converted into the following plan (edited for readability):
>
> ElasticsearchToEnumerableConverter
>   ElasticsearchAggregate(group=[{0}], EXPR$1=[COUNT($1)])
>     ElasticsearchAggregate(group=[{0, 1}])
>       ElasticsearchProject(EXPR$0=[CAST(ITEM($0, 'col1'))
> EXPR$1=[CAST(ITEM($0, 'col2'))])
>         ElasticsearchTableScan(table=[[elastic, zips]])
>
> I presume AggregateExpandDistinctAggregatesRule creates the two
> aggregations? If so, what is the correct / recommended way to identify
> them as originating from APPROX_COUNT_DISTINCT in ElasticsearchAggregate
> <https://github.com/apache/calcite/blob/master/elasticsearch/src/main/java/org/apache/calcite/adapter/elasticsearch/ElasticsearchAggregate.java>?
> Note that there is no distinct in the first aggregation.
>
> Also note that when multiple columns are used for approx count (select
> approx(c1), approx(c2)) there is just a single ElasticsearchAggregate, so it
> is not an issue (since the use of cardinality can be inferred from the
> AggregateCall.isDistinct / isApproximate flags).
>
> The Druid adapter has some logic around APPROX_COUNT_DISTINCT(), but it
> looks too complicated.
>
> Any hints would be appreciated.
>
> Regards,
> Andrei.
>
