[
https://issues.apache.org/jira/browse/CALCITE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nishant Bangarwa updated CALCITE-1670:
--------------------------------------
Description:
Right now, COUNT DISTINCT on Druid is pushed down as a 'cardinality' aggregator,
which uses HyperLogLog and returns approximate results. See the cardinality
aggregator section at http://druid.io/docs/latest/querying/aggregations.html for details.
https://github.com/apache/calcite/blob/master/druid/src/main/java/org/apache/calcite/adapter/druid/DruidQuery.java#L721
{code}
case COUNT:
  if (aggCall.isDistinct()) {
    return new JsonCardinalityAggregation("cardinality", name, list);
  }
  return new JsonAggregation("count", name, only);
{code}
The currently recommended way in Druid to get exact counts is to run a nested
groupBy query.
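As a rough illustration of why nesting gives an exact answer, the sketch below mimics the two steps in plain Java on an in-memory list: an inner group-by on the column being counted, then an outer count of the resulting groups. It does not call Druid or Calcite, and the class and method names (NestedGroupByCount, innerGroupBy, exactCountDistinct) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedGroupByCount {

  /** Inner step: group by the column, producing one entry per distinct value. */
  static Map<String, Long> innerGroupBy(List<String> values) {
    // Note: groupingBy rejects null keys, so this sketch assumes non-null values.
    return values.stream()
        .collect(Collectors.groupingBy(v -> v, Collectors.counting()));
  }

  /** Outer step: count the groups that the inner step produced. */
  static long exactCountDistinct(List<String> values) {
    return innerGroupBy(values).size();
  }

  public static void main(String[] args) {
    List<String> users = Arrays.asList("a", "b", "a", "c", "b");
    System.out.println(exactCountDistinct(users)); // prints 3
  }
}
```

In this sketch the inner step materializes one entry per distinct value, so the exact answer costs memory proportional to the cardinality, which is the trade-off against the fixed-size HyperLogLog sketch.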
was:
Right now, COUNT DISTINCT on Druid is pushed down as a 'cardinality' aggregator,
which uses HyperLogLog and returns approximate results. See the cardinality
aggregator section at http://druid.io/docs/latest/querying/aggregations.html for details.
https://github.com/apache/calcite/blob/master/druid/src/main/java/org/apache/calcite/adapter/druid/DruidQuery.java#L721
{code}
case COUNT:
  if (aggCall.isDistinct()) {
    return new JsonCardinalityAggregation("cardinality", name, list);
  }
  return new JsonAggregation("count", name, only);
{code}
> Count distinct on druid is translated to Cardinality aggregator which is
> approximate
> ------------------------------------------------------------------------------------
>
> Key: CALCITE-1670
> URL: https://issues.apache.org/jira/browse/CALCITE-1670
> Project: Calcite
> Issue Type: Bug
> Reporter: Nishant Bangarwa
> Assignee: Julian Hyde
>
> Right now, COUNT DISTINCT on Druid is pushed down as a 'cardinality' aggregator,
> which uses HyperLogLog and returns approximate results. See the cardinality
> aggregator section at http://druid.io/docs/latest/querying/aggregations.html for
> details.
> https://github.com/apache/calcite/blob/master/druid/src/main/java/org/apache/calcite/adapter/druid/DruidQuery.java#L721
> {code}
> case COUNT:
>   if (aggCall.isDistinct()) {
>     return new JsonCardinalityAggregation("cardinality", name, list);
>   }
>   return new JsonAggregation("count", name, only);
> {code}
> The currently recommended way in Druid to get exact counts is to run a nested
> groupBy query.