[ 
https://issues.apache.org/jira/browse/CALCITE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16051239#comment-16051239
 ] 

Julian Hyde commented on CALCITE-1787:
--------------------------------------

I would remove the "metricName" field. 

Consider the case of the "user" field. We would allow it to be queried via 
"count(distinct user)" (if the "hyperUnique" metric exists) and maybe via 
"where user > 1000" (if some kind of histogram sketch exists) but we would not 
allow people to write "select user from table" because we do not store the raw 
data for "user".

So, the "user" field is virtual. You can query certain expressions derived from 
"user", but you cannot query it itself. That's why I would remove the 
"metricName" field. A complex metric isn't derived from any other metric.

I know a virtual field is a difficult concept for people to get their heads 
around. But it creates a greater simplicity, because it means we are presenting 
the data via the relational model. (Even though the relational data, the 
original rows and columns, has been discarded.)

If someone tried to execute "select user from table", I would imagine that the 
adapter would throw. But if people would prefer that "user" evaluates to some 
expression, say 0, I could support that too. 
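
To make the idea concrete, here is a minimal sketch in Java of how a virtual field might be modeled. It is not the Druid adapter's actual code: the class VirtualColumn, the Usage enum, and the validate method are hypothetical names invented for this illustration. The sketch assumes the adapter knows, per column, which kinds of expression a stored sketch can answer, and that it throws for any other use (the first of the two options above).

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of a virtual column; none of these names are Calcite API. */
final class VirtualColumn {
  /** Ways a query might reference a column. */
  enum Usage { RAW_PROJECTION, COUNT_DISTINCT, RANGE_FILTER }

  private final String name;
  private final Set<Usage> supported;

  VirtualColumn(String name, Set<Usage> supported) {
    this.name = name;
    // Defensive copy that also works when the given set is empty.
    this.supported = EnumSet.noneOf(Usage.class);
    this.supported.addAll(supported);
  }

  boolean supports(Usage usage) {
    return supported.contains(usage);
  }

  /** Throws if the column cannot be used this way; otherwise does nothing. */
  void validate(Usage usage) {
    if (!supports(usage)) {
      throw new UnsupportedOperationException(
          "Column '" + name + "' is virtual and does not support " + usage);
    }
  }
}

class VirtualColumnDemo {
  public static void main(String[] args) {
    // Assumption: only a hyperUnique-style metric exists for "user",
    // so only count(distinct user) can be answered.
    VirtualColumn user =
        new VirtualColumn("user", EnumSet.of(VirtualColumn.Usage.COUNT_DISTINCT));

    user.validate(VirtualColumn.Usage.COUNT_DISTINCT); // accepted

    try {
      user.validate(VirtualColumn.Usage.RAW_PROJECTION); // select user from table
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

If a histogram sketch also existed for "user", the same column would be constructed with RANGE_FILTER in its supported set, which corresponds to allowing "where user > 1000". Returning a constant such as 0 instead of throwing would mean replacing the exception in validate with a substitute expression.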

> thetaSketch Support for Druid Adapter
> -------------------------------------
>
>                 Key: CALCITE-1787
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1787
>             Project: Calcite
>          Issue Type: New Feature
>          Components: druid
>    Affects Versions: 1.12.0
>            Reporter: Zain Humayun
>            Assignee: Zain Humayun
>            Priority: Minor
>
> Currently, the Druid adapter does not support the 
> [thetaSketch|http://druid.io/docs/latest/development/extensions-core/datasketches-aggregators.html]
>  aggregate type, which is used to measure the cardinality of a column 
> quickly. Many Druid instances support theta sketches, so I think it would be 
> a nice feature to have.
> I've been looking at the Druid adapter, and propose we add a new DruidType 
> called {{thetaSketch}} and then add logic in the {{getJsonAggregation}} 
> method in class {{DruidQuery}} to generate the {{thetaSketch}} aggregate. 
> This will require accessing information about the columns (what data type 
> they are) so that the thetaSketch aggregate is only produced if the column's 
> type is {{thetaSketch}}. 
> Also, I've noticed that a {{hyperUnique}} DruidType is currently defined, but 
> a {{hyperUnique}} aggregate is never produced. Since both are approximate 
> aggregators, I could also couple in the logic for {{hyperUnique}}.
> I'd love to hear your thoughts on my approach, and any suggestions you have 
> for this feature.



