[ 
https://issues.apache.org/jira/browse/CALCITE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16443028#comment-16443028
 ] 

slim bouguerra commented on CALCITE-1591:
-----------------------------------------

I think this one can be closed too; we are using extraction functions to 
project expressions on top of Druid columns. 
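
For context, projecting a time expression with an extraction function takes roughly this shape in a Druid dimension spec (the output name and format below are illustrative, not taken from the issue):

{code}
{
  "type": "extraction",
  "dimension": "__time",
  "outputName": "month",
  "extractionFn": {
    "type": "timeFormat",
    "format": "yyyy-MM",
    "timeZone": "UTC"
  }
}
{code}

Because the extraction output is an ordinary dimension, it can then be referenced from a "groupBy" limitSpec for sorting and limiting, which is what lets Druid rather than Hive do that work.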

> Druid adapter: Use "groupBy" query with extractionFn for time dimension
> -----------------------------------------------------------------------
>
>                 Key: CALCITE-1591
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1591
>             Project: Calcite
>          Issue Type: Bug
>          Components: druid
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>            Priority: Major
>
> For queries that aggregate on the time dimension, or a function of it such as 
> {{FLOOR(__time TO DAY)}}, as of the fix for CALCITE-1579 we generate a 
> "groupBy" query that does not sort or apply limit. It would be better (in the 
> sense that Druid is doing more of the work, and Hive is doing less work) if 
> we use an extractionFn to create a dimension that we can sort on.
> In CALCITE-1578, [~nishantbangarwa] gives the following example query:
> {code}
> {
>   "queryType": "groupBy",
>   "dataSource": "druid_tpcds_ss_sold_time_subset",
>   "granularity": "ALL",
>   "dimensions": [
>     "i_brand_id",
>     {
>       "type" : "extraction",
>       "dimension" : "__time",
>       "outputName" :  "year",
>       "extractionFn" : {
>         "type" : "timeFormat",
>         "granularity" : "YEAR"
>       }
>     }
>   ],
>   "limitSpec": {
>     "type": "default",
>     "limit": 10,
>     "columns": [
>       {
>         "dimension": "$f3",
>         "direction": "ascending"
>       }
>     ]
>   },
>   "aggregations": [
>     {
>       "type": "longMax",
>       "name": "$f2",
>       "fieldName": "ss_quantity"
>     },
>     {
>       "type": "doubleSum",
>       "name": "$f3",
>       "fieldName": "ss_wholesale_cost"
>     }
>   ],
>   "intervals": [
>     "1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"
>   ]
> }
> {code}
> and for {{DruidAdapterIt.testGroupByDaySortDescLimit}}, [~bslim] suggests
> {code}
> {
>   "queryType": "groupBy",
>   "dataSource": "foodmart",
>   "granularity": "all",
>   "dimensions": [
>     "brand_name",
>     {
>       "type": "extraction",
>       "dimension": "__time",
>       "outputName": "day",
>       "extractionFn": {
>         "type": "timeFormat",
>         "granularity": "DAY"
>       }
>     }
>   ],
>   "aggregations": [
>     {
>       "type": "longSum",
>       "name": "S",
>       "fieldName": "unit_sales"
>     }
>   ],
>   "limitSpec": {
>     "type": "default",
>     "limit": 30,
>     "columns": [
>       {
>         "dimension": "S",
>         "direction": "ascending"
>       }
>     ]
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
