Julian Hyde created CALCITE-1334:
------------------------------------

             Summary: Converting predicates on date dimension columns into date 
ranges
                 Key: CALCITE-1334
                 URL: https://issues.apache.org/jira/browse/CALCITE-1334
             Project: Calcite
          Issue Type: Bug
            Reporter: Julian Hyde
            Assignee: Julian Hyde


We would like to convert predicates on date dimension columns into date ranges. 
This is particularly useful for Druid, which has a single timestamp column.

Consider the case of a materialized view

{code}
SELECT sales.*, product.*, time_by_day.*
FROM sales
JOIN product USING (product_id)
JOIN time_by_day USING (time_id)
{code}

that corresponds to a Druid table

{noformat}
sales_product_time(
  product_id int not null,
  time_id int not null,
  units int not null,
  the_year int not null,
  the_quarter int not null,
  the_month int not null,
  the_timestamp timestamp not null,
  product_name varchar(20) not null)
{noformat}

And suppose we have the following check constraints:
* {{CHECK the_year = EXTRACT(YEAR FROM the_timestamp)}}
* {{CHECK the_month = EXTRACT(MONTH FROM the_timestamp)}}

Given a query

{code}
SELECT product_id, count(*)
FROM sales
JOIN product USING (product_id)
JOIN time_by_day USING (time_id)
WHERE the_year = 2016
AND the_month IN (4, 5, 6)
{code}

we would like to transform it into the following query to be run against Druid:

{code}
SELECT product_id, count(*)
FROM sales_product_time
WHERE the_timestamp BETWEEN '2016-04-01' AND '2016-06-30'
{code}

Druid can handle timestamp ranges (or disjoint sets of ranges) very efficiently.

I believe we can write a rule that knows the check constraints and also knows 
the properties of the {{EXTRACT}} function:

1. Apply check constraints to convert {{WHERE year = ...}} to {{WHERE 
EXTRACT(YEAR FROM the_timestamp) = ...}}, etc.
2. {{EXTRACT(YEAR FROM ...)}} is monotonic, therefore we can deduce the range 
of the_timestamp values such that {{EXTRACT(YEAR FROM the_timestamp)}} returns 
2016.
3. Then we need to use the fact that {{EXTRACT(MONTH FROM the_timestamp)}} is 
monotonic if {{the_timestamp}} is bounded within a particular year.
4. And we need to merge month ranges somehow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to