[ 
https://issues.apache.org/jira/browse/CALCITE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598673#comment-17598673
 ] 

Julian Hyde edited comment on CALCITE-5155 at 9/1/22 12:50 AM:
---------------------------------------------------------------

I think it is sufficient to do the following:
 * Add variants of {{FLOOR}} and {{CEIL}} SQL functions that take a time unit 
name as an argument. For example, {{FLOOR(d TO '3 hours')}}. The return type is 
always the same as {{d}}.
 * Add a variant of {{EXTRACT}} SQL function that takes a time unit name as an 
argument.
 * Add {{interface UserDefinedTimeUnit}} (see below) and a property 
{{TIME_UNIT_FACTORY}} in 
[CalciteConnectionProperty|https://calcite.apache.org/javadocAggregate/org/apache/calcite/config/CalciteConnectionProperty.html]
 of type {{UserDefinedTimeUnit.Factory}}.

{code:java}
interface UserDefinedTimeUnit {
  // Returns name, e.g. "3 HOUR". Should be upper case, but matching is 
case-insensitive.
  String name();

  // Returns any other aliases.
  List<String> aliasList();

  // If this time unit nests within other time units, returns those time units 
and their
  // multipliers. For "3 HOUR", returns [(1/8, DAY), (3, HOUR)].
  Map<UserDefinedTimeUnit, Rational> parent();

  // Rational number (fraction).  
  class Rational {
    public final int numerator;
    public final int denominator;
  }

  // Built-in time units based on Avatica's enum TimeUnit
  Map<String, UserDefinedTimeUnit> BUILT_IN_TIME_UNIT_MAP;
}
{code}
{*}Note 1{*}. {{FLOOR}} and {{CEIL}} are the canonical functions. There are 
many other functions such as {{TIMESTAMP_TRUNC}} which can be expressed in 
terms of them (both in incoming SQL, handled by the parser and validator for a 
particular {{fun}} setting, and in outgoing SQL generated by the JDBC adapter).

{*}Note 2{*}. Calcite will have rules to optimize {{FLOOR(FLOOR(d TO '8 HOUR') 
TO DAY)}} to {{FLOOR(d TO DAY)}} based on the parent/child relationships. 
Similarly {{CEIL}} of {{CEIL}}, {{FLOOR}} of {{CEIL}}, {{CEIL}} of {{FLOOR}}.

{*}Note 3{*}. {{FLOOR(d TO u)}} and {{CEIL(d TO u)}} are assumed monotonic in 
{{d}} for all time units. Calcite may use this assumption when optimizing range 
filters, sorts. They are also assumed idempotent: for all d, u, {{FLOOR(FLOOR(d 
TO u) TO u)}} = {{FLOOR(CEIL(d TO u) TO u))}} = {{FLOOR(d TO u)}}.

{*}Note 4{*}. Calcite uses the parent-child relationships (and relationships 
inferred by transitive closure and whole-number ratios) in materialized view 
matching. For example, it deduces that {{2 HOUR}} is an ancestor of {{3 HOUR}} 
since both are a descendant of {{DAY}} and 3 divides 12. And therefore a query 
with {{GROUP BY FLOOR(d TO '12 HOUR')}} can be answered by a materialized view 
on {{FLOOR(d TO '3 HOUR')}}.

{*}Note 5{*}. Do we need a new function {{EXTRACT_NAME(unit FROM d)}}? 
{{EXTRACT('quarter' FROM DATE '2020-01-01')}} would return integer 1, but 
{{EXTRACT_NAME('quarter' FROM DATE '2020-01-01')}} would return string 'Q1'. 
Strings may depend on the current locale, and are not monotonic with respect to 
{{d}}.

{*}Note 6{*}. Do we need a function {{{}LAST(d, u1, u2){}}}? For example 
{{LAST(DATE '2020-04-29', 'month', 'quarter')}} returns {{{}DATE 
'2020-06-01'{}}}, the (floor of) June, 2020, the last month in the quarter that 
{{2020-04-29}} belongs to. Perhaps {{LAST(d, u1, u2)}} == 
{{DATE_ADD(DATE_ADD(FLOOR(d TO u2), 1, u2), -1, u1)}}.


was (Author: julianhyde):
I think it is sufficient to do the following:
* Add variants of {{FLOOR}} and {{CEIL}} SQL functions that take a time unit 
name as an argument. For example, {{FLOOR(d TO '3 hours')}}. The return type is 
always the same as {{d}}.
* Add a variant of {{EXTRACT}} SQL function that takes a time unit name as an 
argument.
* Add {{interface UserDefinedTimeUnit}} and a property {{TIME_UNIT_FACTORY}} in 
[CalciteConnectionProperty|https://calcite.apache.org/javadocAggregate/org/apache/calcite/config/CalciteConnectionProperty.html]
 of type {{UserDefinedTimeUnit.Factory}}.

{code}
interface UserDefinedTimeUnit {
  // Returns name, e.g. "3 HOUR". Should be upper case, but matching is 
case-insensitive.
  String name();

  // Returns any other aliases.
  List<String> aliasList();

  // If this time unit nests within other time units, returns those time units 
and their
  // multipliers. For "3 HOUR", returns [(1/8, DAY), (3, HOUR)].
  Map<UserDefinedTimeUnit, Rational> parent();

  // Rational number (fraction).  
  class Rational {
    public final int numerator;
    public final int denominator;
  }

  // Built-in time units based on Avatica's enum TimeUnit
  Map<String, UserDefinedTimeUnit> BUILT_IN_TIME_UNIT_MAP;
}
{code} 

*Note 1*. {{FLOOR}} and {{CEIL}} are the canonical functions. There are many 
other functions such as {{TIMESTAMP_TRUNC}} which can be expressed in terms of 
them (both in incoming SQL, handled by the parser and validator for a 
particular {{fun}} setting, and in outgoing SQL generated by the JDBC adapter).

*Note 2*. Calcite will have rules to optimize {{FLOOR(FLOOR(d TO '8 HOUR') TO 
DAY)}} to {{FLOOR(d TO DAY)}} based on the parent/child relationships. 
Similarly {{CEIL}} of {{CEIL}}, {{FLOOR}} of {{CEIL}}, {{CEIL}} of {{FLOOR}}.

*Note 3*. {{FLOOR(d TO u)}} and {{CEIL(d TO u)}} are assumed monotonic in {{d}} 
for all time units. Calcite may use this assumption when optimizing range 
filters, sorts. They are also assumed idempotent: for all d, u,  
{{FLOOR(FLOOR(d TO u) TO u)}} = {{FLOOR(CEIL(d TO u) TO u))}} = {{FLOOR(d TO 
u)}}.

*Note 4*. Calcite uses the parent-child relationships (and relationships 
inferred by transitive closure and whole-number ratios) in materialized view 
matching. For example, it deduces that {{2 HOUR}} is an ancestor of {{3 HOUR}} 
since both are a descendant of {{DAY}} and 3 divides 12. And therefore a query 
with {{GROUP BY FLOOR(d TO '12 HOUR')}} can be answered by a materialized view 
on {{FLOOR(d TO '3 HOUR')}}.

*Note 5*. Do we need a new function {{EXTRACT_NAME(unit FROM d)}}? 
{{EXTRACT('quarter' FROM DATE '2020-01-01')}} would return integer 1, but 
{{EXTRACT_NAME('quarter' FROM DATE '2020-01-01')}} would return string 'Q1'. 
Strings may depend on the current locale, and are not monotonic with respect to 
{{d}}.

*Note 6*. Do we need a function {{LAST(d, u1, u2)}}? For example {{LAST(DATE 
'2020-04-29', 'month', 'quarter')}} returns {{DATE '2020-06-01'}} the (floor 
of) June, 2020, the last month in the quarter that {{2020-04-29}} belongs to. 
Perhaps {{LAST(d, u1, u2)}} == {{DATE_ADD(DATE_ADD(FLOOR(d TO u2), 1, u2), -1, 
u1)}}.

> Extensible time units
> ---------------------
>
>                 Key: CALCITE-5155
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5155
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Priority: Major
>
> Allow a type system to define its own time units and how they are rolled up.
> Currently, time units are used in the {{EXTRACT}}, {{FLOOR}}, {{TRUNC}} 
> functions, and include {{YEAR}}, {{QUARTER}}, {{MONTH}}, {{HOUR}}, 
> {{MINUTE}}, {{NANOSECOND}}. For example {{FLOOR(t TO HOUR)}} is valid.
> A type system would be allowed to define extra time units. Once a time unit 
> is defined the {{EXTRACT}}, {{FLOOR}} and {{TRUNC}} functions should just 
> work.
> The definition of might consist of a base unit and multiplier. So 
> {{MINUTE15}} would be based on {{MINUTE}} with a multiplier of 15. 
> Various rules know that you can roll up {{FLOOR(t TO DAY)}} to {{FLOOR(t TO 
> MONTH)}} but you cannot roll {{FLOOR(t TO WEEK)}} to {{FLOOR(t TO MONTH)}}. 
> When you define a new time unit, the type system can deduce that full set of 
> time units that it can roll up to, and which can roll up to it.
> Should we support time units that do not evenly divide the next largest time 
> unit? For example the number of seconds since the top of the hour modulo 7. 
> 60 and 3,600 are not a multiples of 7, so {{SecondOfHourMod7}} would be 
> different from {{SecondOfMinuteMod7}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to