Re: A New GroupBy Syntax

Jialin Qiao Tue, 12 Nov 2019 20:34:04 -0800

Hi,

Great! I like this new grammar. It's very clear and friendly.


Just one question: since the timeInterval may not be equal to the sliding step, 
do we need to return two time columns? One is the start time and the other is 
the end time of each interval.

For example: 

 select count(status), max_value(temperature) from root.ln.wf01.wt01 
 group by ([2019-7-01T08:00:00, 2019-8-31T23:59:59], 3h, 1d);

The result could be this:

| Start Time         | End Time           | count(root.ln.wf01.wt01.status) | max_value(root.ln.wf01.wt01.temperature) |
| ------------------ | ------------------ | ------------------------------- | ---------------------------------------- |
| 2019-7-01T08:00:00 | 2019-7-01T11:00:00 | 1440                            | 25.996933                                |
| 2019-7-02T08:00:00 | 2019-7-02T11:00:00 | 1440                            | 26.102223                                |
| 2019-7-03T08:00:00 | 2019-7-03T11:00:00 | 1440                            | 26.211028                                |
| 2019-7-04T08:00:00 | 2019-7-04T11:00:00 | 1440                            | 25.999288                                |
| 2019-7-05T08:00:00 | 2019-7-05T11:00:00 | 1440                            | 25.958802                                |
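
To make the two-column idea concrete, here is a minimal Python sketch of how the 
start and end time of each interval could be derived from the group by 
parameters. It is not IoTDB code: the function name `window_bounds` is made up 
for illustration, and the boundary handling (a window is kept while its start is 
not after the end time, and its end is clipped to the end time) is my 
assumption, not something the proposal specifies.

```python
from datetime import datetime, timedelta


def window_bounds(start, end, interval, step=None):
    """Return (window_start, window_end) pairs for a sliding group by.

    Hypothetical helper for illustration only, not part of IoTDB.
    `step` defaults to `interval` and must not be smaller than it,
    as described in the proposed grammar.
    """
    if step is None:
        step = interval
    if step < interval:
        raise ValueError("sliding step must be >= time interval")
    bounds = []
    cur = start
    while cur <= end:
        # Assumption: a window that runs past the query end is clipped to it.
        bounds.append((cur, min(cur + interval, end)))
        cur += step
    return bounds


# Parameters taken from the example query above: 3h interval, 1d sliding step.
rows = window_bounds(
    datetime(2019, 7, 1, 8),
    datetime(2019, 8, 31, 23, 59, 59),
    timedelta(hours=3),
    timedelta(days=1),
)
for window_start, window_end in rows[:5]:
    print(window_start.isoformat(), window_end.isoformat())
```

When the sliding step equals the interval, the end time of one row is simply the 
start time of the next, so the second column only carries extra information when 
the two differ.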

Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "田原" <[email protected]>
> Sent: 2019-11-12 16:16:48 (Tuesday)
> To: [email protected]
> Cc: 
> Subject: A New GroupBy Syntax
> 
> Hi,
> 
> 
> Recently, I have been working on "IOTDB-217: A new predicate of time". The 
> JIRA link is https://issues.apache.org/jira/projects/IOTDB/issues/IOTDB-217.
> 
> 
> I think we need a new GroupByClause to support it.
> 
> 
> The new GroupByClause grammar is as follows:
> 
> 
> GroupByClause : LPAREN <TimeInterval> COMMA <TimeUnit> (COMMA <TimeUnit>)? RPAREN
> TimeUnit : Integer <DurationUnit>
> DurationUnit : "ms" | "s" | "m" | "h" | "d" | "w"
> TimeInterval: '[' TimeValue ',' TimeValue ']'
> 
> 
> group by ([startTime, endTime], timeInterval, slidingStep)
> 
> slidingStep is optional. Its default value is equal to timeInterval, and it 
> must be greater than or equal to timeInterval if specified.
> 
> 
> If we want to run an aggregate query over the data from 8:00 a.m. to 11:00 a.m. 
> every day in July and August of 2019, the SQL will look like this:
> 
> 
> select count(status), max_value(temperature) from root.ln.wf01.wt01 
> group by ([2019-7-01T08:00:00, 2019-8-31T23:59:59], 3h, 1d);
> 
> 
> Part of the query result is as follows:
> 
> 
> | Time               | count(root.ln.wf01.wt01.status) | max_value(root.ln.wf01.wt01.temperature) |
> | ------------------ | ------------------------------- | ---------------------------------------- |
> | 2019-7-01T08:00:00 | 1440                            | 25.996933                                |
> | 2019-7-02T08:00:00 | 1440                            | 26.102223                                |
> | 2019-7-03T08:00:00 | 1440                            | 26.211028                                |
> | 2019-7-04T08:00:00 | 1440                            | 25.999288                                |
> | 2019-7-05T08:00:00 | 1440                            | 25.958802                                |
> | 2019-7-06T08:00:00 | 1430                            | 30.222344                                |
> 
> 
> How does the original group by statement correspond to the new grammar? 
> Simply omit the optional 'sliding step' parameter.
> 
> 
> Original group by SQL:
> select count(status), max_value(temperature) from root.ln.wf01.wt01 
> group by (1d, [2017-11-01T00:00:00, 2017-11-07T23:00:00]);
> 
> 
> Transformed to the new group by grammar:
> select count(status), max_value(temperature) from root.ln.wf01.wt01 
> group by ([2017-11-01T00:00:00, 2017-11-07T23:00:00], 1d);
> 
> 
> It can be seen that the new group by SQL is almost identical to the original 
> one when the original group by SQL does not specify a starting point. As for 
> the second, optional parameter of the original group by syntax (the starting 
> point of the axis), I think it overlaps to some extent with the start time of 
> the time interval and increases the complexity of the group by syntax, so the 
> new group by syntax simply discards it.
> 
> 
>
