[
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022551#comment-17022551
]
Rui Wang edited comment on CALCITE-3737 at 1/23/20 10:15 PM:
-------------------------------------------------------------
I addressed your comments and have responses to two of them:
> Can HOP and TUMBLE share implementation?
I tried to share most of the code and implement only the windowing part
(computing window_start and window_end) separately. I later gave that up
because hopping needs to call a function that returns a list of window_start
and window_end pairs, and we do not know the size of that list in advance, so
we cannot really write a for loop in Java. (Note that I need to build a list of
linq4j expressions; see the discussion here:
[link|https://lists.apache.org/thread.html/86e5aa132de0656419843cab6c1f4fbea5941d4401dbde36cc11827e%40%3Cdev.calcite.apache.org%3E]).
Also, I plan to add per-key sessionization and bucket_gap_filling table
functions later. Their code will be even more complicated and even less
shareable. For example, per-key sessionization needs to see all of the data
first and then sort it to find each window's start and end. So I would prefer
to implement those the same way as hopping (e.g. by providing an
AbstractEnumerable<Object[]> implementation).
As I build more table functions and add support for streaming SQL, if I find a
better way to unify the table function implementations, I will submit patches
for that.
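To illustrate the point about the unknown list size, here is a minimal sketch
of the "one input row, variable number of output rows" shape. It is not the
code from the pull request. It uses a plain java.util.Iterator as a stand-in
for a linq4j AbstractEnumerable<Object[]>, and the names HopRows and timeIndex
are hypothetical. It also assumes that window starts are aligned to multiples
of the hop size counted from zero and that window_start and window_end are
appended after the input columns.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for an AbstractEnumerable<Object[]> over hopping windows. */
final class HopRows implements Iterable<Object[]> {
  private final Iterable<Object[]> input;
  private final int timeIndex; // position of the event-time column (a Long)
  private final long dur;      // window size
  private final long hopsize;  // distance between consecutive window starts

  HopRows(Iterable<Object[]> input, int timeIndex, long dur, long hopsize) {
    if (dur <= 0 || hopsize <= 0) {
      throw new IllegalArgumentException("dur and hopsize must be positive");
    }
    this.input = input;
    this.timeIndex = timeIndex;
    this.dur = dur;
    this.hopsize = hopsize;
  }

  @Override public Iterator<Object[]> iterator() {
    final Iterator<Object[]> rows = input.iterator();
    return new Iterator<Object[]>() {
      private Object[] row; // input row currently being expanded
      private long start;   // next window start to emit for that row
      private long last;    // latest window start that still contains the row

      @Override public boolean hasNext() {
        // Advance to the next input row that still has windows to emit.
        // A row may have no window at all when hopsize > dur.
        while (row == null || start > last) {
          if (!rows.hasNext()) {
            return false;
          }
          row = rows.next();
          long ts = (Long) row[timeIndex];
          last = Math.floorDiv(ts, hopsize) * hopsize;
          // Smallest aligned start that is strictly greater than ts - dur.
          start = Math.floorDiv(ts - dur, hopsize) * hopsize + hopsize;
        }
        return true;
      }

      @Override public Object[] next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        // Copy the input columns and append window_start and window_end.
        Object[] out = Arrays.copyOf(row, row.length + 2);
        out[row.length] = start;
        out[row.length + 1] = start + dur;
        start += hopsize;
        return out;
      }
    };
  }
}
```

Because the number of windows per row is only known once the row's timestamp
is read, the expansion happens inside the iterator at run time rather than in
a loop of fixed length written out ahead of time.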
>Changes to reference.md need some copy-editing.
I reviewed the changes in reference.md and made some edits. However, I am not
a native English speaker, so I may not have fixed everything you had in mind.
was (Author: amaliujia):
I addressed your comments and have responses to two of them:
> Can HOP and TUMBLE share implementation?
I tried to share most of the code and implement only the windowing part
(computing window_start and window_end) separately. I later gave that up
because hopping needs to call a function that returns a list of window_start
and window_end pairs, and we do not know the size of that list in advance, so
we cannot really write a for loop in Java. (Note that I need to build a list of
linq4j expressions; see the discussion here:
[link|https://lists.apache.org/thread.html/86e5aa132de0656419843cab6c1f4fbea5941d4401dbde36cc11827e%40%3Cdev.calcite.apache.org%3E]).
Also, I plan to add per-key sessionization and bucket_gap_filling table
functions later. Their code will be even more complicated, so I would prefer
to implement those the same way as hopping (e.g. by providing an
AbstractEnumerable<Object[]> implementation).
>Changes to reference.md need some copy-editing.
I reviewed the changes in reference.md and made some edits. However, I am not
a native English speaker, so I may not have fixed everything you had in mind.
> HOP Table-valued Function
> -------------------------
>
> Key: CALCITE-3737
> URL: https://issues.apache.org/jira/browse/CALCITE-3737
> Project: Calcite
> Issue Type: Sub-task
> Reporter: Rui Wang
> Assignee: Rui Wang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> Hopping windows place intervals of a fixed size evenly spaced across event
> time. Most importantly, in the most common use a given event time timestamp
> will generally fall into more than one window.
> The table-valued function Hop may produce zero, one, or multiple rows
> corresponding to each row of input. Hop takes four required parameters and
> one optional parameter. All parameters are analogous to those for Tumble
> except for hopsize, which specifies the duration between the starting points
> (and endpoints) of the hopping windows, allowing for overlapping windows
> (hopsize < dur, common) or gaps in the data (hopsize > dur, rarely useful).
> {code:java}
> Hop (data , timecol , dur, hopsize)
> {code}
> The return value of Hop is a relation that includes all columns of data as
> well as additional event time columns wstart and wend. Here is an example
> (from https://s.apache.org/streaming-beam-sql ):
> {code:sql}
> SELECT *
> FROM Hop (
> data => TABLE Bids ,
> timecol => DESCRIPTOR ( bidtime ) ,
> dur => INTERVAL '10' MINUTES ,
> hopsize => INTERVAL '5' MINUTES );
> ------------------------------------------
> | wstart | wend | bidtime | price | item |
> ------------------------------------------
> | 8:00 | 8:10 | 8:07 | $2 | A |
> | 8:05 | 8:15 | 8:07 | $2 | A |
> | 8:05 | 8:15 | 8:11 | $3 | B |
> | 8:10 | 8:20 | 8:11 | $3 | B |
> | 8:00 | 8:10 | 8:05 | $4 | C |
> | 8:05 | 8:15 | 8:05 | $4 | C |
> | 8:00 | 8:10 | 8:09 | $5 | D |
> | 8:05 | 8:15 | 8:09 | $5 | D |
> | 8:05 | 8:15 | 8:13 | $1 | E |
> | 8:10 | 8:20 | 8:13 | $1 | E |
> | 8:10 | 8:20 | 8:17 | $6 | F |
> | 8:15 | 8:25 | 8:17 | $6 | F |
> ------------------------------------------
> {code}
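The window assignment in the example table can be reproduced with a small
amount of arithmetic. The sketch below is an illustration only, not the
Calcite implementation. The class name HopWindows and the method assign are
hypothetical, timestamps and durations are plain long values in one shared
unit (minutes since midnight in the usage example), windows are treated as
half-open intervals [wstart, wend), and window starts are assumed to be
aligned to multiples of hopsize counted from zero.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper that lists the hopping windows containing a timestamp. */
final class HopWindows {
  private HopWindows() {}

  /**
   * Returns {wstart, wend} pairs, in ascending order of wstart, for every
   * window [wstart, wstart + dur) that contains t. The list is empty when t
   * falls into a gap, which can only happen if hopsize > dur.
   */
  static List<long[]> assign(long t, long dur, long hopsize) {
    if (dur <= 0 || hopsize <= 0) {
      throw new IllegalArgumentException("dur and hopsize must be positive");
    }
    List<long[]> windows = new ArrayList<>();
    // Latest aligned start that is not after t.
    long latestStart = Math.floorDiv(t, hopsize) * hopsize;
    // Walk backwards while the window still covers t, i.e. start + dur > t.
    for (long start = latestStart; start > t - dur; start -= hopsize) {
      windows.add(new long[] {start, start + dur});
    }
    Collections.reverse(windows);
    return windows;
  }

  public static void main(String[] args) {
    // bidtime 8:07 with dur = 10 minutes and hopsize = 5 minutes.
    for (long[] w : assign(8 * 60 + 7, 10, 5)) {
      System.out.printf("%d:%02d - %d:%02d%n",
          w[0] / 60, w[0] % 60, w[1] / 60, w[1] % 60);
    }
  }
}
```

For the bid at 8:07 this prints the windows 8:00 - 8:10 and 8:05 - 8:15, which
matches the two rows for item A in the table. With hopsize > dur the same
method can return an empty list, which corresponds to the "zero rows" case in
the description.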