[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-31 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106909#comment-16106909
 ] 

Dian Fu edited comment on FLINK-7293 at 7/31/17 7:35 AM:
-

I agree that we can sort by both the time and the custom order-by upstream and
then forward the results to CEP. However, I still think that having custom sort
logic in CEP itself is very important. It would give users of the CEP API more
flexibility to control the order of the events being matched. In CEP, the order
of the events is very important. Without this feature, the matched results will
be non-deterministic for many use cases.
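
To make the non-determinism concrete, here is a minimal Python sketch. It is
not Flink code: the (event_time, price) tuple shape and the function name
count_price_drops are assumptions made purely for illustration. Two events
share the same event time, so a sort on time alone leaves their relative order
to whatever order they arrived in, and an order-sensitive check then gives
different answers for the same input.

```python
from typing import List, Tuple

# Hypothetical event shape for illustration: (event_time, price).
Event = Tuple[int, float]


def count_price_drops(events: List[Event]) -> int:
    """Count positions where the price is lower than the previous event's price."""
    return sum(1 for prev, cur in zip(events, events[1:]) if cur[1] < prev[1])


# The same three events in two arrival orders; the last two share event time 2.
arrival_a = [(1, 10.0), (2, 9.0), (2, 8.0)]
arrival_b = [(1, 10.0), (2, 8.0), (2, 9.0)]

# Sorting on event time alone (Python's sort is stable) keeps ties in arrival order.
by_time_a = sorted(arrival_a, key=lambda e: e[0])
by_time_b = sorted(arrival_b, key=lambda e: e[0])

# Adding the price as a secondary key makes the result independent of arrival order.
by_time_price_a = sorted(arrival_a, key=lambda e: (e[0], e[1]))
by_time_price_b = sorted(arrival_b, key=lambda e: (e[0], e[1]))

print(count_price_drops(by_time_a), count_price_drops(by_time_b))
print(by_time_price_a == by_time_price_b)
```

With time-only ordering the two arrival orders yield different drop counts,
while the (time, price) ordering yields the same sequence for both.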


was (Author: dian.fu):
I agree that we can sort by both the time and the custom order-by upstream and
then forward the results to CEP. However, I still think that having custom sort
logic in CEP itself is very important. It would give users of the CEP API more
flexibility to control the order of the events being matched. In CEP, the order
of the events is very important; without this feature, the matched results will
be non-deterministic for many use cases.

> Support custom order by in PatternStream
> 
>
> Key: FLINK-7293
> URL: https://issues.apache.org/jira/browse/FLINK-7293
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP
>Reporter: Dian Fu
>Assignee: Dian Fu
>
> Currently, when {{ProcessingTime}} is configured, events are fed to the NFA
> in order of arrival time, and when {{EventTime}} is configured, events are
> fed to the NFA in order of event time. PatternStream should also support a
> custom {{order by}} so that users can define the order of the events in
> addition to the above factors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-31 Thread Dawid Wysakowicz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106882#comment-16106882
 ] 

Dawid Wysakowicz edited comment on FLINK-7293 at 7/31/17 7:04 AM:
--

I agree that there is a sort by time, but if the operator before CEP sorted by
both the time and the custom order, the sorting in CEP would not have any
impact.

You could even reuse the code from DataStreamSort.


was (Author: dawidwys):
I agree that there is a sort by time, but if the operator before CEP sorted by
both the time and the custom order, the sorting in CEP would not have any
impact.



[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-31 Thread Dawid Wysakowicz (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106882#comment-16106882
 ] 

Dawid Wysakowicz edited comment on FLINK-7293 at 7/31/17 7:04 AM:
--

I agree that there is a sort by time, but if the operator before CEP sorted by
both the time and the custom order, the sorting in CEP would not have any
impact.


was (Author: dawidwys):
I agree that there is a sort by time, but if the operator before CEP sorted by
both the time and the custom order, the sorting in CEP would not have any
impact.

You could even reuse the code from DataStreamSort.



[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-31 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106877#comment-16106877
 ] 

Dian Fu edited comment on FLINK-7293 at 7/31/17 7:00 AM:
-

As {{event-time}}/{{process-time}} has higher priority over the custom
{{order by}}, we cannot first apply the custom sort and then pass the result to
the CEP library.
{quote}
This is the same case as in DataStream, which does not have sort function.
{quote}
Actually, there are some differences. For example, there is no sort logic in
DataStream at all, so all the sort logic can be implemented in the Table API.
The CEP library, however, already has sort logic (event time), which makes it
impossible to implement the sort in the Table API without making the changes in
this JIRA. Thoughts?


was (Author: dian.fu):
As {{event-time}}/{{process-time}} has higher priority over the custom
{{order by}}, we cannot first apply the custom sort and then pass the result to
the CEP library.
{quote}
This is the same case as in DataStream, which does not have sort function.
{quote}
Actually, there are some differences. For example, there is no sort logic in
DataStream at all, so all the sort logic can be implemented in the Table API.
The CEP library, however, already has sort logic (event time), which means we
cannot implement the sort in the Table API alone. Thoughts?



[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-31 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106877#comment-16106877
 ] 

Dian Fu edited comment on FLINK-7293 at 7/31/17 6:59 AM:
-

As {{event-time}}/{{process-time}} has higher priority over the custom
{{order by}}, we cannot first apply the custom sort and then pass the result to
the CEP library.
{quote}
This is the same case as in DataStream, which does not have sort function.
{quote}
Actually, there are some differences. For example, there is no sort logic in
DataStream at all, so all the sort logic can be implemented in the Table API.
The CEP library, however, already has sort logic (event time), which means we
cannot implement the sort in the Table API alone. Thoughts?


was (Author: dian.fu):
As {{event-time}}/{{process-time}} has higher priority over the custom
{{order by}}, we cannot first apply the custom sort and then pass the result to
the CEP library.
{quote}
This is the same case as in DataStream, which does not have sort function.
{quote}
Actually, there are some differences. For example, there is no sort function in
DataStream at all, so all the sort logic can be implemented in the Table API.
The CEP library, however, already has sort logic (event time), which means we
cannot implement the sort in the Table API alone. Thoughts?



[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-31 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16106877#comment-16106877
 ] 

Dian Fu edited comment on FLINK-7293 at 7/31/17 6:59 AM:
-

As {{event-time}}/{{process-time}} has higher priority over the custom
{{order by}}, we cannot first apply the custom sort and then pass the result to
the CEP library.
{quote}
This is the same case as in DataStream, which does not have sort function.
{quote}
Actually, there are some differences. For example, there is no sort function in
DataStream at all, so all the sort logic can be implemented in the Table API.
The CEP library, however, already has sort logic (event time), which means we
cannot implement the sort in the Table API alone. Thoughts?


was (Author: dian.fu):
As {{event-time}}/{{process-time}} has higher priority than the custom
{{order by}}, we cannot first apply the custom sort and then pass the result to
the CEP library.
{quote}
This is the same case as in DataStream, which does not have sort function.
{quote}
Actually, there are some differences. For example, there is no sort function in
DataStream at all, so all the sort logic can be implemented in the Table API.
The CEP library, however, already has sort logic (event time), which means we
cannot implement the sort in the Table API alone. Thoughts?



[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-28 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105963#comment-16105963
 ] 

Dian Fu edited comment on FLINK-7293 at 7/29/17 1:59 AM:
-

{quote}
Could you explain a bit why this is needed? 
{quote}
As we need to support clauses such as
{code}
SELECT *
FROM Ticker MATCH_RECOGNIZE (
 PARTITION BY symbol
 ORDER BY tstamp, price
 MEASURES  STRT.tstamp AS start_tstamp,
   LAST(DOWN.tstamp) AS bottom_tstamp,
   LAST(UP.tstamp) AS end_tstamp
 ONE ROW PER MATCH
 AFTER MATCH SKIP TO LAST UP
 PATTERN (STRT DOWN+ UP+)
 DEFINE
DOWN AS DOWN.price < PREV(DOWN.price),
UP AS UP.price > PREV(UP.price)
 ) MR
{code}
There may be multiple columns to {{order by}}, here {{tstamp}} and {{price}}.
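
As a rough illustration of what {{ORDER BY tstamp, price}} implies for the
matcher, the following Python sketch orders rows by both columns and then looks
for the first STRT DOWN+ UP+ shape. It is a simplified greedy scan over a
single partition, not an implementation of MATCH_RECOGNIZE (PARTITION BY and
AFTER MATCH SKIP are ignored); the row layout and the function name
first_v_shape are assumptions made for the example.

```python
from operator import itemgetter
from typing import Dict, List, Optional, Tuple

# Hypothetical row layout for illustration: {"tstamp": int, "price": float}.
Row = Dict[str, float]


def first_v_shape(rows: List[Row]) -> Optional[Tuple[float, float, float]]:
    """Return (start_tstamp, bottom_tstamp, end_tstamp) of the first
    STRT DOWN+ UP+ run in already-ordered rows, or None if there is none."""
    n = len(rows)
    for start in range(n):
        i = start + 1
        # DOWN+: each row's price is below the previous row's price.
        while i < n and rows[i]["price"] < rows[i - 1]["price"]:
            i += 1
        bottom = i - 1
        if bottom == start:
            continue  # no DOWN row after this STRT candidate
        # UP+: each row's price is above the previous row's price.
        while i < n and rows[i]["price"] > rows[i - 1]["price"]:
            i += 1
        end = i - 1
        if end == bottom:
            continue  # no UP row after the bottom
        return (rows[start]["tstamp"], rows[bottom]["tstamp"], rows[end]["tstamp"])
    return None


unordered = [
    {"tstamp": 3, "price": 11.0},
    {"tstamp": 1, "price": 12.0},
    {"tstamp": 2, "price": 10.0},
    {"tstamp": 2, "price": 9.0},
    {"tstamp": 4, "price": 13.0},
    {"tstamp": 5, "price": 12.5},
]

# ORDER BY tstamp, price: tstamp is the primary key, price breaks ties.
ordered = sorted(unordered, key=itemgetter("tstamp", "price"))
match = first_v_shape(ordered)
print(match)
```

The two rows with tstamp 2 are the interesting case: only the secondary key
fixes which of them the matcher sees first.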

{quote}
I can't see a way to sort an unbounded stream of data  Could you elaborate a 
bit how do you see it working?
how this is going to play well with the Time semantics.
When both event-time and a custom order-by is used, who is going to win?
{quote}
This works in the same way as the implementation of {{sort by}} in the Table
API. That is to say, both the event time and the custom order-by will be used:
the event time takes higher priority and the custom order-by lower priority.
When both event time and a custom order-by are used, incoming events are first
ordered by event time. When a watermark arrives, the events before the
watermark that share the same event time are then ordered by the custom
order-by before being emitted (please refer to
[DataStreamSort.scala|https://github.com/apache/flink/blob/b8c8f204de718e6d5b7c3df837deafaed7c375f5/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala]
 for more details).
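
The buffering described above could be sketched roughly as follows. This is
plain Python rather than the actual DataStreamSort or CEP operator code, and
WatermarkSorter, on_event, and on_watermark are hypothetical names chosen for
the example. Events are buffered per event time; when a watermark arrives,
every buffered timestamp at or below it is emitted in time order, with ties
ordered by the custom key.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical event shape for illustration: (event_time, price).
Event = Tuple[int, float]


class WatermarkSorter:
    """Buffers events and releases them in (event time, custom key) order."""

    def __init__(self, custom_key: Callable[[Event], Any]) -> None:
        self._custom_key = custom_key
        self._buffer: Dict[int, List[Event]] = defaultdict(list)

    def on_event(self, event: Event) -> None:
        # Group by event time; nothing is emitted until a watermark arrives.
        self._buffer[event[0]].append(event)

    def on_watermark(self, watermark: int) -> List[Event]:
        # Event time has higher priority: walk ready timestamps in ascending order.
        ready = sorted(ts for ts in self._buffer if ts <= watermark)
        emitted: List[Event] = []
        for ts in ready:
            # The custom order-by only decides the order among equal timestamps.
            emitted.extend(sorted(self._buffer.pop(ts), key=self._custom_key))
        return emitted


sorter = WatermarkSorter(custom_key=lambda e: e[1])  # custom order-by: price
for ev in [(2, 9.0), (1, 10.0), (2, 8.0), (5, 7.0)]:
    sorter.on_event(ev)

first_batch = sorter.on_watermark(3)
second_batch = sorter.on_watermark(5)
print(first_batch, second_batch)
```

Events later than the watermark stay buffered, which is why the event at time
5 only appears in the second batch.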

Thoughts?




was (Author: dian.fu):
{quote}
Could you explain a bit why this is needed? 
{quote}
As we need to support clauses such as
{code}
SELECT *
FROM Ticker MATCH_RECOGNIZE (
 PARTITION BY symbol
 ORDER BY tstamp, price
 MEASURES  STRT.tstamp AS start_tstamp,
   LAST(DOWN.tstamp) AS bottom_tstamp,
   LAST(UP.tstamp) AS end_tstamp
 ONE ROW PER MATCH
 AFTER MATCH SKIP TO LAST UP
 PATTERN (STRT DOWN+ UP+)
 DEFINE
DOWN AS DOWN.price < PREV(DOWN.price),
UP AS UP.price > PREV(UP.price)
 ) MR
{code}
There may be multiple columns to order by.

{quote}
I can't see a way to sort an unbounded stream of data  Could you elaborate a 
bit how do you see it working?
how this is going to play well with the Time semantics.
When both event-time and a custom order-by is used, who is going to win?
{quote}
This works in the same way as the implementation of {{sort by}} in the Table
API. That is to say, both the event time and the custom order-by will be used:
the event time should take higher priority and the custom order-by lower
priority. When both event time and a custom order-by are used, incoming events
are first ordered by event time. When a watermark arrives, the events before
the watermark that share the same event time are then ordered by the custom
order-by before being emitted (please refer to
[DataStreamSort.scala|https://github.com/apache/flink/blob/b8c8f204de718e6d5b7c3df837deafaed7c375f5/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala]
 for more details).

Thoughts?





[jira] [Comment Edited] (FLINK-7293) Support custom order by in PatternStream

2017-07-28 Thread Dian Fu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105963#comment-16105963
 ] 

Dian Fu edited comment on FLINK-7293 at 7/29/17 1:30 AM:
-

{quote}
Could you explain a bit why this is needed? 
{quote}
As we need to support clauses such as
{code}
SELECT *
FROM Ticker MATCH_RECOGNIZE (
 PARTITION BY symbol
 ORDER BY tstamp, price
 MEASURES  STRT.tstamp AS start_tstamp,
   LAST(DOWN.tstamp) AS bottom_tstamp,
   LAST(UP.tstamp) AS end_tstamp
 ONE ROW PER MATCH
 AFTER MATCH SKIP TO LAST UP
 PATTERN (STRT DOWN+ UP+)
 DEFINE
DOWN AS DOWN.price < PREV(DOWN.price),
UP AS UP.price > PREV(UP.price)
 ) MR
{code}
There may be multiple columns to order by.

{quote}
I can't see a way to sort an unbounded stream of data  Could you elaborate a 
bit how do you see it working?
how this is going to play well with the Time semantics.
When both event-time and a custom order-by is used, who is going to win?
{quote}
This works in the same way as the implementation of {{sort by}} in the Table
API. That is to say, both the event time and the custom order-by will be used:
the event time should take higher priority and the custom order-by lower
priority. When both event time and a custom order-by are used, incoming events
are first ordered by event time. When a watermark arrives, the events before
the watermark that share the same event time are then ordered by the custom
order-by before being emitted (please refer to
[DataStreamSort.scala|https://github.com/apache/flink/blob/b8c8f204de718e6d5b7c3df837deafaed7c375f5/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala]
 for more details).

Thoughts?




was (Author: dian.fu):
{quote}
Could you explain a bit why this is needed? 
{quote}
As we need to support clauses such as
{code}
SELECT *
FROM Ticker MATCH_RECOGNIZE (
 PARTITION BY symbol
 ORDER BY tstamp, price
 MEASURES  STRT.tstamp AS start_tstamp,
   LAST(DOWN.tstamp) AS bottom_tstamp,
   LAST(UP.tstamp) AS end_tstamp
 ONE ROW PER MATCH
 AFTER MATCH SKIP TO LAST UP
 PATTERN (STRT DOWN+ UP+)
 DEFINE
DOWN AS DOWN.price < PREV(DOWN.price),
UP AS UP.price > PREV(UP.price)
 ) MR
{code}
There may be multiple columns to order by.

{quote}
I can't see a way to sort an unbounded stream of data  Could you elaborate a 
bit how do you see it working?
how this is going to play well with the Time semantics.
When both event-time and a custom order-by is used, who is going to win?
{quote}
This works in the same way as the implementation of {{sort by}} in the Table
API. That is to say, both the event time and the custom order-by will be used:
the event time should take higher priority and the custom order-by lower
priority (please refer to
[DataStreamSort.scala|https://github.com/apache/flink/blob/b8c8f204de718e6d5b7c3df837deafaed7c375f5/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamSort.scala]
 for more details).

Thoughts?


