[
https://issues.apache.org/jira/browse/FLINK-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104311#comment-16104311
]
ASF GitHub Bot commented on FLINK-7169:
---------------------------------------
Github user yestinchen commented on the issue:
https://github.com/apache/flink/pull/4331
@dawidwys Thanks for the review.
Problem 1 is easy to fix: we can just start a new match process if the only
remaining computation state reaches stopState.
Problem 2 cannot be avoided with the current approach. It's impossible to
know whether there are potential matches.
I think the best way to implement this correctly is to try to start a new
match process after processing each event, and to discard unfinished match
processes after a successful match according to the skip strategy. In order to
do that, we need to keep the logical order of the events, which is the original
idea I proposed.
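To make the discard step concrete, here is a minimal Java sketch. It is not the
code in this PR: `SkipPruningSketch`, `Strategy`, `resumeIndex` and `prune` are
hypothetical names, every event is assumed to carry a logical index, and a
partial match is represented only by the index of the event that started it.
Only the two row-based strategies are covered, and anything that started before
the resume position is simply treated as discardable.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names are hypothetical, not Flink CEP API.
final class SkipPruningSketch {

    enum Strategy { SKIP_TO_NEXT_ROW, SKIP_PAST_LAST_ROW }

    // Logical index at which matching resumes after a match that spans
    // the events firstIndex..lastIndex (inclusive).
    static long resumeIndex(long firstIndex, long lastIndex, Strategy strategy) {
        return strategy == Strategy.SKIP_TO_NEXT_ROW ? firstIndex + 1 : lastIndex + 1;
    }

    // Keeps only the partial matches whose starting event is at or after the
    // resume position; everything that started earlier is discarded.
    static List<Long> prune(List<Long> partialMatchStarts, long resumeIndex) {
        List<Long> kept = new ArrayList<>();
        for (long start : partialMatchStarts) {
            if (start >= resumeIndex) {
                kept.add(start);
            }
        }
        return kept;
    }
}
```

With partial matches started at events 1 to 4 and a match spanning events 1 to
3, SKIP_PAST_LAST_ROW keeps only the partial match started at event 4, while
SKIP_TO_NEXT_ROW keeps those started at events 2, 3 and 4. This comparison only
works if the logical order of the events is preserved.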
As for your general notes, I have some ideas:
1. I agree that Oracle's specification is designed for bounded data.
But match recognition on unbounded data is very similar to bounded data, since
all data is processed one by one, and there's no need for bound
information. As for **_empty match_**, I think we can just use Oracle's
definition.
> Some patterns permit empty matches. For example:
PATTERN (A*)
can be matched by zero or more rows that are mapped to A.
An empty match does not map any rows to primary row pattern variables;
nevertheless, an empty match has a starting row. For example, there can be an
empty match at the first row of a row pattern partition, an empty match at the
second row of a row pattern partition, etc. An empty match is assigned a
sequential match number, based on the ordinal position of its starting row, the
same as any other match.
2. I feel uncomfortable with the RuntimeExceptions too. But these
exceptions are very important to keep the skip semantics right. I understand
your main concern is that exceptions will stop the matching process, which is
unacceptable for an online streaming service. To address this, I think we can
introduce a default strategy (SKIP_TO_NEXT_EVENT, for example). If one of these
exceptions happens, we can use the default strategy to continue the match
process, and change the strategy back after a successful match. We can also add
a switch to let the user decide whether to enable this feature.
3. I still think it's useful to support these skip strategies. I don't know
why Esper does not support them.
4. Thanks for the related information. I took a brief look at the PR, which
is very similar to this PR. I wonder why it was closed without being merged
into master?
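For point 2, the fallback could look roughly like the Java sketch below. It is
only an illustration under stated assumptions: `SkipFallbackSketch`,
`MatchedEvent` and `resumeAtFirst` are hypothetical names, and the failure case
is assumed to be a skip target variable that maps to no event in the match. The
`fallbackEnabled` flag stands in for the user-facing switch.

```java
import java.util.List;

// Illustrative sketch only; all names are hypothetical, not Flink CEP API.
final class SkipFallbackSketch {

    // One event of a completed match: its logical index and the pattern
    // variable it was mapped to.
    static final class MatchedEvent {
        final long index;
        final String variable;

        MatchedEvent(long index, String variable) {
            this.index = index;
            this.variable = variable;
        }
    }

    // Resume position for a "skip to first <variable>" strategy. The match is
    // assumed to be non-empty and ordered by logical index.
    static long resumeAtFirst(List<MatchedEvent> match, String variable, boolean fallbackEnabled) {
        for (MatchedEvent event : match) {
            if (event.variable.equals(variable)) {
                return event.index;
            }
        }
        if (!fallbackEnabled) {
            // Assumed failure case: the skip target maps to no event.
            throw new IllegalStateException("No event mapped to variable " + variable);
        }
        // Fallback for this match only: behave like skip-to-next-event and
        // resume at the event after the first event of the match.
        return match.get(0).index + 1;
    }
}
```

Because the fallback is decided per call, the configured strategy applies again
to the next match without any explicit reset.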
Looking forward to your feedback. Thanks.
> Support AFTER MATCH SKIP function in CEP library API
> ----------------------------------------------------
>
> Key: FLINK-7169
> URL: https://issues.apache.org/jira/browse/FLINK-7169
> Project: Flink
> Issue Type: Sub-task
> Components: CEP
> Reporter: Yueting Chen
> Assignee: Yueting Chen
>
> In order to support Oracle's MATCH_RECOGNIZE on top of the CEP library, we
> need to support AFTER MATCH SKIP function in CEP API.
> There are four options in AFTER MATCH SKIP, listed as follows:
> 1. AFTER MATCH SKIP TO NEXT ROW: resume pattern matching at the row after the
> first row of the current match.
> 2. AFTER MATCH SKIP PAST LAST ROW: resume pattern matching at the next row
> after the last row of the current match.
> 3. AFTER MATCH SKIP TO FIRST *RPV*: resume pattern matching at the first row
> that is mapped to the row pattern variable RPV.
> 4. AFTER MATCH SKIP TO LAST *RPV*: resume pattern matching at the last row
> that is mapped to the row pattern variable RPV.
> I think we can add a new function to the `CEP` class, which takes an
> AfterMatchSkipStrategy as a new parameter.
> The new API may look like this:
> {code}
> public static <T> PatternStream<T> pattern(DataStream<T> input, Pattern<T, ?>
> pattern, AfterMatchSkipStrategy afterMatchSkipStrategy)
> {code}
> We can also make `SKIP TO NEXT ROW` the default option, because that's
> how the CEP library behaves currently.