[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143367#comment-16143367 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r135441351 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -421,6 +437,15 @@ private void addStopStateToLooping(final State loopingState) { untilCondition, true); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY) && + times.getFrom() != times.getTo()) { + if (untilCondition != null) { + State sinkStateCopy = copy(sinkState); + originalStateMap.put(sinkState.getName(), sinkStateCopy); --- End diff -- originalStateMap is used when compiling the NFA and it will be collected after NFA is created and so I think it's unnecessary to clear the entries. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143360#comment-16143360 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r135440842 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -526,18 +551,32 @@ private boolean isPatternOptional(Patternpattern) { return createGroupPatternState((GroupPattern) currentPattern, sinkState, proceedState, isOptional); } - final IterativeCondition trueFunction = getTrueFunction(); - final State singletonState = createState(currentPattern.getName(), State.StateType.Normal); // if event is accepted then all notPatterns previous to the optional states are no longer valid final State sink = copyWithoutTransitiveNots(sinkState); singletonState.addTake(sink, takeCondition); + // if no element accepted the previous nots are still valid. + final IterativeCondition proceedCondition = getTrueFunction(); + // for the first state of a group pattern, its PROCEED edge should point to the following state of // that group pattern and the edge will be added at the end of creating the NFA for that group pattern if (isOptional && !headOfGroup(currentPattern)) { - // if no element accepted the previous nots are still valid. - singletonState.addProceed(proceedState, trueFunction); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + final IterativeCondition untilCondition = + (IterativeCondition) currentPattern.getUntilCondition(); + if (untilCondition != null) { + singletonState.addProceed( + originalStateMap.get(proceedState.getName()), + new AndCondition<>(proceedCondition, untilCondition)); --- End diff -- When untilCondition holds, the loop should break and the state should proceed to the next state. This is covered by the test case GreedyITCase#testGreedyUntilWithDummyEventsBeforeQuantifier. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142374#comment-16142374 ] ASF GitHub Bot commented on FLINK-7147: --- Github user tedyu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r135367240 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -526,18 +551,32 @@ private boolean isPatternOptional(Patternpattern) { return createGroupPatternState((GroupPattern) currentPattern, sinkState, proceedState, isOptional); } - final IterativeCondition trueFunction = getTrueFunction(); - final State singletonState = createState(currentPattern.getName(), State.StateType.Normal); // if event is accepted then all notPatterns previous to the optional states are no longer valid final State sink = copyWithoutTransitiveNots(sinkState); singletonState.addTake(sink, takeCondition); + // if no element accepted the previous nots are still valid. + final IterativeCondition proceedCondition = getTrueFunction(); + // for the first state of a group pattern, its PROCEED edge should point to the following state of // that group pattern and the edge will be added at the end of creating the NFA for that group pattern if (isOptional && !headOfGroup(currentPattern)) { - // if no element accepted the previous nots are still valid. - singletonState.addProceed(proceedState, trueFunction); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + final IterativeCondition untilCondition = + (IterativeCondition) currentPattern.getUntilCondition(); + if (untilCondition != null) { + singletonState.addProceed( + originalStateMap.get(proceedState.getName()), + new AndCondition<>(proceedCondition, untilCondition)); --- End diff -- Why is this not wrapped with NotCondition ? > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142367#comment-16142367 ] ASF GitHub Bot commented on FLINK-7147: --- Github user tedyu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r135366609 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -526,18 +551,32 @@ private boolean isPatternOptional(Patternpattern) { return createGroupPatternState((GroupPattern) currentPattern, sinkState, proceedState, isOptional); } - final IterativeCondition trueFunction = getTrueFunction(); - final State singletonState = createState(currentPattern.getName(), State.StateType.Normal); // if event is accepted then all notPatterns previous to the optional states are no longer valid final State sink = copyWithoutTransitiveNots(sinkState); singletonState.addTake(sink, takeCondition); + // if no element accepted the previous nots are still valid. + final IterativeCondition proceedCondition = getTrueFunction(); + // for the first state of a group pattern, its PROCEED edge should point to the following state of // that group pattern and the edge will be added at the end of creating the NFA for that group pattern if (isOptional && !headOfGroup(currentPattern)) { - // if no element accepted the previous nots are still valid. - singletonState.addProceed(proceedState, trueFunction); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + final IterativeCondition untilCondition = + (IterativeCondition) currentPattern.getUntilCondition(); + if (untilCondition != null) { + singletonState.addProceed( + originalStateMap.get(proceedState.getName()), + new AndCondition<>(proceedCondition, untilCondition)); + } + singletonState.addProceed(proceedState, + untilCondition != null --- End diff -- Redundant check - see line 568 > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142361#comment-16142361 ] ASF GitHub Bot commented on FLINK-7147: --- Github user tedyu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r135366122 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -421,6 +437,15 @@ private void addStopStateToLooping(final State loopingState) { untilCondition, true); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY) && + times.getFrom() != times.getTo()) { + if (untilCondition != null) { + State sinkStateCopy = copy(sinkState); + originalStateMap.put(sinkState.getName(), sinkStateCopy); --- End diff -- When are the old entries cleared in this map ? Shall we consider using map which expires entries by TTL ? > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139688#comment-16139688 ] ASF GitHub Bot commented on FLINK-7147: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/4296 > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139653#comment-16139653 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on the issue: https://github.com/apache/flink/pull/4296 Sure. Have created FLINK-7496 to track this issue. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139644#comment-16139644 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/4296 merging this. @dianfu could you create a JIRA for the `Optional` after `greedy` to track interest in that case. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16125337#comment-16125337 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/4296 +1, LGTM > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121564#comment-16121564 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on the issue: https://github.com/apache/flink/pull/4296 @dawidwys Any comments? > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117808#comment-16117808 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131804932 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117806#comment-16117806 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131804398 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -386,6 +387,19 @@ private void checkPatternNameUniqueness(final Pattern pattern) { return copyOfSink; } + private State copy(final State state) { + final State copyOfState = new State<>(state.getName(), state.getStateType()); --- End diff -- Updated. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117810#comment-16117810 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131804902 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117807#comment-16117807 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131804887 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117809#comment-16117809 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131804949 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117811#comment-16117811 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131804764 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -637,9 +675,23 @@ private boolean isPatternOptional(Patternpattern) { untilCondition, true); - final IterativeCondition proceedCondition = getTrueFunction(); + IterativeCondition proceedCondition = getTrueFunction(); final State loopingState = createState(currentPattern.getName(), State.StateType.Normal); - loopingState.addProceed(sinkState, proceedCondition); + + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + State sinkStateCopy = copy(sinkState); --- End diff -- Make sense. Updated. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116461#comment-16116461 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/4296 There is also one more problem. When we have optional after `greedy` it does not work well. E.g. have a look at this test case: @Test public void testGreedyZeroOrMoreBeforeOptional2() { ListinputEvents = new ArrayList<>(); Event c = new Event(40, "c", 1.0); Event a1 = new Event(41, "a", 2.0); Event a2 = new Event(42, "a", 2.0); Event d = new Event(43, "d", 3.0); Event a3 = new Event(42, "a", 2.0); Event e = new Event(44, "e", 3.0); inputEvents.add(new StreamRecord<>(c, 1)); inputEvents.add(new StreamRecord<>(a1, 2)); inputEvents.add(new StreamRecord<>(a2, 3)); inputEvents.add(new StreamRecord<>(d, 4)); inputEvents.add(new StreamRecord<>(a3, 5)); inputEvents.add(new StreamRecord<>(e, 6)); // c a* d e Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("c"); } }).followedBy("middle1").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("a"); } }).oneOrMore().optional().greedy().followedBy("middle2").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("d"); } }).optional().followedBy("end").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("e"); } }); NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); final List resultingPatterns = feedNFA(inputEvents, nfa); compareMaps(resultingPatterns, Lists.
newArrayList( Lists.newArrayList(c, a1, a2, a3, e), Lists.newArrayList(c, a1, a2, d, e) )); } Right know it also returns `c a1 a2 e`, which I think is not correct. I don't think there is an easy way to fix it right now. I would suggest restricting on the Pattern level that greedy must not be followed by an `Optional` patten. I would like to hear opinions on that, @kl0u. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116443#comment-16116443 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131629322 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116444#comment-16116444 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131629581 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116442#comment-16116442 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131629357 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116445#comment-16116445 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131629516 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,1068 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyZeroOrMore() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyZeroOrMoreConsecutiveEndWithOptional() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); +
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116437#comment-16116437 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131628265 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -386,6 +387,19 @@ private void checkPatternNameUniqueness(final Pattern pattern) { return copyOfSink; } + private State copy(final State state) { + final State copyOfState = new State<>(state.getName(), state.getStateType()); --- End diff -- The copy should be made with `createState` method which does ensure there is a unique name assigned, which is neccessary for the serialization. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116435#comment-16116435 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r131628087 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -637,9 +675,23 @@ private boolean isPatternOptional(Patternpattern) { untilCondition, true); - final IterativeCondition proceedCondition = getTrueFunction(); + IterativeCondition proceedCondition = getTrueFunction(); final State loopingState = createState(currentPattern.getName(), State.StateType.Normal); - loopingState.addProceed(sinkState, proceedCondition); + + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + State sinkStateCopy = copy(sinkState); --- End diff -- If I understood correctly the copies are needed only in casese where there is the `untilCondition`. Am I right? If so let's create the copy then. Right know there are dangling copies of the next when there is no `untilCondtion`. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113946#comment-16113946 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on the issue: https://github.com/apache/flink/pull/4296 @dawidwys Regarding to the times().greedy(), the result is not expected and have fixed the issue in the latest PR. Also updated the doc. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112698#comment-16112698 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/4296 I think we are getting close to the final version. Good job! Still though have two higher level comments: - is current times().greedy() behaviour intended? For pattern `a{2, 5} b` and sequence `a1 a2 a3 a4 b` shouldn't it return just one match: `a1 a2 a3 a4 b` Instead of three as in `testGreedyTimesRange` - this feature should be documented in the docs. It should be also very clearly stated there that it does not work for `GroupPatterns` > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110954#comment-16110954 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r130883414 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -657,25 +663,34 @@ private boolean isPatternOptional(Patternpattern) { true); IterativeCondition proceedCondition = getTrueFunction(); - if (currentPattern.getQuantifier().isGreedy()) { - proceedCondition = getGreedyCondition(proceedCondition, Lists.newArrayList(takeCondition)); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + proceedCondition = getGreedyCondition( + proceedCondition, + takeCondition, + ignoreCondition, + followingTakeCondition);; } final State loopingState = createState(currentPattern.getName(), State.StateType.Normal, - currentPattern.getQuantifier().isGreedy()); + currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)); loopingState.addProceed(sinkState, proceedCondition); loopingState.addTake(takeCondition); addStopStateToLooping(loopingState); if (ignoreCondition != null) { final State ignoreState = createState(currentPattern.getName(), State.StateType.Normal, - currentPattern.getQuantifier().isGreedy()); + currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)); ignoreState.addTake(loopingState, takeCondition); ignoreState.addIgnore(ignoreCondition); loopingState.addIgnore(ignoreState, ignoreCondition); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { --- End diff -- @dawidwys Good advice. Thanks a lot :). I have updated the PR per the solution you suggested. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108520#comment-16108520 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r130537682 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/nfa/compiler/NFACompiler.java --- @@ -657,25 +663,34 @@ private boolean isPatternOptional(Patternpattern) { true); IterativeCondition proceedCondition = getTrueFunction(); - if (currentPattern.getQuantifier().isGreedy()) { - proceedCondition = getGreedyCondition(proceedCondition, Lists.newArrayList(takeCondition)); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { + proceedCondition = getGreedyCondition( + proceedCondition, + takeCondition, + ignoreCondition, + followingTakeCondition);; } final State loopingState = createState(currentPattern.getName(), State.StateType.Normal, - currentPattern.getQuantifier().isGreedy()); + currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)); loopingState.addProceed(sinkState, proceedCondition); loopingState.addTake(takeCondition); addStopStateToLooping(loopingState); if (ignoreCondition != null) { final State ignoreState = createState(currentPattern.getName(), State.StateType.Normal, - currentPattern.getQuantifier().isGreedy()); + currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)); ignoreState.addTake(loopingState, takeCondition); ignoreState.addIgnore(ignoreCondition); loopingState.addIgnore(ignoreState, ignoreCondition); + if (currentPattern.getQuantifier().hasProperty(Quantifier.QuantifierProperty.GREEDY)) { --- End diff -- I think the problem you mentioned with `GreedyITCase.testGreedyZeroOrMoreBeforeOptional2` is due to that proceed. The whole reason behind the additional ignoreState was for it not to have the `PROCEED` transition as it creates additional computationStates with the `sinkState`. I think for the greedy to work we need to put customized ignoreCondition into the `sinkState` of a greedy Pattern. What do you think? > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103180#comment-16103180 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on the issue: https://github.com/apache/flink/pull/4296 @dawidwys Thanks a lot for the review. I have updated the patch. Currently, there is something wrong when the greedy state is followed by an optional state. This can be covered by test case GreedyITCase.testGreedyZeroOrMoreBeforeOptional2 (duplicate results will be got). Solutions from my mind are removing the duplicate results before returning the results in NFA or disabling this case for the time being. What's your thought? Do you have any suggestions to this problem? > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103173#comment-16103173 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r129832284 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,290 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyFollowedBy() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyUtil() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 3.0); + Event a3 = new Event(43, "a", 4.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103164#comment-16103164 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r129830324 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/pattern/Quantifier.java --- @@ -105,6 +107,14 @@ public void optional() { properties.add(Quantifier.QuantifierProperty.OPTIONAL); } + public void greedy() { + greedy = true; --- End diff -- Make sense. Updated. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103165#comment-16103165 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r129830367 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/pattern/Pattern.java --- @@ -492,4 +506,10 @@ private void checkIfQuantifierApplied() { "Current quantifier is: " + quantifier); } } + + private void checkIfNoFollowedByAny() { + if (quantifier.getConsumingStrategy() == ConsumingStrategy.SKIP_TILL_ANY) { --- End diff -- Make sense. Updated. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099835#comment-16099835 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/4296 I've started reviewing it, but realised it is working as I expected only in case where the inner consuming strategy is `STRICT`. Let's have a look at test like this one: @Test public void testGreedyFollowedByInBetween() { ListinputEvents = new ArrayList<>(); Event c = new Event(40, "c", 1.0); Event a1 = new Event(41, "a", 2.0); Event a2 = new Event(42, "a", 2.0); Event a3 = new Event(43, "a", 2.0); Event d = new Event(44, "d", 3.0); inputEvents.add(new StreamRecord<>(c, 1)); inputEvents.add(new StreamRecord<>(new Event(1, "dummy", ), 2)); inputEvents.add(new StreamRecord<>(a1, 3)); inputEvents.add(new StreamRecord<>(new Event(1, "dummy", ), 4)); inputEvents.add(new StreamRecord<>(a2, 5)); inputEvents.add(new StreamRecord<>(new Event(1, "dummy", ), 6)); inputEvents.add(new StreamRecord<>(a3, 7)); inputEvents.add(new StreamRecord<>(d, 8)); // c a* d Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("c"); } }).followedBy("middle").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("a"); } }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { private static final long serialVersionUID = 5726188262756267490L; @Override public boolean filter(Event value) throws Exception { return value.getName().equals("d"); } }); NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); final List resultingPatterns = feedNFA(inputEvents, nfa); compareMaps(resultingPatterns, Lists.
newArrayList( Lists.newArrayList(c, a1, a2, a3, d) )); } I would expect only that one result but I get: Lists.newArrayList(c, a1, a2, a3, d), Lists.newArrayList(c, a1, a2, d), Lists.newArrayList(c, a1, d), Lists.newArrayList(c, d) Which is the same with or without the `greedy()` applied. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099829#comment-16099829 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r129262270 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/pattern/Pattern.java --- @@ -492,4 +506,10 @@ private void checkIfQuantifierApplied() { "Current quantifier is: " + quantifier); } } + + private void checkIfNoFollowedByAny() { + if (quantifier.getConsumingStrategy() == ConsumingStrategy.SKIP_TILL_ANY) { --- End diff -- I think it is a valid combination. I think the InnerConsumingStrategy should not be `SKIP_TILL_ANY`. So `followedByAny("loop").oneOrMore().greedy()` in my opinion is a valid one, but `followedByAny("loop").oneOrMore().allowCombinations().greedy()` is not. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099831#comment-16099831 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r129263762 --- Diff: flink-libraries/flink-cep/src/test/java/org/apache/flink/cep/nfa/GreedyITCase.java --- @@ -0,0 +1,290 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.cep.nfa; + +import org.apache.flink.cep.Event; +import org.apache.flink.cep.nfa.compiler.NFACompiler; +import org.apache.flink.cep.pattern.Pattern; +import org.apache.flink.cep.pattern.conditions.SimpleCondition; +import org.apache.flink.streaming.runtime.streamrecord.StreamRecord; +import org.apache.flink.util.TestLogger; + +import com.google.common.collect.Lists; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.List; + +import static org.apache.flink.cep.nfa.NFATestUtilities.compareMaps; +import static org.apache.flink.cep.nfa.NFATestUtilities.feedNFA; + +/** + * IT tests covering {@link Pattern#greedy()}. + */ +public class GreedyITCase extends TestLogger { + + @Test + public void testGreedyFollowedBy() { + ListinputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 2.0); + Event a3 = new Event(43, "a", 2.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new StreamRecord<>(a1, 2)); + inputEvents.add(new StreamRecord<>(a2, 3)); + inputEvents.add(new StreamRecord<>(a3, 4)); + inputEvents.add(new StreamRecord<>(d, 5)); + + // c a* d + Pattern pattern = Pattern.begin("start").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("c"); + } + }).followedBy("middle").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("a"); + } + }).oneOrMore().optional().greedy().followedBy("end").where(new SimpleCondition() { + private static final long serialVersionUID = 5726188262756267490L; + + @Override + public boolean filter(Event value) throws Exception { + return value.getName().equals("d"); + } + }); + + NFA nfa = NFACompiler.compile(pattern, Event.createTypeSerializer(), false); + + final List resultingPatterns = feedNFA(inputEvents, nfa); + + compareMaps(resultingPatterns, Lists.
newArrayList( + Lists.newArrayList(c, a1, a2, a3, d) + )); + } + + @Test + public void testGreedyUtil() { + List
inputEvents = new ArrayList<>(); + + Event c = new Event(40, "c", 1.0); + Event a1 = new Event(41, "a", 2.0); + Event a2 = new Event(42, "a", 3.0); + Event a3 = new Event(43, "a", 4.0); + Event d = new Event(44, "d", 3.0); + + inputEvents.add(new StreamRecord<>(c, 1)); + inputEvents.add(new
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16099830#comment-16099830 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on a diff in the pull request: https://github.com/apache/flink/pull/4296#discussion_r129261577 --- Diff: flink-libraries/flink-cep/src/main/java/org/apache/flink/cep/pattern/Quantifier.java --- @@ -105,6 +107,14 @@ public void optional() { properties.add(Quantifier.QuantifierProperty.OPTIONAL); } + public void greedy() { + greedy = true; --- End diff -- How about applying greedy to SINGLETON state? I think it is illegal combination. Also maybe change GREEDY into `QuantifierProperty`? It will decrease the number of methods/fields. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085582#comment-16085582 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on the issue: https://github.com/apache/flink/pull/4296 @dawidwys OK. Have a good time. :) > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085458#comment-16085458 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dawidwys commented on the issue: https://github.com/apache/flink/pull/4296 @dianfu Sorry for the delay, but unfortunately I will not have enough time for a proper review before my vacation. I will get back to it after 24.07. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085449#comment-16085449 ] ASF GitHub Bot commented on FLINK-7147: --- Github user dianfu commented on the issue: https://github.com/apache/flink/pull/4296 @dawidwys Could you help to take a look at this PR? Thanks a lot in advance. > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-7147) Support greedy quantifier in CEP
[ https://issues.apache.org/jira/browse/FLINK-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081841#comment-16081841 ] ASF GitHub Bot commented on FLINK-7147: --- GitHub user dianfu opened a pull request: https://github.com/apache/flink/pull/4296 [FLINK-7147] [cep] Support greedy quantifier in CEP Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes. - [ ] General - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text") - The pull request addresses only one issue - Each commit in the PR has a meaningful commit message (including the JIRA id) - [ ] Documentation - Documentation has been added for new functionality - Old documentation affected by the pull request has been updated - JavaDoc for public methods has been added - [ ] Tests & Build - Functionality added by the pull request is covered by tests - `mvn clean verify` has been executed successfully locally or a Travis build has passed You can merge this pull request into a Git repository by running: $ git pull https://github.com/dianfu/flink greedy_quantifier Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4296.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4296 commit 647d4c58309990d9fc924bc4343ef811dacf4ef1 Author: Dian FuDate: 2017-07-11T08:03:42Z [FLINK-7147] [cep] Support greedy quantifier in CEP > Support greedy quantifier in CEP > > > Key: FLINK-7147 > URL: https://issues.apache.org/jira/browse/FLINK-7147 > Project: Flink > Issue Type: Sub-task > Components: CEP, Table API & SQL >Reporter: Dian Fu >Assignee: Dian Fu > > Greedy quantifier will try to match the token as many times as possible. For > example, for pattern {{a b* c}} (skip till next is used) and inputs {{a b1 b2 > c}}, if the quantifier for {{b}} is greedy, it will only output {{a b1 b2 c}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)