Thanks Aljioscha In fact the onFire() method proposed in the JIRA is also included in the FLIP. The onCleanup() I agree that it would be a nice addition as it makes the API more complete. Now we have an onMerge(), an onFire() and onCleanup() which allow a trigger to react to every milestone in the lifespan of a window.
Kostas > On Aug 17, 2016, at 5:31 PM, Aljoscha Krettek <aljos...@apache.org> wrote: > > Hi, > I opened this Jira which should help in implementing the Trigger DSL but is > also independent in that it just enhances the range of things that can be > done with a Trigger: > https://issues.apache.org/jira/browse/FLINK-4415 > > Cheers, > Aljoscha > > On Wed, 17 Aug 2016 at 14:38 Jark Wu <wuchong...@alibaba-inc.com> wrote: > >> Hi Aljoscha, Kostas, thanks for your detailed explanation. It makes sense. >> >> According to the discarding and accumulating, the FLIP says “the mode of >> parent trigger overwrites that of its children”. That means Trigger decide >> whether to discard window contents after firing , right ? But I find the >> origin google doc[1] proposed the Trigger only decide whether to fire or >> not while the purging behavior is determined by a setting on >> WindowedStream. Such as : >> >> datastream.keyBy(0) >> .window(windowAssigner) >> .trigger(compositeTrigger) >> .accumulating() >> >> >> [1] >> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u >> < >> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u >>> >> >> - Jark Wu >> >>> 在 2016年8月17日,下午6:12,Aljoscha Krettek <aljos...@apache.org> 写道: >>> >>> Hi, >>> I think that would blow up state since there can be several triggers that >>> need this kind of state, Any and All come to mind, possibly. If each of >>> those keeps state that's at least a byte per trigger. If the finished >> state >>> were kept centrally by the TriggerRunner it would just be one byte for >>> everything, in most cases. >>> >>> As I said, in some cases keeping that extra bit can be avoided. For >>> example, if you have Repeat.forever(Some.trigger()) you know that the >>> finished bit will always be false and so you don't keep any state in the >>> TriggerRunner. If every trigger manually does that bookkeeping you remove >>> that possibility while increasing complexity in each Trigger >> implementation. >>> >>> Cheers, >>> Aljoscha >>> >>> On Wed, 17 Aug 2016 at 12:05 Kostas Kloudas <k.klou...@data-artisans.com >>> >>> wrote: >>> >>>> Hi Aljoscha, >>>> >>>> On the Repeat.? addition, I think that each trigger will have to have >>>> its own implementation, e.g. the CountTrigger should just set a dummy >>>> value in the counter in order to know if it should fire again or not. >>>> >>>> In other case, we will have to add more state and this can lead to >>>> significant >>>> performance degradation, as in most cases this state has to be checked >> on >>>> every element. >>>> >>>> Another potential solution, which I am not sure if it covers all cases, >>>> could >>>> be to have a State abstraction like CompositeState, apart from the >>>> Value, List, Reduce, Fold, which can fetch more than one types of state >>>> with one round trip to the backend. Imagine having the “counter" and the >>>> “canceled” states in the same entry in the backend and always fetch them >>>> together. This can lead to zero additional cost for the extra state. >>>> >>>> What do you think? >>>> >>>> Kostas >>>> >>>>> On Aug 17, 2016, at 11:57 AM, Aljoscha Krettek <aljos...@apache.org> >>>> wrote: >>>>> >>>>> Regarding Repeat.forever() and the default being to not repeat. The >>>> simple >>>>> reason is that Beam (née Google Dataflow) provides basically the same >>>> thing >>>>> with their trigger DSL and that their triggers behave like this. I >> think >>>> it >>>>> would not be beneficial to have the same feature in two systems in that >>>>> space where the behavior is the opposite. That would make it confusing >>>> for >>>>> users. >>>>> >>>>> On the implementation side, I think in most cases you need to have a >> way >>>> of >>>>> telling when triggers are finished or not anyways. There could be a >>>> central >>>>> component in the TriggerRunner that has a finished bit for every >> trigger >>>> in >>>>> the tree. In most cases this would be a simple byte. Triggers could set >>>> and >>>>> query this finished bit. In some cases, where you know that triggers >> can >>>>> never finish you could have a dummy implementation of the finished set >>>> that >>>>> does not store any state and always returns false when queried. >>>>> >>>>> On Wed, 17 Aug 2016 at 11:52 Aljoscha Krettek <aljos...@apache.org> >>>> wrote: >>>>> >>>>>> Kostas already nicely explained this! >>>>>> >>>>>> I just want to give some theoretical background. I see the underlying >>>> idea >>>>>> of triggers similar to predicates, i.e. >>>>>> >>>> >> "EventTimeTrigger.afterEndOfWindow().withEarlyTrigger(earlyFiringTrigger)" >>>>>> translates to a predicate "(E and ET) or WT" (where E is a predicate >>>> that >>>>>> is true when we are in early phase, ET is the early trigger and WT is >>>> the >>>>>> watermark trigger). The other trigger translates to "(!E and LT) or >> WT", >>>>>> i.e. it triggers if we're not early and LT is true or if the watermark >>>>>> trigger is true. If we combine the two we get: >>>>>> >>>>>> ((E and ET) or WT) and ((!E and LT) or WT) >>>>>> >>>>>> now we can eliminate the two parts with E and !E because they can >> never >>>> be >>>>>> true and are in an "or": >>>>>> >>>>>> WT and WT >>>>>> >>>>>> which yield just "WT". >>>>>> >>>>>> Hope that makes sense to you. >>>>>> >>>>>> Cheers, >>>>>> Aljoscha >>>>>> >>>>>> >>>>>> On Wed, 17 Aug 2016 at 10:47 Kostas Kloudas < >>>> k.klou...@data-artisans.com> >>>>>> wrote: >>>>>> >>>>>>> Hello Jark Wu, >>>>>>> >>>>>>> Both of them will work in the new DSL. The idea is that there should >>>> be no >>>>>>> restrictions on the combinations one can do. >>>>>>> >>>>>>> Coming to what does the early and the late trigger do, the early >>>> trigger >>>>>>> will >>>>>>> be responsible for specifying when the trigger should fire in the >>>> period >>>>>>> between >>>>>>> the beginning of the window and the time when the watermark passes >> the >>>> end >>>>>>> of the window. The late trigger takes over after the watermark passes >>>> the >>>>>>> end of >>>>>>> the window, and specifies when the trigger should fire in the period >>>>>>> between the >>>>>>> endOfWindow and endOfWindow + allowedLateness. >>>>>>> >>>>>>> So in the case of the: >>>>>>> All(EventTimeTrigger.afterEndOfWindow() >>>>>>> .withEarlyTrigger(earlyFiringTrigger), >>>>>>> EventTimeTrigger.afterEndOfWindow() >>>>>>> .withLateTrigger(lateFiringTrigger)) >>>>>>> >>>>>>> The trigger will only fire at the end of the window, as this is the >>>> only >>>>>>> time both >>>>>>> triggers will say FIRE. >>>>>>> >>>>>>> Although the above will work, the example that you gave is a nice one >>>> as >>>>>>> it >>>>>>> degenerates to an: >>>>>>> >>>>>>> EventTimeTrigger.afterEndOfWindow() >>>>>>> >>>>>>> Detecting this and giving the simplest trigger for the job can lead >> to >>>>>>> further >>>>>>> optimizations, as it can for example reduce the amount of state the >>>>>>> trigger has to keep. >>>>>>> >>>>>>> That would actually be a very nice addition to have as in some cases >> it >>>>>>> can lead >>>>>>> to performance improvements. >>>>>>> >>>>>>> Thanks for the feedback! >>>>>>> >>>>>>> Kostas >>>>>>> >>>>>>>> On Aug 17, 2016, at 4:36 AM, Jark Wu <wuchong...@alibaba-inc.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> It’s a cool design, I really like it ! I have two questions here. >>>>>>>> >>>>>>>> The first is whether do we have the complex composite triggers, i.e. >>>>>>> nested All and Any. Such as : >>>>>>>> >>>>>>>> Any( >>>>>>>> All(trigger1, trigger2), >>>>>>>> Any(trigger3, trigger4) >>>>>>>> ) >>>>>>>> >>>>>>>> Can the above code work? >>>>>>>> >>>>>>>> Another question is : In composite triggers, what’s the behavior of >>>>>>> withEarlyTrigger and withLateTrigger ? For example, >>>>>>>> >>>>>>>> All(EventTimeTrigger.afterEndOfWindow() >>>>>>>> .withEarlyTrigger(earlyFiringTrigger), >>>>>>>> EventTimeTrigger.afterEndOfWindow() >>>>>>>> .withLateTrigger(lateFiringTrigger)) >>>>>>>> >>>>>>>> Is it legal? Will the earlyFiringTrigger and lateFiringTrigger both >>>>>>> work ? >>>>>>>> >>>>>>>> >>>>>>>> - Jark Wu >>>>>>>> >>>>>>>>> 在 2016年8月17日,上午12:24,Kostas Kloudas <k.klou...@data-artisans.com> >>>> 写道: >>>>>>>>> >>>>>>>>> Hi Aljoscha, >>>>>>>>> >>>>>>>>> Thanks for the feedback! >>>>>>>>> >>>>>>>>> It is a nice feature to have. The reason it is not included in the >>>> FLIP >>>>>>>>> is that I have not seen somebody asking for something similar in >> the >>>>>>>>> mailing list. >>>>>>>>> >>>>>>>>> A point that I have to add is that it seems (from the user ML) that >>>>>>>>> most of the times users expect the “Repeated.forever” behavior to >>>>>>>>> be the default. >>>>>>>>> >>>>>>>>> Given this, I would say that we should make this the default and >>>>>>>>> add something like “Repeat.Once” option which will just let the >>>> trigger >>>>>>>>> fire once, e.g. the first time the counter reaches 5 in your >> example, >>>>>>>>> and then stop. >>>>>>>>> >>>>>>>>> In other case, the trigger specification may become too verbose, >>>>>>>>> as the user will have to write the “Repeat.forever” for all child >>>>>>> triggers. >>>>>>>>> >>>>>>>>> What do you think? >>>>>>>>> >>>>>>>>> Kostas >>>>>>>>> >>>>>>>>>> On Aug 16, 2016, at 4:38 PM, Aljoscha Krettek < >> aljos...@apache.org> >>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Ah, I just read the document again and noticed that it might be >> good >>>>>>> to >>>>>>>>>> differentiate between repeatable triggers and non-repeating >>>> triggers. >>>>>>> I'm >>>>>>>>>> proposing to make most triggers non-repeating with the addition >> of a >>>>>>>>>> trigger that makes other triggers repeatable. >>>>>>>>>> >>>>>>>>>> Example Non-Repeating: >>>>>>>>>> EventTimeTrigger.pastEndOfWindow() >>>>>>>>>> .withEarlyFiring(CountTrigger.of(5)) >>>>>>>>>> >>>>>>>>>> this gives me an early firing once I got 5 elements and then an >>>>>>> on-time >>>>>>>>>> firing once the watermark passes the end of the window. >>>>>>>>>> >>>>>>>>>> Example with Repeating: >>>>>>>>>> EventTimeTrigger.pastEndOfWindow() >>>>>>>>>> .withEarlyFiring(Repeated.forever(CountTrigger.of(5))) >>>>>>>>>> >>>>>>>>>> this gives me early firings whenever I see 5 new elements plus the >>>>>>>>>> watermark firing. >>>>>>>>>> >>>>>>>>>> What do you think? >>>>>>>>>> >>>>>>>>>> On Tue, 16 Aug 2016 at 15:31 Kostas Kloudas < >>>>>>> k.klou...@data-artisans.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Thanks Till! >>>>>>>>>>> >>>>>>>>>>> Kostas >>>>>>>>>>> >>>>>>>>>>>> On Aug 16, 2016, at 3:30 PM, Till Rohrmann < >> trohrm...@apache.org> >>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Cool design doc Klou. It's well described with a lot of >> details. I >>>>>>> like >>>>>>>>>>> it >>>>>>>>>>>> a lot :-) +1 for implementing the trigger DSL. >>>>>>>>>>>> >>>>>>>>>>>> Cheers, >>>>>>>>>>>> Till >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Aug 16, 2016 at 3:18 PM, Kostas Kloudas < >>>>>>>>>>> k.klou...@data-artisans.com >>>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the feedback Ufuk! >>>>>>>>>>>>> I will do that. >>>>>>>>>>>>> >>>>>>>>>>>>>> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <u...@apache.org> >>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hey Kostas! Thanks for sharing the documents. I think it makes >>>>>>> sense >>>>>>>>>>>>>> to merge the two documents by moving the Google doc contents >> to >>>>>>> the >>>>>>>>>>>>>> Wiki. I think they form one unit. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas >>>>>>>>>>>>>> <k.klou...@data-artisans.com> wrote: >>>>>>>>>>>>>>> Hi all! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've created a FLIP for the trigger DSL. This is the triggers >>>>>>>>>>>>>>> that we want Apache Flink to support out-of-the-box. This >>>>>>> proposal >>>>>>>>>>>>>>> builds on various discussions on the mailing list and aims at >>>>>>>>>>>>>>> serving as a base for further ones. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL >>>>>>>>>>>>> < >>>>>>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9:+Trigger+DSL> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> FLIP-9 provides a description of the triggers Flink already >>>>>>> offers, >>>>>>>>>>>>>>> the new that we think should be added, how the APIs could >> look >>>>>>> like, >>>>>>>>>>>>>>> some discussion on the implementation implications and some >>>> ideas >>>>>>>>>>>>>>> on how to implement them. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> There is also a shared document giving a bit more insight on >>>> the >>>>>>>>>>>>> implementation >>>>>>>>>>>>>>> implications. Feel free to read but please keep the >> discussion >>>>>>> in the >>>>>>>>>>>>> mailing list. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/ >>>>>>>>>>>>> 1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing < >>>>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/ >>>>>>>>>>>>> 1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I would like to start working on an the implementation next >>>> week. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Let the discussion begin! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Kostas >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>> >>>> >> >>