Update: Here's a link to the documented eviction behavior: https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior)
On Thu, Oct 4, 2018 at 11:12 AM John Roesler <j...@confluent.io> wrote: > Hello again, all, > > During review, we realized that there is a relationship between this > (KIP-328) and KIP-372. > > KIP-372 proposed to allow naming *all* internal topics, and KIP-328 adds a > new internal topic (the changelog for the suppression buffer). > > However, we didn't consider this relationship in either KIP discussion, > possibly since they were discussed and accepted concurrently. > > I have updated KIP-328 to effectively "merge" the two KIPs by adding a > `withName` builder to Suppressed in the style of the other builders added > in KIP-372: > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19 > . > > I think this should be uncontroversial, but as always, let me know of any > objections you may have. > > > Also, note that I'll be updating the KIP to document the exact buffer > eviction behavior. I previously treated this as an internal implementation > detail, but after consideration, I think users would want to know the > eviction semantics, especially if they are debugging their applications and > scrutinizing the sequence of emitted records. > > Thanks, > -John > > On Thu, Sep 20, 2018 at 5:34 PM John Roesler <j...@confluent.io> wrote: > >> Hello all, >> >> During review of https://github.com/apache/kafka/pull/5567 for KIP-328, >> the reviewers raised many good suggestions for the API. >> >> The basic design of the suppress operation remains the same, but the >> config object is (in my opinion) far more ergonomic with their >> suggestions. >> >> I have updated the KIP to reflect the new config ( >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator >> ) >> >> Please let me know if anyone wishes to change their vote, and we call for >> a recast. >> >> Thanks, >> -John >> >> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax <matth...@confluent.io> >> wrote: >> >>> It seems nobody has any objections against the change. >>> >>> That's for the KIP improvement. I'll go ahead and merge the PR. >>> >>> >>> -Matthias >>> >>> On 8/21/18 2:44 PM, John Roesler wrote: >>> > Hello again, all, >>> > >>> > I belatedly had a better idea for adding grace period to the Windows >>> class >>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of >>> > providing the grace-setter in the abstract class and having to retract >>> it >>> > in UnlimitedWindows, I've made the getter abstract method in Windows >>> and >>> > only added setters to Time and Join windows. >>> > >>> > This should not only improve the ergonomics of grace period, but make >>> the >>> > whole class hierarchy more maintainable. >>> > >>> > See the PR for more details: https://github.com/apache/kafka/pull/5536 >>> > >>> > I've updated the KIP accordingly. Here's the diff: >>> > >>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9 >>> > >>> > Please let me know if this changes your vote. >>> > >>> > Thanks, >>> > -John >>> > >>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <j...@confluent.io> >>> wrote: >>> > >>> >> Hey all, >>> >> >>> >> I just wanted to let you know that a few small issues surfaced during >>> >> implementation and review. I've updated the KIP. Here's the diff: >>> >> >>> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8 >>> >> >>> >> Basically: >>> >> * the metrics named "*-event-*" are inconsistent with existing >>> >> nomenclature, and will be "*-record-*" instead (late records instead >>> of >>> >> late events, for example) >>> >> * the apis taking and returning Duration will use long millis >>> instead. We >>> >> do want to transition to Duration in the future, but we shouldn't do >>> it >>> >> piecemeal. >>> >> >>> >> Thanks, >>> >> -John >>> >> >>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <j...@confluent.io> >>> wrote: >>> >> >>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes (Guozhang, >>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me). >>> >>> >>> >>> Thanks for your time, >>> >>> -John >>> >>> >>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax < >>> matth...@confluent.io> >>> >>> wrote: >>> >>> >>> >>>> +1 (binding) >>> >>>> >>> >>>> Thanks for the KIP. >>> >>>> >>> >>>> >>> >>>> -Matthias >>> >>>> >>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote: >>> >>>>> Thanks John! +1 >>> >>>>> >>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <wangg...@gmail.com> >>> wrote: >>> >>>>> >>> >>>>>> Yes, the addendum lgtm as well. Thanks! >>> >>>>>> >>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler <j...@confluent.io> >>> >>>> wrote: >>> >>>>>> >>> >>>>>>> Another thing that came up after I started working on an >>> >>>> implementation >>> >>>>>> is >>> >>>>>>> that in addition to deprecating "retention" from the Windows >>> >>>> interface, >>> >>>>>> we >>> >>>>>>> also need to deprecate "segmentInterval", for the same reasons. I >>> >>>> simply >>> >>>>>>> overlooked it previously. I've updated the KIP accordingly. >>> >>>>>>> >>> >>>>>>> Hopefully, this doesn't change anyone's vote. >>> >>>>>>> >>> >>>>>>> Thanks, >>> >>>>>>> -John >>> >>>>>>> >>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler <j...@confluent.io> >>> >>>> wrote: >>> >>>>>>> >>> >>>>>>>> Thanks Guozhang, >>> >>>>>>>> >>> >>>>>>>> Thanks for that catch. to clarify, currently, events are "late" >>> only >>> >>>>>> when >>> >>>>>>>> they are older than the retention period. Currently, we detect >>> this >>> >>>> in >>> >>>>>>> the >>> >>>>>>>> processor and record it as a "skipped-record". We then do not >>> >>>> attempt >>> >>>>>> to >>> >>>>>>>> store the event in the window store. If a user provided a >>> >>>>>> pre-configured >>> >>>>>>>> window store with a retention period smaller than the one they >>> >>>> specify >>> >>>>>>> via >>> >>>>>>>> Windows#until, the segmented store will drop the update with no >>> >>>> metric >>> >>>>>>> and >>> >>>>>>>> record a debug-level log. >>> >>>>>>>> >>> >>>>>>>> With KIP-328, with the introduction of "grace period" and moving >>> >>>>>>> retention >>> >>>>>>>> fully into the state store, we need to have metrics for both >>> "late >>> >>>>>>> events" >>> >>>>>>>> (new records older than the grace period) and "expired window >>> >>>> events" >>> >>>>>>> (new >>> >>>>>>>> records for windows that are no longer retained in the state >>> >>>> store). I >>> >>>>>>>> already proposed metrics for the late events, and I've just >>> updated >>> >>>> the >>> >>>>>>> KIP >>> >>>>>>>> with metrics for the expired window events. I also updated the >>> KIP >>> >>>> to >>> >>>>>>> make >>> >>>>>>>> it clear that neither late nor expired events will count as >>> >>>>>>>> "skipped-records" any more. >>> >>>>>>>> >>> >>>>>>>> -John >>> >>>>>>>> >>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang < >>> wangg...@gmail.com> >>> >>>>>>> wrote: >>> >>>>>>>> >>> >>>>>>>>> Hi John, >>> >>>>>>>>> >>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor >>> suggestion: >>> >>>>>>>>> >>> >>>>>>>>> Following your suggestion of the differentiation of >>> >>>> `skipped-records` >>> >>>>>>> v.s. >>> >>>>>>>>> `late-event-drop`, we should probably consider moving the >>> scenarios >>> >>>>>>> where >>> >>>>>>>>> records got ignored due the window not being available any >>> more in >>> >>>>>>>>> windowed >>> >>>>>>>>> aggregation operators from the `skipped-records` metrics >>> recording >>> >>>> to >>> >>>>>>> the >>> >>>>>>>>> `late-event-drop` metrics recording. >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> Guozhang >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck < >>> bbej...@gmail.com> >>> >>>>>> wrote: >>> >>>>>>>>> >>> >>>>>>>>>> Thanks for the KIP! >>> >>>>>>>>>> >>> >>>>>>>>>> +1 >>> >>>>>>>>>> >>> >>>>>>>>>> -Bill >>> >>>>>>>>>> >>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu <yuzhih...@gmail.com> >>> >>>> wrote: >>> >>>>>>>>>> >>> >>>>>>>>>>> +1 >>> >>>>>>>>>>> >>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler < >>> j...@confluent.io >>> >>>>> >>> >>>>>>>>> wrote: >>> >>>>>>>>>>> >>> >>>>>>>>>>>> Hello devs, >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no new >>> >>>>>> comments, >>> >>>>>>>>> so I >>> >>>>>>>>>>> am >>> >>>>>>>>>>>> calling for a vote! >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> Here's the KIP: >>> https://cwiki.apache.org/confluence/x/sQU0BQ >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> The basic idea is to provide: >>> >>>>>>>>>>>> * more usable control over update rate (vs the current state >>> >>>>>> store >>> >>>>>>>>>>> caches) >>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature which >>> >>>>>>> several >>> >>>>>>>>>> people >>> >>>>>>>>>>>> have requested >>> >>>>>>>>>>>> >>> >>>>>>>>>>>> Thanks, >>> >>>>>>>>>>>> -John >>> >>>>>>>>>>>> >>> >>>>>>>>>>> >>> >>>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> >>> >>>>>>>>> -- >>> >>>>>>>>> -- Guozhang >>> >>>>>>>>> >>> >>>>>>>> >>> >>>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> >>> >>>>>> -- >>> >>>>>> -- Guozhang >>> >>>>>> >>> >>>>> >>> >>>> >>> >>>> >>> > >>> >>>