Hi John, Thanks for the update, I'm +1 on changes and my +1 vote stands.
-Bill On Fri, Nov 16, 2018 at 4:19 PM John Roesler <j...@confluent.io> wrote: > Hi all, sorry to do this again, but during review of the code to add the > metrics proposed in this KIP, the reviewers and I noticed some > inconsistencies and drawbacks of the metrics I proposed in the KIP. > > Here's the diff: > > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=24&selectedPageVersions=23 > > * The proposed metrics were all INFO level, but they would be updated on > every record, creating a performance concern. If we can refactor the > metrics framework in the future to resolve this concern, we may move the > metrics back to INFO level. > * having separate metrics for memory and disk buffers is unnecessarily > complex. The main utility is determining how close the buffer is to the > configured limit, which makes a single metric more useful. I've combined > them into one "suppression-buffer-size-*" metric. > * The "intermediate-result-suppression-*" metric would be equivalent to the > "process-*" metric which is already available on the ProcessorNode. I've > removed it from the KIP. > * The "suppression-mem-buffer-evict-*" metric had been proposed as a buffer > metric, but it makes more sense as a processor node metric, since its > counterpart is the "process-*" metric. I've replaced it with a processor > node metric, "suppression-emit-*" > > Let me know if you want to recast votes in response to this change. > > -John > > On Thu, Oct 4, 2018 at 11:26 AM John Roesler <j...@confluent.io> wrote: > > > Update: Here's a link to the documented eviction behavior: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-BufferEvictionBehavior(akaSuppressEmitBehavior) > > > > On Thu, Oct 4, 2018 at 11:12 AM John Roesler <j...@confluent.io> wrote: > > > >> Hello again, all, > >> > >> During review, we realized that there is a relationship between this > >> (KIP-328) and KIP-372. > >> > >> KIP-372 proposed to allow naming *all* internal topics, and KIP-328 adds > >> a new internal topic (the changelog for the suppression buffer). > >> > >> However, we didn't consider this relationship in either KIP discussion, > >> possibly since they were discussed and accepted concurrently. > >> > >> I have updated KIP-328 to effectively "merge" the two KIPs by adding a > >> `withName` builder to Suppressed in the style of the other builders > added > >> in KIP-372: > >> > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=20&selectedPageVersions=19 > >> . > >> > >> I think this should be uncontroversial, but as always, let me know of > any > >> objections you may have. > >> > >> > >> Also, note that I'll be updating the KIP to document the exact buffer > >> eviction behavior. I previously treated this as an internal > implementation > >> detail, but after consideration, I think users would want to know the > >> eviction semantics, especially if they are debugging their applications > and > >> scrutinizing the sequence of emitted records. > >> > >> Thanks, > >> -John > >> > >> On Thu, Sep 20, 2018 at 5:34 PM John Roesler <j...@confluent.io> wrote: > >> > >>> Hello all, > >>> > >>> During review of https://github.com/apache/kafka/pull/5567 for > KIP-328, > >>> the reviewers raised many good suggestions for the API. > >>> > >>> The basic design of the suppress operation remains the same, but the > >>> config object is (in my opinion) far more ergonomic with their > >>> suggestions. > >>> > >>> I have updated the KIP to reflect the new config ( > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-NewSuppressOperator > >>> ) > >>> > >>> Please let me know if anyone wishes to change their vote, and we call > >>> for a recast. > >>> > >>> Thanks, > >>> -John > >>> > >>> On Thu, Aug 23, 2018 at 12:54 PM Matthias J. Sax < > matth...@confluent.io> > >>> wrote: > >>> > >>>> It seems nobody has any objections against the change. > >>>> > >>>> That's for the KIP improvement. I'll go ahead and merge the PR. > >>>> > >>>> > >>>> -Matthias > >>>> > >>>> On 8/21/18 2:44 PM, John Roesler wrote: > >>>> > Hello again, all, > >>>> > > >>>> > I belatedly had a better idea for adding grace period to the Windows > >>>> class > >>>> > hierarchy (TimeWindows, UnlimitedWindows, JoinWindows). Instead of > >>>> > providing the grace-setter in the abstract class and having to > >>>> retract it > >>>> > in UnlimitedWindows, I've made the getter abstract method in Windows > >>>> and > >>>> > only added setters to Time and Join windows. > >>>> > > >>>> > This should not only improve the ergonomics of grace period, but > make > >>>> the > >>>> > whole class hierarchy more maintainable. > >>>> > > >>>> > See the PR for more details: > >>>> https://github.com/apache/kafka/pull/5536 > >>>> > > >>>> > I've updated the KIP accordingly. Here's the diff: > >>>> > > >>>> > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=11&selectedPageVersions=9 > >>>> > > >>>> > Please let me know if this changes your vote. > >>>> > > >>>> > Thanks, > >>>> > -John > >>>> > > >>>> > On Mon, Aug 13, 2018 at 5:20 PM John Roesler <j...@confluent.io> > >>>> wrote: > >>>> > > >>>> >> Hey all, > >>>> >> > >>>> >> I just wanted to let you know that a few small issues surfaced > during > >>>> >> implementation and review. I've updated the KIP. Here's the diff: > >>>> >> > >>>> > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409&selectedPageVersions=9&selectedPageVersions=8 > >>>> >> > >>>> >> Basically: > >>>> >> * the metrics named "*-event-*" are inconsistent with existing > >>>> >> nomenclature, and will be "*-record-*" instead (late records > instead > >>>> of > >>>> >> late events, for example) > >>>> >> * the apis taking and returning Duration will use long millis > >>>> instead. We > >>>> >> do want to transition to Duration in the future, but we shouldn't > do > >>>> it > >>>> >> piecemeal. > >>>> >> > >>>> >> Thanks, > >>>> >> -John > >>>> >> > >>>> >> On Tue, Aug 7, 2018 at 12:07 PM John Roesler <j...@confluent.io> > >>>> wrote: > >>>> >> > >>>> >>> Thanks everyone, KIP-328 has passed with 3 binding votes > (Guozhang, > >>>> >>> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me). > >>>> >>> > >>>> >>> Thanks for your time, > >>>> >>> -John > >>>> >>> > >>>> >>> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax < > >>>> matth...@confluent.io> > >>>> >>> wrote: > >>>> >>> > >>>> >>>> +1 (binding) > >>>> >>>> > >>>> >>>> Thanks for the KIP. > >>>> >>>> > >>>> >>>> > >>>> >>>> -Matthias > >>>> >>>> > >>>> >>>> On 8/3/18 12:52 AM, Damian Guy wrote: > >>>> >>>>> Thanks John! +1 > >>>> >>>>> > >>>> >>>>> On Mon, 30 Jul 2018 at 23:58 Guozhang Wang <wangg...@gmail.com> > >>>> wrote: > >>>> >>>>> > >>>> >>>>>> Yes, the addendum lgtm as well. Thanks! > >>>> >>>>>> > >>>> >>>>>> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler < > j...@confluent.io > >>>> > > >>>> >>>> wrote: > >>>> >>>>>> > >>>> >>>>>>> Another thing that came up after I started working on an > >>>> >>>> implementation > >>>> >>>>>> is > >>>> >>>>>>> that in addition to deprecating "retention" from the Windows > >>>> >>>> interface, > >>>> >>>>>> we > >>>> >>>>>>> also need to deprecate "segmentInterval", for the same > reasons. > >>>> I > >>>> >>>> simply > >>>> >>>>>>> overlooked it previously. I've updated the KIP accordingly. > >>>> >>>>>>> > >>>> >>>>>>> Hopefully, this doesn't change anyone's vote. > >>>> >>>>>>> > >>>> >>>>>>> Thanks, > >>>> >>>>>>> -John > >>>> >>>>>>> > >>>> >>>>>>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler < > j...@confluent.io > >>>> > > >>>> >>>> wrote: > >>>> >>>>>>> > >>>> >>>>>>>> Thanks Guozhang, > >>>> >>>>>>>> > >>>> >>>>>>>> Thanks for that catch. to clarify, currently, events are > >>>> "late" only > >>>> >>>>>> when > >>>> >>>>>>>> they are older than the retention period. Currently, we > detect > >>>> this > >>>> >>>> in > >>>> >>>>>>> the > >>>> >>>>>>>> processor and record it as a "skipped-record". We then do not > >>>> >>>> attempt > >>>> >>>>>> to > >>>> >>>>>>>> store the event in the window store. If a user provided a > >>>> >>>>>> pre-configured > >>>> >>>>>>>> window store with a retention period smaller than the one > they > >>>> >>>> specify > >>>> >>>>>>> via > >>>> >>>>>>>> Windows#until, the segmented store will drop the update with > no > >>>> >>>> metric > >>>> >>>>>>> and > >>>> >>>>>>>> record a debug-level log. > >>>> >>>>>>>> > >>>> >>>>>>>> With KIP-328, with the introduction of "grace period" and > >>>> moving > >>>> >>>>>>> retention > >>>> >>>>>>>> fully into the state store, we need to have metrics for both > >>>> "late > >>>> >>>>>>> events" > >>>> >>>>>>>> (new records older than the grace period) and "expired window > >>>> >>>> events" > >>>> >>>>>>> (new > >>>> >>>>>>>> records for windows that are no longer retained in the state > >>>> >>>> store). I > >>>> >>>>>>>> already proposed metrics for the late events, and I've just > >>>> updated > >>>> >>>> the > >>>> >>>>>>> KIP > >>>> >>>>>>>> with metrics for the expired window events. I also updated > the > >>>> KIP > >>>> >>>> to > >>>> >>>>>>> make > >>>> >>>>>>>> it clear that neither late nor expired events will count as > >>>> >>>>>>>> "skipped-records" any more. > >>>> >>>>>>>> > >>>> >>>>>>>> -John > >>>> >>>>>>>> > >>>> >>>>>>>> On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang < > >>>> wangg...@gmail.com> > >>>> >>>>>>> wrote: > >>>> >>>>>>>> > >>>> >>>>>>>>> Hi John, > >>>> >>>>>>>>> > >>>> >>>>>>>>> Thanks for the updated KIP, +1 from me, and one minor > >>>> suggestion: > >>>> >>>>>>>>> > >>>> >>>>>>>>> Following your suggestion of the differentiation of > >>>> >>>> `skipped-records` > >>>> >>>>>>> v.s. > >>>> >>>>>>>>> `late-event-drop`, we should probably consider moving the > >>>> scenarios > >>>> >>>>>>> where > >>>> >>>>>>>>> records got ignored due the window not being available any > >>>> more in > >>>> >>>>>>>>> windowed > >>>> >>>>>>>>> aggregation operators from the `skipped-records` metrics > >>>> recording > >>>> >>>> to > >>>> >>>>>>> the > >>>> >>>>>>>>> `late-event-drop` metrics recording. > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> Guozhang > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck < > >>>> bbej...@gmail.com> > >>>> >>>>>> wrote: > >>>> >>>>>>>>> > >>>> >>>>>>>>>> Thanks for the KIP! > >>>> >>>>>>>>>> > >>>> >>>>>>>>>> +1 > >>>> >>>>>>>>>> > >>>> >>>>>>>>>> -Bill > >>>> >>>>>>>>>> > >>>> >>>>>>>>>> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu < > yuzhih...@gmail.com> > >>>> >>>> wrote: > >>>> >>>>>>>>>> > >>>> >>>>>>>>>>> +1 > >>>> >>>>>>>>>>> > >>>> >>>>>>>>>>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler < > >>>> j...@confluent.io > >>>> >>>>> > >>>> >>>>>>>>> wrote: > >>>> >>>>>>>>>>> > >>>> >>>>>>>>>>>> Hello devs, > >>>> >>>>>>>>>>>> > >>>> >>>>>>>>>>>> The discussion of KIP-328 has gone some time with no new > >>>> >>>>>> comments, > >>>> >>>>>>>>> so I > >>>> >>>>>>>>>>> am > >>>> >>>>>>>>>>>> calling for a vote! > >>>> >>>>>>>>>>>> > >>>> >>>>>>>>>>>> Here's the KIP: > >>>> https://cwiki.apache.org/confluence/x/sQU0BQ > >>>> >>>>>>>>>>>> > >>>> >>>>>>>>>>>> The basic idea is to provide: > >>>> >>>>>>>>>>>> * more usable control over update rate (vs the current > >>>> state > >>>> >>>>>> store > >>>> >>>>>>>>>>> caches) > >>>> >>>>>>>>>>>> * the final-result-for-windowed-computations feature > which > >>>> >>>>>>> several > >>>> >>>>>>>>>> people > >>>> >>>>>>>>>>>> have requested > >>>> >>>>>>>>>>>> > >>>> >>>>>>>>>>>> Thanks, > >>>> >>>>>>>>>>>> -John > >>>> >>>>>>>>>>>> > >>>> >>>>>>>>>>> > >>>> >>>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> > >>>> >>>>>>>>> -- > >>>> >>>>>>>>> -- Guozhang > >>>> >>>>>>>>> > >>>> >>>>>>>> > >>>> >>>>>>> > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> > >>>> >>>>>> -- > >>>> >>>>>> -- Guozhang > >>>> >>>>>> > >>>> >>>>> > >>>> >>>> > >>>> >>>> > >>>> > > >>>> > >>>> >