Thanks for reviving this KIP, Leah. I agree that we should not extend the scope of this KIP to potentially deprecate/rename the `default.windowed.[key|value].serde.inner` configs.
@Sophie: if you feel strong about it, let's do a separate KIP. +1 (binding) -Matthias On 1/19/21 3:00 PM, John Roesler wrote: > Hi all, > > I've just caught up on the thread, and FWIW, I'm still +1. > > Thanks, > -John > > On Mon, 2021-01-18 at 21:53 -0800, Guozhang Wang wrote: >> Read the above updates and the KIP's scope. Makes sense to me. +1 still >> counts :) >> >> On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman <sop...@confluent.io> >> wrote: >> >>> That sounds good to me. Thanks for reviving this >>> >>> Sophie >>> >>> On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <ltho...@confluent.io> wrote: >>> >>>> Hey all, >>>> >>>> Bringing this back up for discussion. >>>> >>>> It seems like the next steps are to: >>>> 1. rename the config "window.size.ms" >>>> 2. ensure that users set window size EITHER through the config OR through >>>> the constructor. On this note, it may make sense to remove the default >>> for >>>> the `window.size.ms` config, so that there won't be a fall back if the >>>> window size isn't set in either spot. WDYT? This could also address the >>>> issue of multiple window sizes within a streams app. >>>> >>>> I see what Sophie is saying about the `default.windowed.key.serde.inner` >>>> config, but I do think deprecating and moving those configs would >>> require a >>>> larger discussion. I'm open to looping them into this KIP if we feel like >>>> it's vital (or incredibly convenient with low cost to users), but my >>>> initial reaction is to leave that out for now and work within the current >>>> set-up for window size. >>>> >>>> Thanks for all the comments so far, >>>> Leah >>>> >>>> On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman < >>> sop...@confluent.io> >>>> wrote: >>>> >>>>> There are two cases where you need to specify the window size -- >>> directly >>>>> using a >>>>> Consumer (eg the console consumer) or reading as an input topic within >>>>> Streams. >>>>> We need a config for the first case, since you can't pass a >>> Deserializer >>>>> object to the >>>>> console consumer. In the Streams case, the reverse is true, and you >>> have >>>> to >>>>> pass in >>>>> an actual Serde object. >>>>> >>>>> Imo we should keep these two cases separate and not use the config for >>>> the >>>>> Streams >>>>> case at all. But that's hard to enforce (we'd have to strip the config >>>> out >>>>> of the user's >>>>> StreamsConfig if they tried to use it within Streams, for example) and >>> it >>>>> also puts us >>>>> in an awkward position due to the default.windowed.inner.serde.class >>>>> configs. If >>>>> they can specify the inner serde class through their Streams app >>> config, >>>>> they >>>>> should be able to specify the window size through config as well. >>>> Otherwise >>>>> we >>>>> either force a mix-and-match as Matthias described, or you just always >>>> have >>>>> to >>>>> specify both the inner class and the window size in the constructor, at >>>>> which >>>>> point, why even have the default.windowed.inner.serde.class config at >>>> all? >>>>> >>>>> ... >>>>> that's not a rhetorical question, actually. Personally I do think we >>>> should >>>>> deprecate the default.windowed.serde.inner.class and replace it with >>>>> separate >>>>> windowed.serializer.inner.class/windowed.deserializer.inner.class >>>> configs. >>>>> This >>>>> way it's immediately obvious that the configs are only for the >>>>> Consumer/Producer, >>>>> and you should construct your own TimeWindowedSerde with all the >>>> necessary >>>>> parameters for use in your Streams app. >>>>> >>>>> That might be too radical, and maybe the problem isn't worth the burden >>>> of >>>>> forcing users to change their code and replace the config with actual >>>> Serde >>>>> objects. But it should be an easy change to make, and if it isn't, >>> that's >>>>> probably a good sign that you're using the serde incorrectly somewhere. >>>>> >>>>> If we don't deprecate the default.windowed.serde.inner.class, then it's >>>>> less clear >>>>> to me how to proceed. The only really consistent thing to do seems to >>> be >>>> to >>>>> name and position the new window size config as a default config and >>>> allow >>>>> it to be used similar to the default inner class configs. Which, as >>>>> established >>>>> throughout this discussion, seems very very wrong >>>>> >>>>> So yes, I think we should just stick with the original name >>>> window.size.ms >>>>> . >>>>> Or >>>>> better yet, call it windowed.deserializer.window.size.ms and name the >>>>> default.windowed.serde.inner.class replacements >>>>> windowed.deserializer.inner.class and windowed.serializer.inner.class >>>>> >>>>> On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org> >>> wrote: >>>>> >>>>>> From my understanding, the KIP aims for the case when a user does not >>>>>> control the code, eg, when using the command line consumer (or >>> similar >>>>>> tools). >>>>>> >>>>>> If the user writes code, we should always encourage her to >>> instantiate >>>>>> the deserializer explicitly and not relying on reflection+config. >>>>>> >>>>>> I also don't think that the `default` prefix does make sense, as it >>>>>> indicates that there might be a non-default. However, IMHO, we should >>>>>> not allow "overwrite semantics" but rather throw an exception if the >>>>>> config is set and a window size is provided via the constructor. We >>>>>> should not allow to mix-and-match both and should stick to a strict >>>>>> either-or pattern. >>>>>> >>>>>> >>>>>> -Matthias >>>>>> >>>>>> On 9/8/20 11:52 AM, Guozhang Wang wrote: >>>>>>> Hi Sophie, >>>>>>> >>>>>>> Seems I do have some mis-understanding of the KIP's motivation here >>>> :) >>>>>> Just >>>>>>> for clarification my reasoning is that: >>>>>>> >>>>>>> 1) today Streams itself never uses a windowed deserializer itself >>>> since >>>>>> its >>>>>>> built-in operators only need the serializer and users do not need >>> to >>>>>>> override it, plus standby / restore active tasks would just copy >>> the >>>>>> bytes >>>>>>> directly. So this KIP's motivation is not for Stream's own code >>>>> anyways. >>>>>>> >>>>>>> 2) It is only when user specified serde is missing the window size, >>>>> which >>>>>>> is either when a) one is trying to read a source topic as windowed >>>>>> records >>>>>>> in Streams, this is a big blocker for KIP-300, and when b) one is >>>>> trying >>>>>> to >>>>>>> read a topic as windowed records in Consumer, we would have issues >>> if >>>>>> users >>>>>>> fail to use the appropriate serde constructs. >>>>>>> >>>>>>> I thought the main motivation of this KIP is for 2.a), in which we >>>>> would >>>>>>> just encourage the users to use the right constructor with the >>> window >>>>>> size >>>>>>> by deprecating the other constructs. But I'm not sure how this >>> would >>>>> help >>>>>>> with 2.b) since the proposal is on adding to StreamsConfig. If it >>> is >>>>> the >>>>>>> case, then I agree that probably we can just not add an extra >>> config >>>>> but >>>>>>> just deprecating the constructs. >>>>>>> >>>>>>> >>>>>>> Guozhang >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman < >>>>> sop...@confluent.io >>>>>>> >>>>>>> wrote: >>>>>>> >>>>>>>> Hey Guozhang & Leah, >>>>>>>> >>>>>>>> I want to push back a bit on the assumption that we would fall >>> back >>>> on >>>>>> this >>>>>>>> config >>>>>>>> in the case of an unspecified window size in a Streams serde. I >>>> don't >>>>>> think >>>>>>>> it should >>>>>>>> be a default at all, either in name or in effect. To borrow the >>>>>> rhetorical >>>>>>>> question that >>>>>>>> John raised earlier: what is the default window size of an >>>>> application? >>>>>>>> >>>>>>>> Personally, I agree that that doesn't make much sense. Sure, if >>> you >>>>> only >>>>>>>> have a single >>>>>>>> windowed operation in your app then you could just specify the >>>> window >>>>>> size >>>>>>>> by config, >>>>>>>> but how is that any more ergonomic than specifying the window size >>>> in >>>>>> the >>>>>>>> Serde's >>>>>>>> constructor? If anything, it seems worse to put physical and >>> mental >>>>>>>> distance between >>>>>>>> the specification and the actual usage of such parameters. What if >>>> you >>>>>> add >>>>>>>> another >>>>>>>> windowed operation later, with a different size, and forget to >>>> specify >>>>>> the >>>>>>>> new size in >>>>>>>> the new Serde? Or what if you never specify a default window size >>>>>> config at >>>>>>>> all and >>>>>>>> accidentally end up using the default config value of >>>> Long.MAX_VALUE? >>>>>>>> Avoiding this >>>>>>>> possibility was/is one of the main motivations of this KIP, and >>> the >>>>>> whole >>>>>>>> point of >>>>>>>> deprecating the TimeWindowedSerde(innerClass) constructor. >>>>>>>> >>>>>>>> I actually would have advocated to remove this config entirely, >>> but >>>> as >>>>>> John >>>>>>>> pointed >>>>>>>> out, we still need it to configure things like the console >>> consumer >>>>>>>> >>>>>>>> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <ltho...@confluent.io >>>> >>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Guozhang, >>>>>>>>> >>>>>>>>> Yes, the config would read them as a single window size. I think >>>> this >>>>>>>>> relates to John's comments about having variably sized windows, >>>> which >>>>>>>> this >>>>>>>>> config doesn't handle. I like the name change and updated the >>> wiki >>>> to >>>>>>>>> reflect that, and to clarify that the default value will still be >>>>>>>>> Long.MAX_VALUE. >>>>>>>>> >>>>>>>>> Thanks for your feedback! >>>>>>>>> Leah >>>>>>>>> >>>>>>>>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang < >>> wangg...@gmail.com> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hello Leah, >>>>>>>>>> >>>>>>>>>> Thanks for initiating this. I just have one minor clarification >>>>>>>> question >>>>>>>>>> here: the config "window.size.ms" seems to be used as the >>> default >>>>>>>> window >>>>>>>>>> size when reading from a topic that represents windowed records >>>>> right? >>>>>>>>> I.e. >>>>>>>>>> if there are multiple topics that represent windowed records but >>>>> their >>>>>>>>>> window sizes are different, with this config we can only read >>> them >>>>>>>> with a >>>>>>>>>> single window size? If yes, could we rename the config as " >>>>>>>>>> default.window.size.ms" and make that clear in the description >>> as >>>>>>>> well? >>>>>>>>>> Also we'd better also include its default value which I think >>>> would >>>>>>>> still >>>>>>>>>> be MAX_VALUE for compatibility. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Guozhang >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas < >>> ltho...@confluent.io> >>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hey all, >>>>>>>>>>> >>>>>>>>>>> We should be good to wrap up voting now that the discussion has >>>>> been >>>>>>>>>>> resolved. >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> Leah >>>>>>>>>>> >>>>>>>>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax < >>> mj...@apache.org >>>>> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> +1 (binding) >>>>>>>>>>>> >>>>>>>>>>>> On 8/26/20 8:02 AM, John Roesler wrote: >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> I've just sent a new message to the DISCUSS thread. We >>>>>>>>>>>>> forgot to include the Scala API in the proposal. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> -John >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Thanks for the KIP! +1 (non-binding) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Sophie >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler < >>>>>>>> vvcep...@apache.org >>>>>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks Leah, >>>>>>>>>>>>>>> I’m +1 (binding) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote: >>>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve >>>>>>>>>>>>>>>> TimeWindowedDeserializer >>>>>>>>>>>>>>>> and TimeWindowedSerde to handle window size. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>> >>>>> >>>> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> Leah >>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> -- Guozhang >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >> > >