Thanks for reviving this KIP, Leah.

I agree that we should not extend the scope of this KIP to potentially
deprecate/rename the `default.windowed.[key|value].serde.inner` configs.

@Sophie: if you feel strong about it, let's do a separate KIP.


+1 (binding)


-Matthias

On 1/19/21 3:00 PM, John Roesler wrote:
> Hi all,
> 
> I've just caught up on the thread, and FWIW, I'm still +1.
> 
> Thanks,
> -John
> 
> On Mon, 2021-01-18 at 21:53 -0800, Guozhang Wang wrote:
>> Read the above updates and the KIP's scope. Makes sense to me. +1 still
>> counts :)
>>
>> On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman <sop...@confluent.io>
>> wrote:
>>
>>> That sounds good to me. Thanks for reviving this
>>>
>>> Sophie
>>>
>>> On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <ltho...@confluent.io> wrote:
>>>
>>>> Hey all,
>>>>
>>>> Bringing this back up for discussion.
>>>>
>>>> It seems like the next steps are to:
>>>> 1. rename the config "window.size.ms"
>>>> 2. ensure that users set window size EITHER through the config OR through
>>>> the constructor. On this note, it may make sense to remove the default
>>> for
>>>> the `window.size.ms` config, so that there won't be a fall back if the
>>>> window size isn't set in either spot. WDYT? This could also address the
>>>> issue of multiple window sizes within a streams app.
>>>>
>>>> I see what Sophie is saying about the `default.windowed.key.serde.inner`
>>>> config, but I do think deprecating and moving those configs would
>>> require a
>>>> larger discussion. I'm open to looping them into this KIP if we feel like
>>>> it's vital (or incredibly convenient with low cost to users), but my
>>>> initial reaction is to leave that out for now and work within the current
>>>> set-up for window size.
>>>>
>>>> Thanks for all the comments so far,
>>>> Leah
>>>>
>>>> On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <
>>> sop...@confluent.io>
>>>> wrote:
>>>>
>>>>> There are two cases where you need to specify the window size --
>>> directly
>>>>> using a
>>>>> Consumer (eg the console consumer) or reading as an input topic within
>>>>> Streams.
>>>>> We need a config for the first case, since you can't pass a
>>> Deserializer
>>>>> object to the
>>>>> console consumer. In the Streams case, the reverse is true, and you
>>> have
>>>> to
>>>>> pass in
>>>>> an actual Serde object.
>>>>>
>>>>> Imo we should keep these two cases separate and not use the config for
>>>> the
>>>>> Streams
>>>>> case at all. But that's hard to enforce (we'd have to strip the config
>>>> out
>>>>> of the user's
>>>>> StreamsConfig if they tried to use it within Streams, for example) and
>>> it
>>>>> also puts us
>>>>> in an awkward position due to the  default.windowed.inner.serde.class
>>>>> configs. If
>>>>> they can specify the inner serde class through their Streams app
>>> config,
>>>>> they
>>>>> should be able to specify the window size through config as well.
>>>> Otherwise
>>>>> we
>>>>> either force a mix-and-match as Matthias described, or you just always
>>>> have
>>>>> to
>>>>> specify both the inner class and the window size in the constructor, at
>>>>> which
>>>>> point, why even have the default.windowed.inner.serde.class config at
>>>> all?
>>>>>
>>>>> ...
>>>>> that's not a rhetorical question, actually. Personally I do think we
>>>> should
>>>>> deprecate the default.windowed.serde.inner.class and replace it with
>>>>> separate
>>>>> windowed.serializer.inner.class/windowed.deserializer.inner.class
>>>> configs.
>>>>> This
>>>>> way it's immediately obvious that the configs are only for the
>>>>> Consumer/Producer,
>>>>> and you should construct your own TimeWindowedSerde with all the
>>>> necessary
>>>>> parameters for use in your Streams app.
>>>>>
>>>>> That might be too radical, and maybe the problem isn't worth the burden
>>>> of
>>>>> forcing users to change their code and replace the config with actual
>>>> Serde
>>>>> objects. But it should be an easy change to make, and if it isn't,
>>> that's
>>>>> probably a good sign that you're using the serde incorrectly somewhere.
>>>>>
>>>>> If we don't deprecate the default.windowed.serde.inner.class, then it's
>>>>> less clear
>>>>> to me how to proceed. The only really consistent thing to do seems to
>>> be
>>>> to
>>>>> name and position the new window size config as a default config and
>>>> allow
>>>>> it to be used similar to the default inner class configs. Which, as
>>>>> established
>>>>> throughout this discussion, seems very very wrong
>>>>>
>>>>> So yes, I think we should just stick with the original name
>>>> window.size.ms
>>>>> .
>>>>> Or
>>>>> better yet, call it windowed.deserializer.window.size.ms and name the
>>>>> default.windowed.serde.inner.class replacements
>>>>> windowed.deserializer.inner.class and windowed.serializer.inner.class
>>>>>
>>>>> On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org>
>>> wrote:
>>>>>
>>>>>> From my understanding, the KIP aims for the case when a user does not
>>>>>> control the code, eg, when using the command line consumer (or
>>> similar
>>>>>> tools).
>>>>>>
>>>>>> If the user writes code, we should always encourage her to
>>> instantiate
>>>>>> the deserializer explicitly and not relying on reflection+config.
>>>>>>
>>>>>> I also don't think that the `default` prefix does make sense, as it
>>>>>> indicates that there might be a non-default. However, IMHO, we should
>>>>>> not allow "overwrite semantics" but rather throw an exception if the
>>>>>> config is set and a window size is provided via the constructor. We
>>>>>> should not allow to mix-and-match both and should stick to a strict
>>>>>> either-or pattern.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>> On 9/8/20 11:52 AM, Guozhang Wang wrote:
>>>>>>> Hi Sophie,
>>>>>>>
>>>>>>> Seems I do have some mis-understanding of the KIP's motivation here
>>>> :)
>>>>>> Just
>>>>>>> for clarification my reasoning is that:
>>>>>>>
>>>>>>> 1) today Streams itself never uses a windowed deserializer itself
>>>> since
>>>>>> its
>>>>>>> built-in operators only need the serializer and users do not need
>>> to
>>>>>>> override it, plus standby / restore active tasks would just copy
>>> the
>>>>>> bytes
>>>>>>> directly. So this KIP's motivation is not for Stream's own code
>>>>> anyways.
>>>>>>>
>>>>>>> 2) It is only when user specified serde is missing the window size,
>>>>> which
>>>>>>> is either when a) one is trying to read a source topic as windowed
>>>>>> records
>>>>>>> in Streams, this is a big blocker for KIP-300, and when b) one is
>>>>> trying
>>>>>> to
>>>>>>> read a topic as windowed records in Consumer, we would have issues
>>> if
>>>>>> users
>>>>>>> fail to use the appropriate serde constructs.
>>>>>>>
>>>>>>> I thought the main motivation of this KIP is for 2.a), in which we
>>>>> would
>>>>>>> just encourage the users to use the right constructor with the
>>> window
>>>>>> size
>>>>>>> by deprecating the other constructs. But I'm not sure how this
>>> would
>>>>> help
>>>>>>> with 2.b) since the proposal is on adding to StreamsConfig. If it
>>> is
>>>>> the
>>>>>>> case, then I agree that probably we can just not add an extra
>>> config
>>>>> but
>>>>>>> just deprecating the constructs.
>>>>>>>
>>>>>>>
>>>>>>> Guozhang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <
>>>>> sop...@confluent.io
>>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Guozhang & Leah,
>>>>>>>>
>>>>>>>> I want to push back a bit on the assumption that we would fall
>>> back
>>>> on
>>>>>> this
>>>>>>>> config
>>>>>>>> in the case of an unspecified window size in a Streams serde. I
>>>> don't
>>>>>> think
>>>>>>>> it should
>>>>>>>> be a default at all, either in name or in effect. To borrow the
>>>>>> rhetorical
>>>>>>>> question that
>>>>>>>> John raised earlier: what is the default window size of an
>>>>> application?
>>>>>>>>
>>>>>>>> Personally, I agree that that doesn't make much sense. Sure, if
>>> you
>>>>> only
>>>>>>>> have a single
>>>>>>>> windowed operation in your app then you could just specify the
>>>> window
>>>>>> size
>>>>>>>> by config,
>>>>>>>> but how is that any more ergonomic than specifying the window size
>>>> in
>>>>>> the
>>>>>>>> Serde's
>>>>>>>> constructor? If anything, it seems worse to put physical and
>>> mental
>>>>>>>> distance between
>>>>>>>> the specification and the actual usage of such parameters. What if
>>>> you
>>>>>> add
>>>>>>>> another
>>>>>>>> windowed operation later, with a different size, and forget to
>>>> specify
>>>>>> the
>>>>>>>> new size in
>>>>>>>> the new Serde? Or what if you never specify a default window size
>>>>>> config at
>>>>>>>> all and
>>>>>>>> accidentally end up using the default config value of
>>>> Long.MAX_VALUE?
>>>>>>>> Avoiding this
>>>>>>>> possibility was/is one of the main motivations of this KIP, and
>>> the
>>>>>> whole
>>>>>>>> point of
>>>>>>>> deprecating the TimeWindowedSerde(innerClass) constructor.
>>>>>>>>
>>>>>>>> I actually would have advocated to remove this config entirely,
>>> but
>>>> as
>>>>>> John
>>>>>>>> pointed
>>>>>>>> out, we still need it to configure things like the console
>>> consumer
>>>>>>>>
>>>>>>>> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <ltho...@confluent.io
>>>>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Guozhang,
>>>>>>>>>
>>>>>>>>> Yes, the config would read them as a single window size. I think
>>>> this
>>>>>>>>> relates to John's comments about having variably sized windows,
>>>> which
>>>>>>>> this
>>>>>>>>> config doesn't handle. I like the name change and updated the
>>> wiki
>>>> to
>>>>>>>>> reflect that, and to clarify that the default value will still be
>>>>>>>>> Long.MAX_VALUE.
>>>>>>>>>
>>>>>>>>> Thanks for your feedback!
>>>>>>>>> Leah
>>>>>>>>>
>>>>>>>>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <
>>> wangg...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Leah,
>>>>>>>>>>
>>>>>>>>>> Thanks for initiating this. I just have one minor clarification
>>>>>>>> question
>>>>>>>>>> here: the config "window.size.ms" seems to be used as the
>>> default
>>>>>>>> window
>>>>>>>>>> size when reading from a topic that represents windowed records
>>>>> right?
>>>>>>>>> I.e.
>>>>>>>>>> if there are multiple topics that represent windowed records but
>>>>> their
>>>>>>>>>> window sizes are different, with this config we can only read
>>> them
>>>>>>>> with a
>>>>>>>>>> single window size? If yes, could we rename the config as "
>>>>>>>>>> default.window.size.ms" and make that clear in the description
>>> as
>>>>>>>> well?
>>>>>>>>>> Also we'd better also include its default value which I think
>>>> would
>>>>>>>> still
>>>>>>>>>> be MAX_VALUE for compatibility.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Guozhang
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <
>>> ltho...@confluent.io>
>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey all,
>>>>>>>>>>>
>>>>>>>>>>> We should be good to wrap up voting now that the discussion has
>>>>> been
>>>>>>>>>>> resolved.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Leah
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <
>>> mj...@apache.org
>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 (binding)
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've just sent a new message to the DISCUSS thread. We
>>>>>>>>>>>>> forgot to include the Scala API in the proposal.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Thanks for the KIP! +1 (non-binding)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sophie
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
>>>>>>>> vvcep...@apache.org
>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks Leah,
>>>>>>>>>>>>>>> I’m +1 (binding)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
>>>>>>>>>>>>>>>> TimeWindowedDeserializer
>>>>>>>>>>>>>>>> and TimeWindowedSerde to handle window size.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Leah
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -- Guozhang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
> 
> 

Reply via email to