There are two cases where you need to specify the window size -- directly
using a
Consumer (eg the console consumer) or reading as an input topic within
Streams.
We need a config for the first case, since you can't pass a Deserializer
object to the
console consumer. In the Streams case, the reverse is true, and you have to
pass in
an actual Serde object.

Imo we should keep these two cases separate and not use the config for the
Streams
case at all. But that's hard to enforce (we'd have to strip the config out
of the user's
StreamsConfig if they tried to use it within Streams, for example) and it
also puts us
in an awkward position due to the  default.windowed.inner.serde.class
configs. If
they can specify the inner serde class through their Streams app config,
they
should be able to specify the window size through config as well. Otherwise
we
either force a mix-and-match as Matthias described, or you just always have
to
specify both the inner class and the window size in the constructor, at
which
point, why even have the default.windowed.inner.serde.class config at all?

...
that's not a rhetorical question, actually. Personally I do think we should
deprecate the default.windowed.serde.inner.class and replace it with
separate
windowed.serializer.inner.class/windowed.deserializer.inner.class configs.
This
way it's immediately obvious that the configs are only for the
Consumer/Producer,
and you should construct your own TimeWindowedSerde with all the necessary
parameters for use in your Streams app.

That might be too radical, and maybe the problem isn't worth the burden of
forcing users to change their code and replace the config with actual Serde
objects. But it should be an easy change to make, and if it isn't, that's
probably a good sign that you're using the serde incorrectly somewhere.

If we don't deprecate the default.windowed.serde.inner.class, then it's
less clear
to me how to proceed. The only really consistent thing to do seems to be to
name and position the new window size config as a default config and allow
it to be used similar to the default inner class configs. Which, as
established
throughout this discussion, seems very very wrong

So yes, I think we should just stick with the original name window.size.ms.
Or
better yet, call it windowed.deserializer.window.size.ms and name the
default.windowed.serde.inner.class replacements
windowed.deserializer.inner.class and windowed.serializer.inner.class

On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org> wrote:

> From my understanding, the KIP aims for the case when a user does not
> control the code, eg, when using the command line consumer (or similar
> tools).
>
> If the user writes code, we should always encourage her to instantiate
> the deserializer explicitly and not relying on reflection+config.
>
> I also don't think that the `default` prefix does make sense, as it
> indicates that there might be a non-default. However, IMHO, we should
> not allow "overwrite semantics" but rather throw an exception if the
> config is set and a window size is provided via the constructor. We
> should not allow to mix-and-match both and should stick to a strict
> either-or pattern.
>
>
> -Matthias
>
> On 9/8/20 11:52 AM, Guozhang Wang wrote:
> > Hi Sophie,
> >
> > Seems I do have some mis-understanding of the KIP's motivation here :)
> Just
> > for clarification my reasoning is that:
> >
> > 1) today Streams itself never uses a windowed deserializer itself since
> its
> > built-in operators only need the serializer and users do not need to
> > override it, plus standby / restore active tasks would just copy the
> bytes
> > directly. So this KIP's motivation is not for Stream's own code anyways.
> >
> > 2) It is only when user specified serde is missing the window size, which
> > is either when a) one is trying to read a source topic as windowed
> records
> > in Streams, this is a big blocker for KIP-300, and when b) one is trying
> to
> > read a topic as windowed records in Consumer, we would have issues if
> users
> > fail to use the appropriate serde constructs.
> >
> > I thought the main motivation of this KIP is for 2.a), in which we would
> > just encourage the users to use the right constructor with the window
> size
> > by deprecating the other constructs. But I'm not sure how this would help
> > with 2.b) since the proposal is on adding to StreamsConfig. If it is the
> > case, then I agree that probably we can just not add an extra config but
> > just deprecating the constructs.
> >
> >
> > Guozhang
> >
> >
> >
> >
> >
> > On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman <sop...@confluent.io
> >
> > wrote:
> >
> >> Hey Guozhang & Leah,
> >>
> >> I want to push back a bit on the assumption that we would fall back on
> this
> >> config
> >> in the case of an unspecified window size in a Streams serde. I don't
> think
> >> it should
> >> be a default at all, either in name or in effect. To borrow the
> rhetorical
> >> question that
> >> John raised earlier: what is the default window size of an application?
> >>
> >> Personally, I agree that that doesn't make much sense. Sure, if you only
> >> have a single
> >> windowed operation in your app then you could just specify the window
> size
> >> by config,
> >> but how is that any more ergonomic than specifying the window size in
> the
> >> Serde's
> >> constructor? If anything, it seems worse to put physical and mental
> >> distance between
> >> the specification and the actual usage of such parameters. What if you
> add
> >> another
> >> windowed operation later, with a different size, and forget to specify
> the
> >> new size in
> >> the new Serde? Or what if you never specify a default window size
> config at
> >> all and
> >> accidentally end up using the default config value of Long.MAX_VALUE?
> >> Avoiding this
> >> possibility was/is one of the main motivations of this KIP, and the
> whole
> >> point of
> >> deprecating the TimeWindowedSerde(innerClass) constructor.
> >>
> >> I actually would have advocated to remove this config entirely, but as
> John
> >> pointed
> >> out, we still need it to configure things like the console consumer
> >>
> >> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <ltho...@confluent.io>
> wrote:
> >>
> >>> Hi Guozhang,
> >>>
> >>> Yes, the config would read them as a single window size. I think this
> >>> relates to John's comments about having variably sized windows, which
> >> this
> >>> config doesn't handle. I like the name change and updated the wiki to
> >>> reflect that, and to clarify that the default value will still be
> >>> Long.MAX_VALUE.
> >>>
> >>> Thanks for your feedback!
> >>> Leah
> >>>
> >>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >>>
> >>>> Hello Leah,
> >>>>
> >>>> Thanks for initiating this. I just have one minor clarification
> >> question
> >>>> here: the config "window.size.ms" seems to be used as the default
> >> window
> >>>> size when reading from a topic that represents windowed records right?
> >>> I.e.
> >>>> if there are multiple topics that represent windowed records but their
> >>>> window sizes are different, with this config we can only read them
> >> with a
> >>>> single window size? If yes, could we rename the config as "
> >>>> default.window.size.ms" and make that clear in the description as
> >> well?
> >>>> Also we'd better also include its default value which I think would
> >> still
> >>>> be MAX_VALUE for compatibility.
> >>>>
> >>>>
> >>>> Guozhang
> >>>>
> >>>>
> >>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <ltho...@confluent.io>
> >> wrote:
> >>>>
> >>>>> Hey all,
> >>>>>
> >>>>> We should be good to wrap up voting now that the discussion has been
> >>>>> resolved.
> >>>>>
> >>>>> Cheers,
> >>>>> Leah
> >>>>>
> >>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <mj...@apache.org>
> >>> wrote:
> >>>>>
> >>>>>> +1 (binding)
> >>>>>>
> >>>>>> On 8/26/20 8:02 AM, John Roesler wrote:
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I've just sent a new message to the DISCUSS thread. We
> >>>>>>> forgot to include the Scala API in the proposal.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> -John
> >>>>>>>
> >>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman
> >>>>>>> wrote:
> >>>>>>>> Thanks for the KIP! +1 (non-binding)
> >>>>>>>>
> >>>>>>>> Sophie
> >>>>>>>>
> >>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler <
> >> vvcep...@apache.org
> >>>>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Leah,
> >>>>>>>>> I’m +1 (binding)
> >>>>>>>>>
> >>>>>>>>> -John
> >>>>>>>>>
> >>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote:
> >>>>>>>>>> Hi everyone,
> >>>>>>>>>>
> >>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve
> >>>>>>>>>> TimeWindowedDeserializer
> >>>>>>>>>> and TimeWindowedSerde to handle window size.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Leah
> >>>>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> -- Guozhang
> >>>>
> >>>
> >>
> >
> >
>
>

Reply via email to