Thanks for the update. I did a first pass over the updated KIP and think
it makes sense.
On 7/11/18 5:47 PM, John Roesler wrote:
> Hi all,
> I have updated KIP-328 with all the feedback I've gotten so far. Please
> take another look and let me know what you think!
I have updated KIP-328 with all the feedback I've gotten so far. Please
take another look and let me know what you think!
On Wed, Jul 11, 2018 at 12:28 AM Guozhang Wang wrote:
> That is a good point..
> I cannot think of a better option than documentation and warning,
That is a good point..
I cannot think of a better option than documentation and warning, and also
given that we'd probably better not reusing the function name `until` for
On Tue, Jul 10, 2018 at 3:31 PM, John Roesler wrote:
> I had some opportunity to reflect on the
I had some opportunity to reflect on the default for close time today...
Note that the current "close time" is equal to the retention time, and
therefore "close" today shares the default retention of 24h.
It would definitely break any application that today specifies a retention
time to set
That sounds good to me. I'll include that in the KIP.
On Mon, Jul 9, 2018 at 6:33 PM Guozhang Wang wrote:
> Let me clarify a bit on what I meant about moving `retentionPeriod` to
> In another discussion we had around KIP-319 / 330, that the
Let me clarify a bit on what I meant about moving `retentionPeriod` to
In another discussion we had around KIP-319 / 330, that the "retention
period" should not really be a window spec, but only a window store spec,
as it only affects how long to retain each window to be
Thanks for the reply, Guozhang,
Good! I agree, that is also a good reason, and I actually made use of that
in my tests. I'll update the KIP.
By the way, I chose "allowedLateness" as I was trying to pick a better name
than "close", but I think it's actually the wrong name. We don't want to
Thanks for your replies. As for the two options of the API, I think I'm
slightly inclined to the first option as well. My motivation is a bit
different, as I think of the first one maybe more flexible, for example:
KTable> table = ... count();
table.toStream().peek(..); // want to peek
Hey Matthias and Guozhang,
Sorry for the slow reply. I was mulling about your feedback and weighing
some ideas in a sketchbook PR: https://github.com/apache/kafka/pull/5337.
Your thought about keeping suppression independent of business logic is a
very good one. I agree that it would make more
I think I agree with Matthias for having dedicated APIs for windowed
operation final output scenario, PLUS separating the window close which the
"final output" would rely on, from the window retention time itself
(admittedly it would make this KIP effort larger, but if we believe we need
Thanks for the discussion. I am just catching up.
In general, I think we have different uses cases and non-windowed and
windowed is quite different. For the non-windowed case, suppress() has
no (useful) close or retention time, no final semantics, and also no
business logic impact.
On the other
I see. It seems like if we want to decouple 1) and 2), we need to alter the
definition of the window. Do you think it would close the gap if we added a
"window close" time to the window definition?
Obviously I'm too lazy on email replying diligence compared with you :)
Will try to reply them separately:
To reply your email on "Mon, Jul 2, 2018 at 8:23 AM":
I'm aware of this use case, but again, the
In fact, to push the idea further (which IIRC is what Matthias originally
proposed), if we can accept "Suppression#finalResultsOnly" in my last
email, then we could also consider whether to eliminate
We could always add it later, but you've both expressed doubt that
Hi again, Guozhang ;) Here's the second part of my response...
It seems like your main concern is: "if I'm a user who wants final update
semantics, how complicated is it for me to get it?"
I think we have to assume that people don't always have time to become
deeply familiar with all the nuances
Thanks for the clarification.
To answer your questions:
1. Yes, specifically Y < X makes sense and is by design.
The scenario is to support IQ queries over windows that are closed but not
evicted. For example, suppose we have a metrics application backed by
Streams. Let's say we do
Regarding the metrics: yeah I think I'm with you that the dropped records
due to window retention or emit suppression policies should be recorded
differently, and using this KIP's proposed metric would be fine. If you
also think we can use this KIP's proposed metrics to cover the window
Thanks for the explanation, that does make sense. I have some questions on
operations, but I'll just wait for the PR and tests.
On Wed, Jun 27, 2018 at 8:14 PM John Roesler wrote:
> Hi Bill,
> Thanks for the review!
> Your question is very much applicable to the KIP and not
Thanks for the review!
Your question is very much applicable to the KIP and not at all an
implementation detail. Thanks for bringing it up.
I'm proposing not to change the existing caches and configurations at all
Imagine you have a topology like this:
Hi John, thanks for the KIP.
Early on in the KIP, you mention the current approaches for controlling the
rate of downstream records from a KTable, cache size configuration and
Will these configuration parameters still be in effect for tables that
don't use suppression? For tables
Thanks for the feedback, Matthias,
It seems like in straightforward relational processing cases, it would not
make sense to bound the lateness of KTables. In general, it seems better to
have "guard rails" in place that make it easier to write sensible programs
than insensible ones.
But I'm still
Hello again all,
I realized today that I neglected to include metrics in the proposal. I
have added them just now.
On Tue, Jun 26, 2018 at 3:11 PM John Roesler wrote:
> Hello devs and users,
> Please take some time to consider this proposal for Kafka Streams:
Thanks for the KIP John.
One initial comments about the last example "Bounded lateness": For a
non-windowed KTable bounding the lateness does not really make sense,
Thus, I am wondering if we should allow `suppressLateEvents()` for this
case? It seems to be better to only allow it for
I noticed this (lack of primary parameter) as well.
What you gave as new example is semantically the same as what I suggested.
So it is good by me.
On Wed, Jun 27, 2018 at 7:31 AM, John Roesler wrote:
> Thanks for taking look, Ted,
> I agree this is a departure from the conventions
Thanks for taking look, Ted,
I agree this is a departure from the conventions of Streams DSL.
Most of our config objects have one or two "required" parameters, which fit
naturally with the static factory method approach. TimeWindow, for example,
requires a size parameter, so we can naturally say
I started to read this KIP which contains a lot of materials.
Do you think it would be more consistent with the rest of Streams data
structures by supporting `of` ?
On Tue, Jun 26, 2018
Hello devs and users,
Please take some time to consider this proposal for Kafka Streams:
KIP-328: Ability to suppress updates for KTables
The basic idea is to provide:
* more usable control over update rate (vs the current state store caches)
Mail list logo