Hi @guozhangwang ,

Thanks for the thoughtful feedback.

1. I think your argument about the serdes is sound.

* For any operation that produces a KTable without changing keys or values, we 
can forward the serdes from upstream, through the operator, and on to suppress.
* The remaining operators may change keys or values, but in all of those cases 
it's possible to provide serdes at the key/value-changing operator and then 
forward them to suppress.

So in all cases, we don't need to ask for serdes in suppress, which I vastly 
prefer. Thanks!
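
To make the forwarding rule concrete, here is a rough sketch in plain Java. It does not use the real Kafka Streams classes: `Serde`, `Table`, `filter`, `mapValues`, and `suppress` below are hypothetical stand-ins that only track which serdes each node ends up with.

```java
public class SerdeForwardingSketch {
    /** Hypothetical stand-in for a serde; only its name matters here. */
    static final class Serde {
        final String name;
        Serde(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a KTable node that remembers its serdes. */
    static final class Table {
        final Serde keySerde;
        final Serde valueSerde;

        Table(Serde keySerde, Serde valueSerde) {
            this.keySerde = keySerde;
            this.valueSerde = valueSerde;
        }

        // Does not change key or value types: forward both serdes unchanged.
        Table filter() {
            return new Table(keySerde, valueSerde);
        }

        // Changes the value type: the new value serde is supplied here,
        // at the operator that changes the type, and the key serde is forwarded.
        Table mapValues(Serde newValueSerde) {
            if (newValueSerde == null) {
                throw new IllegalArgumentException("value serde required");
            }
            return new Table(keySerde, newValueSerde);
        }

        // Suppression asks for no serdes; it inherits what upstream resolved.
        Table suppress() {
            return new Table(keySerde, valueSerde);
        }
    }

    public static void main(String[] args) {
        Table source = new Table(new Serde("String"), new Serde("Long"));
        Table suppressed = source.filter().mapValues(new Serde("Double")).suppress();
        // prints "String / Double"
        System.out.println(suppressed.keySerde.name + " / " + suppressed.valueSerde.name);
    }
}
```

The point of the sketch is only that `suppress()` never needs a serde parameter: every path into it has already resolved both serdes.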

2. For changelogs:

I think it would be much better if we offered tight semantics in all cases. 
Forcing people to reason about how the commit interval interplays with the 
suppression is needlessly complicated.

But I do think that we can still optimize it to avoid the extra changelog. The 
good news is that, now that we know we can get serdes without asking for them 
in the suppression config, we don't have to settle the changelog or commit 
behavior at this stage.

So this PR is not blocked on that conversation.

I'll take it as a design goal to avoid an extra changelog and spend some time 
to see what I can come up with. At the least, you've offered a way to do it by 
relaxing the suppression semantics.

3. Yes, I think that's a good long-term vision. And it would be all the more 
important to avoid an extra changelog if we wind up tacking a suppression onto 
every KTable.

3c. Thanks for that reference. It would be nice to have a simple control 
bounding the memory usage. However, I'm not sure I agree that that config 
should be allowed to alter the program we've been asked to execute. 

If we were to add a `streams.memory.bytes` config, we would also have to consider what 
to do if it's overconstrained. Clearly, we cannot execute a Streams program 
within 7 bytes, so we would have some validation on startup that says "hey, you 
asked for no more than 7 bytes, but we need at least 800MB for this program". 
Rather than relaxing the suppression semantics, I'd advocate for explicit 
user-specified buffer sizes to be included in this arithmetic.
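
As a rough illustration of the startup check I have in mind (purely a sketch: `streams.memory.bytes` does not exist today, and `requiredBytes`, `validate`, and the baseline figure are all made-up names and numbers), the explicit buffer sizes would simply be added into the required total:

```java
import java.util.Arrays;
import java.util.List;

public class MemoryBudgetSketch {
    // Sum the program's baseline need and every explicitly configured
    // suppression buffer size. Math.addExact guards against overflow.
    static long requiredBytes(long baselineBytes, List<Long> suppressionBufferBytes) {
        long total = baselineBytes;
        for (long bufferBytes : suppressionBufferBytes) {
            total = Math.addExact(total, bufferBytes);
        }
        return total;
    }

    // Fail fast at startup if the hypothetical global limit is overconstrained,
    // rather than silently relaxing the suppression semantics.
    static void validate(long limitBytes, long baselineBytes, List<Long> suppressionBufferBytes) {
        long required = requiredBytes(baselineBytes, suppressionBufferBytes);
        if (required > limitBytes) {
            throw new IllegalArgumentException(
                "asked for no more than " + limitBytes
                    + " bytes, but this program needs at least " + required + " bytes");
        }
    }

    public static void main(String[] args) {
        long baseline = 700L * 1024 * 1024;                       // assumed baseline
        List<Long> buffers = Arrays.asList(50L * 1024 * 1024,     // two user-specified
                                           50L * 1024 * 1024);    // suppression buffers
        validate(1024L * 1024 * 1024, baseline, buffers);         // fits: no exception
        try {
            validate(7L, baseline, buffers);                      // overconstrained
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape, the global limit never changes what the program does; it only decides whether the program is allowed to start.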

But that is again a problem for the future.

So in conclusion: I'll drop the serdes from the API, and forward from the 
source KTable instead. We should then be able to resume the review of this PR, 
right?

[ Full content available at: https://github.com/apache/kafka/pull/5567 ]
This message was relayed via gitbox.apache.org for [email protected]
