[ 
https://issues.apache.org/jira/browse/KAFKA-12710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330995#comment-17330995
 ] 

A. Sophie Blee-Goldman commented on KAFKA-12710:
------------------------------------------------

Thanks, I'd forgotten about that KIP. Being able to selectively disable 
optimizations would make enabling (at least some) optimizations by default much 
more palatable. If we can do them together, that would be ideal.

> Consider enabling (at least some) optimizations by default
> ----------------------------------------------------------
>
>                 Key: KAFKA-12710
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12710
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>
> Topology optimizations such as repartition topic consolidation and the source 
> topic changelog optimization are extremely useful for reducing the footprint 
> of a Kafka Streams application on the broker. The additional storage and 
> resource utilization due to changelog and repartition topics is a very real 
> pain point, and has even been cited as the reason for turning to other stream 
> processing frameworks in the past (though of course I question that judgement).
> The repartition topic optimization, at the very least, should be enabled by 
> default. The problem is that we can't just flip the switch without breaking 
> existing applications during upgrade, since the location and name of such 
> topics in the topology may change. One possibility is to just detect this 
> situation and disable the optimization if we find that it would produce an 
> incompatible topology for an existing application. We can determine that this 
> is the case simply by looking for pre-existing repartition topics. If any 
> such topics are present, and match the set of repartition topics in the 
> un-optimized topology, then we know we need to switch the optimization off. 
> If we don't find any repartition topics, or they match the optimized 
> topology, then we're safe to enable it by default.
> Alternatively, we could just do a KIP to indicate that we intend to change 
> the default in the next breaking release and that existing applications 
> should override this config if necessary. We should be able to implement a 
> fail-safe and shut down if a user misses or forgets to do so, using the 
> method mentioned above.
> The source topic optimization is perhaps more controversial, as there have 
> been a few issues raised with regard to things like [restoring bad data and 
> asymmetric serdes|https://issues.apache.org/jira/browse/KAFKA-8037], or more 
> recently the bug discovered in the [emit-on-change semantics for 
> KTables|https://issues.apache.org/jira/browse/KAFKA-12508?focusedCommentId=17306323&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17306323].
> However, for this case at least, there are no compatibility concerns. It's 
> safe to upgrade from using a separate changelog for a source KTable to just 
> using that source topic directly, although the reverse is not true. We could 
> even automatically delete the no-longer-necessary changelog for upgrading 
> applications.
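
The detection check proposed in the description for the repartition topic
optimization could be sketched roughly as follows. This is only an
illustration, not code from Kafka Streams: the class, enum, and method names
are hypothetical, "match" is assumed to mean exact set equality of topic names,
and the UNDETERMINED outcome is an assumption for the case the description
leaves open (existing topics that match neither topology). How the existing
repartition topics are discovered is left out.

```java
import java.util.Set;

// Hypothetical sketch of the compatibility check described in the issue.
// None of these names exist in Kafka Streams.
final class RepartitionCompatibilityCheck {

    enum Decision { ENABLE, DISABLE, UNDETERMINED }

    /**
     * @param existing    repartition topics already present for the application
     * @param unoptimized repartition topics of the un-optimized topology
     * @param optimized   repartition topics of the optimized topology
     */
    static Decision decide(Set<String> existing,
                           Set<String> unoptimized,
                           Set<String> optimized) {
        // No repartition topics yet, or they already match the optimized
        // topology: safe to enable the optimization by default.
        if (existing.isEmpty() || existing.equals(optimized)) {
            return Decision.ENABLE;
        }
        // Existing topics match the un-optimized topology: enabling the
        // optimization would break the upgrade, so switch it off.
        if (existing.equals(unoptimized)) {
            return Decision.DISABLE;
        }
        // Assumption: the description does not cover this case.
        return Decision.UNDETERMINED;
    }

    public static void main(String[] args) {
        // Topic names are made up for illustration.
        Set<String> unoptimized =
            Set.of("my-app-count-a-repartition", "my-app-count-b-repartition");
        Set<String> optimized = Set.of("my-app-merged-repartition");

        System.out.println(decide(Set.of(), unoptimized, optimized));    // ENABLE
        System.out.println(decide(unoptimized, unoptimized, optimized)); // DISABLE
        System.out.println(decide(optimized, unoptimized, optimized));   // ENABLE
    }
}
```

The same check could back the fail-safe mentioned for the KIP alternative: a
DISABLE result with the optimization switched on would be the signal to shut
down rather than start with an incompatible topology.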



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
