[ https://issues.apache.org/jira/browse/KAFKA-5037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524421#comment-16524421 ]
Guozhang Wang commented on KAFKA-5037: -------------------------------------- If we want to only fix-forward this issue beyond 2.0. I'd propose the following to refactor the {{StreamsPartitionAssignor#assign()}} function as follows. The principle is that, unlike consumer who would assign partial available topics to members hoping that when the metadata gets refreshed again another rebalance is triggered to complete the full subscription, in Streams we would not start at all if not all the requested source topics are available. This is a simpler rule and easier for users to reason: in the past the most complaints are that "I need to be notified when this happens, but I still see application in RUNNING, and was not aware we are actually only executing partial topology". 1. At the beginning of {{assign}}, we first check that all the non-repartition source topics are included in the {{metadata}}. If not, we log an error at the leader and set an error in the Assignment userData bytes, indicating that leader cannot complete assignment and the error code would indicate the root cause of it. 2. Upon receiving the assignment, if the error is not NONE the streams will shutdown itself with a log entry re-stating the root cause interpreted from the error code. 3. With 1) / 2) above, the non repartition source topics should always been available, and hence we should never encounter the `NOT_AVAILABLE` case for num.partitions of those repartition topics, we will remove `NOT_AVAILABLE` from the possible values, and remove the corresponding logic in DefaultPartitionGrouper for `NOT_AVAILABLE`. Note that 1) above requires us to bump up the version for assignment userData again, which should be fine with version probing added in 2.0. There is an alternative walkaround, to use negative version numbers as the error codes, and as long as clients knows how to interpret them that is fine. Personally I felt it is a bit hacky and would suggest we just bump up the version. WDYT? > Infinite loop if all input topics are unknown at startup > -------------------------------------------------------- > > Key: KAFKA-5037 > URL: https://issues.apache.org/jira/browse/KAFKA-5037 > Project: Kafka > Issue Type: Bug > Components: streams > Affects Versions: 0.10.2.0 > Reporter: Matthias J. Sax > Priority: Major > Labels: user-experience > > See discusion: https://github.com/apache/kafka/pull/2815 > We will need some rewrite on {{StreamPartitionsAssignor}} and to add much > more test for all kind of corner cases, including pattern subscriptions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)