[ 
https://issues.apache.org/jira/browse/KAFKA-5037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524421#comment-16524421
 ] 

Guozhang Wang commented on KAFKA-5037:
--------------------------------------

If we want to only fix-forward this issue beyond 2.0. I'd propose the following 
to refactor the {{StreamsPartitionAssignor#assign()}} function as follows. The 
principle is that, unlike consumer who would assign partial available topics to 
members hoping that when the metadata gets refreshed again another rebalance is 
triggered to complete the full subscription, in Streams we would not start at 
all if not all the requested source topics are available. This is a simpler 
rule and easier for users to reason: in the past the most complaints are that 
"I need to be notified when this happens, but I still see application in 
RUNNING, and was not aware we are actually only executing partial topology".

1. At the beginning of {{assign}}, we first check that all the non-repartition 
source topics are included in the {{metadata}}. If not, we log an error at the 
leader and set an error in the Assignment userData bytes, indicating that 
leader cannot complete assignment and the error code would indicate the root 
cause of it.

2. Upon receiving the assignment, if the error is not NONE the streams will 
shutdown itself with a log entry re-stating the root cause interpreted from the 
error code.

3. With 1) / 2) above, the non repartition source topics should always been 
available, and hence we should never encounter the `NOT_AVAILABLE` case for 
num.partitions of those repartition topics, we will remove `NOT_AVAILABLE` from 
the possible values, and remove the corresponding logic in 
DefaultPartitionGrouper for `NOT_AVAILABLE`.

Note that 1) above requires us to bump up the version for assignment userData 
again, which should be fine with version probing added in 2.0. There is an 
alternative walkaround, to use negative version numbers as the error codes, and 
as long as clients knows how to interpret them that is fine. Personally I felt 
it is a bit hacky and would suggest we just bump up the version.

WDYT?

> Infinite loop if all input topics are unknown at startup
> --------------------------------------------------------
>
>                 Key: KAFKA-5037
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5037
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Matthias J. Sax
>            Priority: Major
>              Labels: user-experience
>
> See discusion: https://github.com/apache/kafka/pull/2815
> We will need some rewrite on {{StreamPartitionsAssignor}} and to add much 
> more test for all kind of corner cases, including pattern subscriptions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to