ableegoldman commented on a change in pull request #8541: URL: https://github.com/apache/kafka/pull/8541#discussion_r416199492
########## File path: streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamsPartitionAssignor.java ########## @@ -713,23 +713,18 @@ private boolean assignTasksToClients(final Set<String> allSourceTopics, allTasks, clientStates, numStandbyReplicas()); final TaskAssignor taskAssignor; - if (highAvailabilityEnabled) { - if (lagComputationSuccessful) { - taskAssignor = new HighAvailabilityTaskAssignor( - clientStates, - allTasks, - statefulTasks, - assignmentConfigs); - } else { - log.info("Failed to fetch end offsets for changelogs, will return previous assignment to clients and " - + "trigger another rebalance to retry."); - setAssignmentErrorCode(AssignorError.REBALANCE_NEEDED.code()); - taskAssignor = new StickyTaskAssignor(clientStates, allTasks, statefulTasks, assignmentConfigs, true); - } + if (!lagComputationSuccessful) { Review comment: I buy the argument that it's weird (aka bad) to supply an input that may be invalid just because we know that particular implementation doesn't use the input. Of course, we don't actually input the lags directly; we just happen to supplement the `ClientState` with the computed lags _in some cases_, and the HATA just happens to use that. It seems only slightly less weird to _not_ invoke something because we haven't added some supplemental information that isn't even required in the first place. What if we add a marker interface (warning: terrible placeholder name coming) called `UsingClientLagsAssignor` that only HATA implements to indicate that it needs this extra information on the `ClientState` to do its job. This will also be useful if we want to eventually make the task assignor fully user customizable; I expect someone who implemented a custom assignor that does not require task lags would be surprised (and annoyed) to find out that their custom assignor was skipped because we couldn't compute superfluous client data. Or, we could just move the lag computation into the HATA. But personally I'd rather leave it in the SPA for the above reason ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org