[ https://issues.apache.org/jira/browse/KAFKA-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17485496#comment-17485496 ]
Colin McCabe commented on KAFKA-12879:
--------------------------------------

Let me give a little context here on the behavior.

The Producer and Consumer typically retry most operations if the partition in question doesn't exist. The thinking there is that if the user specified they want to consume from topic foo-0, they knew what they were doing, and we should just wait for foo-0 to appear. This is particularly useful because Kafka has eventually consistent metadata -- even after creating a topic, it may take a few seconds for every broker to become aware of the new topic.

For the AdminClient, we usually don't retry if a topic doesn't exist. For example, if you try to delete a topic, we don't loop forever if the topic doesn't exist -- we just return UNKNOWN_TOPIC_OR_PARTITION immediately. You could view this as inconsistent, but being consistent with Producer / Consumer here would result in a somewhat useless API. People do not want their topic deletes to take a long time and then fail with TimeoutException if the topic doesn't exist.

I would argue that listOffsets is more similar to the second case here. It's very rare that you would be invoking listOffsets on a partition that had just been created. Looping forever if the partition doesn't exist isn't really a useful behavior in most scenarios. It seems like Connect has a use case for this -- since Connect knows for sure that the topic exists (or will exist), it should do the retries itself, rather than pushing this into AdminClient. So I would argue we should just revert the change.

Also, as to the "without documentation" part -- we do make an effort to document the exceptions admin methods can throw. We're missing a lot of them (PRs would be very welcome here!) For example, listPartitionReassignments documents that it can return UnknownTopicOrPartitionException, ClusterAuthorizationException, TimeoutException, etc.
If we revert the change, we should also add this kind of documentation to the listOffsets function.

> Compatibility break in Admin.listOffsets()
> ------------------------------------------
>
>                 Key: KAFKA-12879
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12879
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin
>    Affects Versions: 2.8.0, 2.7.1, 2.6.2
>            Reporter: Tom Bentley
>            Assignee: Kirk True
>            Priority: Major
>
> KAFKA-12339 incompatibly changed the semantics of Admin.listOffsets().
> Previously it would fail with {{UnknownTopicOrPartitionException}} when a
> topic didn't exist. Now it will (eventually) fail with {{TimeoutException}}.
> It seems this was more or less intentional, even though it would break code
> which was expecting and handling the {{UnknownTopicOrPartitionException}}. A
> workaround is to use {{retries=1}} and inspect the cause of the
> {{TimeoutException}}, but this isn't really suitable for cases where the same
> Admin client instance is being used for other calls where retries is
> desirable.
> Furthermore as well as the intended effect on {{listOffsets()}} it seems that
> the change could actually affect other methods of Admin.
> More generally, the Admin client API is vague about which exceptions can
> propagate from which methods. This means that it's not possible to say, in
> cases like this, whether the calling code _should_ have been relying on the
> {{UnknownTopicOrPartitionException}} or not.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
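
As a rough illustration of the suggestion in the comment that a caller such as Connect should do the retries itself rather than pushing them into AdminClient, here is a minimal sketch of a caller-side retry loop. Everything in it is an assumption made for illustration: RetryUntilPresent and callWithRetry are hypothetical names, not Kafka APIs, and the nested UnknownTopicOrPartitionException is only a stand-in for the real org.apache.kafka.common.errors class so that the sketch compiles without the Kafka client on the classpath.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.function.Predicate;

// Hypothetical helper for illustration only; not part of the Kafka client API.
final class RetryUntilPresent {

    // Stand-in for org.apache.kafka.common.errors.UnknownTopicOrPartitionException.
    static final class UnknownTopicOrPartitionException extends RuntimeException {
        UnknownTopicOrPartitionException(String message) {
            super(message);
        }
    }

    /**
     * Calls the operation until it succeeds, fails with an error the caller
     * does not consider retriable, or there is no time left for another attempt.
     * In the last two cases the most recent exception is rethrown unchanged.
     */
    static <T> T callWithRetry(Callable<T> operation,
                               Predicate<Throwable> isRetriable,
                               Duration timeout,
                               Duration backoff) throws Exception {
        long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Futures report failures wrapped in ExecutionException,
                // so classify the underlying cause when there is one.
                Throwable cause = (e instanceof ExecutionException && e.getCause() != null)
                        ? e.getCause()
                        : e;
                boolean outOfTime = deadlineNanos - System.nanoTime() < backoff.toNanos();
                if (!isRetriable.test(cause) || outOfTime) {
                    throw e;
                }
                Thread.sleep(backoff.toMillis());
            }
        }
    }
}
```

With the real client, the operation would presumably be something like a blocking get() on the future returned by Admin.listOffsets(), and the predicate would test for the real UnknownTopicOrPartitionException. This only works if listOffsets fails fast with that exception, which is the behavior the comment argues for restoring. Keeping the loop in the caller means other users of the same Admin instance are not forced to wait for a timeout when the topic is simply missing.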