`kafka-consumer-groups --describe --group ...` can result in NullPointerException for two reasons: 1) Fetcher.fetchOffsetsByTimes() may return too early, without sending list offsets request for topic partitions that are not in cached metadata. 2) `ConsumerGroupCommand.getLogEndOffsets()` and `getLogStartOffsets()` assumed that endOffsets()/beginningOffsets() which eventually call Fetcher.fetchOffsetsByTimes(), would return a map with all the topic partitions passed to endOffsets()/beginningOffsets() and that values are not null. Because of #1, null values were possible if some of the topic partitions were already known (in metadata cache) and some not (metadata cache did not have entries for some of the topic partitions). However, even with fixing #1, endOffsets()/beginningOffsets() may return a map with some topic partitions missing, when list offset request returns a non-retriable error. This happens in corner cases such as message format on broker is before 0.10, or maybe in cases of some other errors.
Testing: -- added unit test to verify fix in Fetcher.fetchOffsetsByTimes() -- did some manual testing with `kafka-consumer-groups --describe`, causing NPE. Was not able to reproduce any NPE cases with DescribeConsumerGroupTest.scala, ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) [ Full content available at: https://github.com/apache/kafka/pull/5627 ] This message was relayed via gitbox.apache.org for [email protected]
