`kafka-consumer-groups --describe --group ...` can fail with a
NullPointerException for two reasons:
1) `Fetcher.fetchOffsetsByTimes()` may return too early, without sending a list
offsets request for topic partitions that are not in the cached metadata.
2) `ConsumerGroupCommand.getLogEndOffsets()` and `getLogStartOffsets()` assumed
that `endOffsets()`/`beginningOffsets()`, which eventually call
`Fetcher.fetchOffsetsByTimes()`, would return a map containing every topic
partition passed in, with no null values.
Because of #1, null values were possible when some of the topic partitions were
already in the metadata cache and others were not. However, even with #1 fixed,
`endOffsets()`/`beginningOffsets()` may return a map with some topic partitions
missing when a list offsets request returns a non-retriable error. This happens
in corner cases, such as when the message format on the broker is older than
0.10, and possibly with other errors.
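
To illustrate the second point, here is a minimal sketch of a lookup that tolerates both missing keys and null values in the returned map. It is not the code from this patch: the class `OffsetLookupSketch`, the method `resolveOffsets`, and the use of plain strings in place of `TopicPartition` are hypothetical and chosen only to keep the example self-contained.

```java
import java.util.*;

public class OffsetLookupSketch {

    /**
     * Hypothetical helper: maps every requested partition to an Optional offset.
     * A partition that is absent from `returned`, or present with a null value,
     * becomes Optional.empty() instead of triggering a NullPointerException later.
     */
    static Map<String, Optional<Long>> resolveOffsets(Collection<String> requested,
                                                      Map<String, Long> returned) {
        Map<String, Optional<Long>> resolved = new LinkedHashMap<>();
        for (String partition : requested) {
            // Map.get() yields null for a missing key and for a null value alike,
            // so Optional.ofNullable covers both failure modes described above.
            resolved.put(partition, Optional.ofNullable(returned.get(partition)));
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by endOffsets()/beginningOffsets().
        Map<String, Long> returned = new HashMap<>();
        returned.put("topic-0", 42L);   // offset known
        returned.put("topic-1", null);  // null value (first failure mode)
        // "topic-2" is missing entirely (second failure mode)

        List<String> requested = Arrays.asList("topic-0", "topic-1", "topic-2");
        resolveOffsets(requested, returned).forEach((partition, offset) ->
            System.out.println(partition + " -> "
                + offset.map(String::valueOf).orElse("unknown")));
    }
}
```

Iterating over the requested partitions rather than over the returned map means every partition still appears in the result, so a caller can report an unknown offset instead of failing.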

Testing:
-- Added a unit test to verify the fix in `Fetcher.fetchOffsetsByTimes()`.
-- Did some manual testing with `kafka-consumer-groups --describe` in scenarios
that caused the NPE. Was not able to reproduce any of the NPE cases with
DescribeConsumerGroupTest.scala.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)


[ Full content available at: https://github.com/apache/kafka/pull/5627 ]