mjsax commented on code in PR #18587: URL: https://github.com/apache/kafka/pull/18587#discussion_r1920894860
########## clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java: ########## @@ -1134,6 +1129,13 @@ private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long return new ClusterAndWaitTime(cluster, elapsed); } + private String getErrorMessage(Integer partitionsCount, String topic, Integer partition, long maxWaitMs) { + return partitionsCount == null ? + String.format("Topic %s not present in metadata after %d ms.", + topic, maxWaitMs) : + String.format("Partition %d of topic %s with partition count %d is not present in metadata after %d ms.", + partition, topic, partitionsCount, maxWaitMs); + } Review Comment: > Is there any value in including the partition value in the error message if it's non-null? Yes. `partition` is an input parameter to the method, and it's only `null` if the caller does not care about the metadata of a specific partitions (as pointed out by Andrew). So it's if not-null, the user cares and we should log it. If `partition != null`, it helps to identify the case of a non-existing partition the user wants to write into. Assume a topic with 4 partitions, and user specific partition 5 as explicit argument when calling `send()`. The producer would try to learn about the leader broker for partition 5, but it does not exist, and it would loop until `max.block.ms` is exhausted and finally fail with: `Partition 5 of topic <FOO> with partition count 4 is not present in metadata after XXX ms.` -- Then we can suspect that the topic may really only have 4 partitions... W/o `Partition 5` it's totally unclear why the metadata could not be found and why we hit `max.block.ms` on `send()`; the generic error `Topic %s not present in metadata after %d ms.` does not help us to identify this case. Or did you mean, if `partition == null`? For this case, I agree it's not really helpful to log `partition` (as the user does not care anyway), but it think we don't log it anyway given the current code... (cf below) > Seems to me that 3 messages would be best so that we never get a "null" as an ugly insert. wdyt? I believe the code is correct as-is. There is only two cases: 1. `partition == null`. If we get metadata successfully, it implies that `partitionCount != null` and we just move on and don't throw `TimeoutException`. If we cannot get any metadata for the topic, we stay with `partitionCount == null` and use `Topic %s not present in metadata after %d ms.` for the `TimeoutException`. 2. `partition != null`. For this case, if we get metadata successfully, we have a `partitionCount`, but we might still fail, if we don't get the metadata for the specified partition. For this case, we would use `Partition %d of topic %s with partition count %d is not present in metadata after %d ms` as error message. The other failure case is, that we don't get any metadata for the topic at all, and thus `partitionCount == null` and we still log the right thing. The fact that `partition != null` does not matter if `partitionCount == null`. Hence, case (3) is not an error case... If we did get metadata (`partitionCount != null`) and if the user does not care (`partition == null`), we will never throw `TimeoutException`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org