Hangleton commented on code in PR #13558:
URL: https://github.com/apache/kafka/pull/13558#discussion_r1198005710
##########
core/src/main/scala/kafka/zk/KafkaZkClient.scala:
##########
@@ -2113,7 +2113,12 @@ class KafkaZkClient private[zk] (zooKeeperClient: ZooKeeperClient, isSecure: Boo
}
private class CheckedEphemeral(path: String, data: Array[Byte]) extends Logging {
+ private var attempt = 0
+ private val maxAttempt = 5
+ private val backoffMs = 1000
Review Comment:
This could indeed be an important element to consider when configuring the
retry strategy. We probably don't want to retry past the session timeout,
because that exposes us to the possibility that the session used during the
first call is no longer valid.
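A minimal sketch of what a retry loop bounded by the session timeout could look like (the names `retryWithinSessionTimeout`, `sessionTimeoutMs`, and `backoffMs` are illustrative, not taken from the PR):

```scala
object RetrySketch {
  // Hypothetical helper: retry a failing operation with a fixed backoff,
  // but never sleep past the deadline implied by the session timeout,
  // instead of using a fixed maximum attempt count.
  def retryWithinSessionTimeout[T](sessionTimeoutMs: Long, backoffMs: Long)
                                  (op: () => Either[Throwable, T]): Either[Throwable, T] = {
    val deadline = System.currentTimeMillis() + sessionTimeoutMs
    var result = op()
    // Retry only while another backoff + attempt can still fit inside the
    // session timeout window.
    while (result.isLeft && System.currentTimeMillis() + backoffMs < deadline) {
      Thread.sleep(backoffMs)
      result = op()
    }
    result
  }
}
```

This trades the fixed `maxAttempt` bound in the diff for a time bound, which is closer to the "don't retry past the session timeout" idea above.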
The session timeout can also be used to calibrate the response times to be
expected, since the ZooKeeper client in Kafka does not make use of request
timeouts. Another timeout to consider is the read timeout, which is the
maximum time allowed without activity on a connection before that connection
is closed. The ZooKeeper protocol includes heartbeats (pings) performed at
least once every half of the read timeout. The ZooKeeper client derives the
read timeout as two thirds of the session timeout, and the connect timeout as
the session timeout divided by the number of hosts in the connect string.
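For illustration, here is the timeout arithmetic spelled out, assuming the derivation used in ZooKeeper's `ClientCnxn` (the concrete values are made up for the example):

```scala
// Assumed example values, not from the PR.
val sessionTimeoutMs = 18000
val serverCount     = 3

// Read timeout: two thirds of the session timeout.
val readTimeoutMs = sessionTimeoutMs * 2 / 3

// Connect timeout: session timeout split across the configured hosts.
val connectTimeoutMs = sessionTimeoutMs / serverCount

// Pings are sent at least once every half of the read timeout.
val pingIntervalMs = readTimeoutMs / 2
```

With an 18 s session timeout and three hosts, that gives a 12 s read timeout, a 6 s connect timeout, and a ping at least every 6 s of idleness.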
I haven't looked at how Apache Curator approaches retries, but that could be
an interesting entry point, I suppose.
So the retry strategy here requires a bit more thought.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]