Hangleton commented on code in PR #13558:
URL: https://github.com/apache/kafka/pull/13558#discussion_r1198005710
##########
core/src/main/scala/kafka/zk/KafkaZkClient.scala:
##########
@@ -2113,7 +2113,12 @@ class KafkaZkClient private[zk] (zooKeeperClient: ZooKeeperClient, isSecure: Boo
}
private class CheckedEphemeral(path: String, data: Array[Byte]) extends Logging {
+ private var attempt = 0
+ private val maxAttempt = 5
+ private val backoffMs = 1000
Review Comment:
This could indeed be an important element to consider when configuring the
retry strategy. We probably don't want to retry past the session timeout,
because that exposes us to the possibility that the session used for the first
call is no longer valid.
The session timeout can also be used to calibrate the response times to be
expected, since the ZooKeeper client in Kafka does not use request timeouts.
Another timeout to consider is the read timeout, which is the maximum time
allowed without activity on a connection before that connection is closed. The
ZooKeeper protocol includes heartbeats (pings) performed at least once every
half of the read timeout. The read timeout is set to the session timeout
multiplied by 2/3 and divided by the number of hosts registered in the
ZooKeeper cluster.
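To make the relationship between these timeouts concrete, here is a small
illustrative sketch (not Kafka or ZooKeeper code; the helper names are
hypothetical) of the derivation described above:

```java
// Hypothetical helpers mirroring the timeout derivation described in this
// comment: readTimeout = sessionTimeout * 2/3 / numHosts, with pings sent at
// least once every half of the read timeout.
public class ZkTimeouts {
    static int readTimeoutMs(int sessionTimeoutMs, int numHosts) {
        // Maximum time without activity before the connection is closed.
        return sessionTimeoutMs * 2 / 3 / numHosts;
    }

    static int maxPingIntervalMs(int sessionTimeoutMs, int numHosts) {
        // Heartbeats happen at least once every half of the read timeout.
        return readTimeoutMs(sessionTimeoutMs, numHosts) / 2;
    }

    public static void main(String[] args) {
        // Example: an 18 s session timeout against a 3-node ensemble.
        System.out.println(readTimeoutMs(18000, 3));      // prints 4000
        System.out.println(maxPingIntervalMs(18000, 3));  // prints 2000
    }
}
```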
I haven't looked at how Apache Curator approaches retries, but that could be
an interesting entry point I suppose.
So the retry strategy here requires a bit more thought.
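One way to apply the point about not retrying past the session timeout is to
derive the attempt cap from the session timeout and backoff, rather than
hard-coding it as in the patch. A hedged sketch (names are illustrative, not
Kafka's):

```java
// Sketch: cap retry attempts so that total backoff time stays within the
// session timeout, since a session that has expired mid-retry is no longer
// valid. This is one possible policy, not the one Kafka uses.
public class BoundedRetry {
    static int maxAttempts(int sessionTimeoutMs, int backoffMs) {
        // Leave one backoff interval of headroom before session expiry,
        // and always allow at least one attempt.
        return Math.max(1, sessionTimeoutMs / backoffMs - 1);
    }

    public static void main(String[] args) {
        // With an 18 s session timeout and 1 s backoff, allow at most 17 attempts.
        System.out.println(maxAttempts(18000, 1000)); // prints 17
    }
}
```

Compared to a fixed `maxAttempt = 5`, this keeps the bound consistent if the
session timeout or backoff configuration changes.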
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]